Algebra

3.1. Morphisms of relational systems and concrete categories

For simplicity, let us focus the study on systems with only one type.

Languages. A language is a set L of "symbols", with the data of the intended arity n_s∈ℕ of each symbol s∈L. It may be

A relational language if its symbols aim to represent relations
An algebraic language if its symbols aim to represent operations.

For any language L and any set E, let

L⋆E = ∐_s∈L E^n_s

For a relational language L, an L-system is any pair (E,E) made of a set E with an L-structure E⊂L⋆E.

Most often, we shall only use one L-structure on each set, so that E can be treated as implicit, determined by E. Precisely, let us assume a fixed choice of a class of L-systems where the predicate (r,x)∈E with arguments r,x is independent of the (E,E) in this class such that (r,x)∈L⋆E (i.e. x∈E^n_r). This way, each r∈L defines an n_r-ary predicate (or class of n_r-tuples) r(x) ⇔ (r,x)∈E for any (E,E) in this class such that x∈E^n_r, so that the determination of E by E can be written

E={(r,x)∈L⋆E | r(x)}.

The interpretation of each symbol in each system will be written

r_E = {x∈E^n_r | r(x)} = ⃗E(r)
E=∐_r∈L r_E.

Morphism. Between any 2 L-systems E,F, we define the set of L-morphisms from E to F as

Mor_L(E,F) = {f ∈F^E|∀r∈L,∀x∈E^n_r, r(x)⇒ r(f०x)}
= {f ∈F^E|∀(r,x)∈E, (r,f०x)∈F}.

Introducing for any function f the function f_L = (L⋆Domf ∋(r,x) ↦ (r,f०x)), we get the shorter definitions for sets of morphisms,

Mor_L(E,F) = {f ∈F^E| f_L[E]⊂F} = {f ∈F^E| E⊂f_L*F}.

Quotient structures

Let us introduce the general concept of quotient of a relational system, before using it in a particular case.
For any relational language L, any L-system (E,E) and any equivalence relation R on E, the quotient set E/R has a natural L-structure defined as ⃗R_L[E].
It is the smallest L-structure on E/R such that ⃗R∈ Mor(E, E/R).

Extending the language by defined structures

We can see that any L-morphism f ∈Mor_L(E,F) preserves (= remains a morphism for) any further structure defined by a formula (without parameters) using symbols in L and logical symbols ∧,∨,0,1,=,∃. Indeed,

Substituting arguments of a r∈L by a map σ to n' other variables (∀E,∀x∈E^n', r'(x)⇔r(x०σ)), works :
r'(x) ⇒ r(x०σ) ⇒ r(f०x०σ) ⇒ r'(f०x).
∀r,r'∈L,n_r=n_r' ⇒ ∀x∈E^n_r, (r(x)∧r'(x)) ⇒ (r(f०x)∧r'(f०x))
∀r,r'∈L,n_r=n_r'⇒∀x∈E^n_r , (r(x)∨r'(x)) ⇒ (r(f०x)∨r'(f०x))
For 0 and 1 it is trivial
∀x,y∈E, x=y ⇒ f(x)=f(y)
∀x∈E^n_r,(∃y∈E, r(x,y)) ⇒ (∃z=f(y)∈F, r(f०x, z))

Moreover, the use of unions and intersections can be generalized from the binary cases (∨ and ∧) to arbitrary families (indexed by possibly infinite subsets of L; the empty one corresponds to 0 and 1).
Thus, if there exists an f ∈Mor_L(E,F), then any ground formula with language L using the only logical symbols (=,∧,∨,0,1,∃), that is true in E, is also true in F.

However morphisms may no more preserve structures defined with other symbols (¬,⇒,∀).

Concrete categories

The concept of concrete category is what remains of a kind of systems with their morphisms, when we forget which are the structures that the morphisms are preserving (as we saw that this structures list can be extended without affecting the sets of morphisms). Let us introduce a slightly different (more concrete) version of this concept than the one usually found elsewhere: here, a concrete category will be the data of

A class of sets called objects
A class of functions called morphisms; for any objects E,F we have a set Mor(E,F)⊂F^E of morphisms from E to F, that is the set of all functions from E to F which are morphisms

satisfying the following axioms

Every morphism belongs to some Mor(E,F), i.e. its domain is an object and its image is included in an object (in practice, images of morphisms will be objects too);
For any object E, Id_E ∈ Mor(E,E) ;
Any composite of morphisms is a morphism: for any 3 objects E,F,G , ∀f ∈ Mor(E,F), ∀g∈Mor(F,G), g०f ∈Mor(E,G).

The last condition is easily verified for L-morphisms :
∀(r,x)∈E, (r,f०x)∈F ∴ (r,g०f०x)∈G.

Other concepts of category

(Abstract) categories differ from concrete categories, by forgetting that objects are sets (ordered by inclusion) and that morphisms are functions; what remains is the sets Mor(E,F) treated as pairwise disjoint, and the composition operations, Mor(F,G)×Mor(E,F)→Mor(E,G).
A category (either concrete or abstract) is small if its class of objects is a set.

Morphisms between systems with several types

While we introduced the notion of morphism in the case of systems with a single type, it may be extended to systems with several types as well. Between systems E,F with a common list τ of types (and interpretations of a common list of structure symbols), morphisms can equivalently be conceived in the following 2 ways, apart from having to preserve all structures:

A tuple (or family) of functions (f_t)_t_∈_τ, where ∀t∈τ, f_t:E_t→ F_t where E_t⊂E, F_t ⊂F are the interpretations of type t in E anf F
A function f:E→ F that is a morphism when regarding τ as a list of unary relation symbols (by the same idea as the use of classes instead of types in set theory); or equivalently, such that h_F०f=h_E where h_E:E→τ, h_F :F→τ are the functions giving the type of each object.

3.2. Special morphisms

Let us introduce diverse possible qualifications for morphisms of relational systems, and show them related by the following dependencies (without converses in general):

Automorphism ⇔ (Endomorphism ∧ Isomorphism)
Isomorphism ⇔ (Retraction ∧ Monomorphism) ⇔ (Section ∧ Epimorphism)
Retraction ⇒ Surjective morphism ⇒ Epimorphism
Section ⇒ Embedding ⇒ Injective morphism ⇒ Monomorphism
Isomorphism ⇒ Elementary embedding ⇒ Embedding

Functions defined by composition. In any category (whether concrete or not), any f ∈ Mor(E,F) defines functions by currying the operation of composition with other morphisms to or from another object X: let us denote (following wikipedia)

Hom(X,f) = (Mor(X, Dom f)∋g↦ f০g), with target Mor(X,F) for any target F of f.
Hom_F(f,X) = (Mor(F, X)∋g↦ g০f), with target Mor(E,X). In abstract categories where f determines F, the notation simplifies as Hom(f,X).

The first one respects composition, while the second one reverses it: for any 4 objects E,F,G,X , ∀f ∈Mor(E,F), ∀g∈Mor(F,G),

Hom(X,g) ০ Hom(X,f) = Hom(X,g০f)
Hom_F(f,X) ০ Hom_G(g,X) = Hom_G(g০f,X)

Monomorphism. In a category, a morphism f∈Mor(E,F) is called monic, or a monomorphism, if Hom(X,f) is injective for all objects X.

Epimorphism. In an abstract category, a morphism f∈Mor(E,F) is called epic, or an epimorphism, if Hom(f,X) is injective for all objects X: ∀g,h∈Mor(F,X), g০f=h০f ⇒ g=h. In our concept of concrete category, we must specify F: we say that f∈Mor(E,F) is F-epic, or an F-epimorphism, if all Hom_F(f,X) are injective.

In any concrete category, all injective morphisms are monic, and any morphism with image F is F-epic. However, the converses may not hold, and exceptions may be uneasy to classify, especially as the condition depends on the whole category and not just the morphism at hand.

The following 2 concepts may be considered cleaner as they admit a local characterization:

Sections. A morphism f∈Mor(E,F) is called a section (or section in F if the category is concrete), if Id_E∈Im(Hom_F(f,E)), i.e. ∃g∈Mor(F,E),g০f=Id_E. Then f is monic and for all objects X we have Im(Hom_F(f,X)) = Mor(E,X).

Retraction. A morphism g∈Mor(F,E) is called a retraction (or retraction on E if the category is concrete), if Id_E∈Im(Hom(E,g)), i.e. ∃f∈Mor(E,F),g০f=Id_E. Then g is epic and for all objects X we have Im(Hom(X,g)) = Mor(X,F).
When g০f=Id_E we also say that f is a section of g, and that g is a retraction of f.

Proof: if g০f=Id_E then for all objects X, Hom_F(f,X) ০ Hom_E(g,X) = Hom_E(Id_E,X) = Id_Mor(E,X), thus

Hom_E(g,X) is injective (g is epic)
Im(Hom_F(f,X)) = Mor(E,X). Namely, ∀h∈Mor(E,X), h = h০g০f = Hom_F(f,X)(h০g).

Similarly, Hom(X,g) ০ Hom(X,f) = Hom(X,Id_E) = Id_Mor(X,_E) thus

Hom(X,f) is injective (f is monic)
Im(Hom(X,g)) = Mor(X,F).∎

Isomorphism. In a concrete category, an isomorphism between objects E,F , is a bijection f : E ↔ F such that f ∈Mor(E,F) and f^-1∈Mor(F,E).
In any category, an isomorphism is an f ∈Mor(E,F) such that ∃g∈Mor(F,E), g০f= Id_E ∧ f০g= Id_F (this inverse g of f is unique).

In this case, Hom(X,f) and Hom(X,g) are bijections, inverse of each other, between Mor(X,E) and Mor(X,F).

Any epic section f∈Mor(E,F) is an isomorphism : g০f=Id_E ⇒ f০g০f = Id_F০f ⇒ f০g=Id_F
Similarly, any monic retraction is an isomorphism.
Two objects E, F are said to be isomorphic if there exists an isomorphism between them.

Endomorphisms. An endomorphism of an object E in a category, is a morphism from E to itself. Their set is written End(E)=Mor(E,E).

Automorphisms. An automorphism of an object E is an isomorphism of E to itself.

Thus it is both an endomorphism and an isomorphism. However an endomorphism of E which is an isomorphism to a strict subset of E, is not an automorphism.

Strong preservation. A function f ∈ F^E is said to strongly preserve a relation symbol or formula r interpreted in each of E and F, if it preserves both r and ¬r : ∀x∈ E^n_r, x∈r_E ⇔ f০x∈r_F. Embeddings. An f ∈Mor_L(E,F) is called an L-embedding if it strongly preserves all structures : ∀r∈L,∀x∈ E^n_r, x∈r_E ⇔ f০x∈r_F.

Embeddings will usually be supposed injective, as it means strongly preserving the equality relation. Things can come down to this case by replacing equality in the concept of injectivity by a properly defined equivalence relation, or replacing systems by their quotient by this relation, where the canonical surjections would be non-injective embeddings.
Injective embeddings are isomorphisms to their images.
Embeddings still strongly preserve structures defined using the symbols in L and the logical symbols ∧,∨,0,1,¬, and also = in the case of injective embeddings.
Thus, they also preserve invariant structures defined using symbols of L and ∧,∨,¬,0,1,∃ where any occurrence of ¬ comes after (inside) any occurrence of ∃.

Now removing this order restriction on the use of logical symbols provides the full use of first-order logic:

Elementary embedding. An f ∈ Mor_L(E,F) is called an elementary embedding (or elementary L-embedding) if it (strongly) preserves all invariant structures (defined by first-order formulas with language L).

Every isomorphism is an elementary embedding.
If f∈ End(E) is an invariant elementary embedding then it is an automorphism:

Im f is also invariant (as a unary relation ∃y, f(y)=x)
∴ ∀x∈ E, x∈Im f ⇔ f(x)∈Im f
∴ Im f = E. ∎

The existence of an elementary embedding f ∈Mor_L(E,F) implies that E and F are elementarily equivalent:

Elementary equivalence. 2 systems are said to be elementarily equivalent, if every ground first-order formula true in the one is true in the other.

The most usual practice of mathematics focuses on systems where all elementarily equivalent ones are connected by isomorphisms, which are the only elementary embeddings. However, non-surjective elementary embeddings exist and play a special role in foundational issues, such as Skolem's paradox and non-standard models of arithmetic.

Initial and final objects

In any category, an object X is called an initial object (resp. a final object) if for all objects Y the set Mor(X,Y) (resp. Mor(Y,X)) is a singleton.
Such objects have this remarkable property: when they exist, all such objects are isomorphic, by a unique isomorphism between any two of them.

Proof: For any initial objects X and Y, ∃f∈Mor(X,Y), ∃g∈Mor(Y,X), g০f ∈Mor(X,X) ∧ f০g ∈Mor(Y,Y).
But Id_X ∈ Mor(X,X) which is a singleton, thus g০f= Id_X. Similarly, f০g=Id_Y.
Thus f is an isomorphism, unique because Mor(X,Y) is a singleton.∎

By this isomorphism X and Y may be treated as identical to each other. We may say that an initial object is essentially unique.
Such objects exist in many categories, but are not always interesting. For example, in any category of relational systems containing representatives (copies) of all possible ones with a given language:

Final objects are the singletons (where all relations are constantly true),
The only initial object is the empty set (where any nullary relation, i.e. boolean constant, is false).

Exercise. Given two fixed sets X and Y, consider the category whose objects are all (E,φ) where E is a set and φ: E×X→Y, and the morphisms from (E,φ) to (E',φ') are all f : E→E' such that ∀a∈E,∀x∈X, φ(a,x) = φ'(f(a),x).
Does it have an initial object ? a final object ?

3.3. Notion of algebra

Algebras. Given an algebraic language L, an L-algebra is a set E with an interpretation of each s∈L as an n_s-ary operation in E.
Again, let us assume a fixed class of L-algebras E where each s is interpreted as the restriction s_E of an n_s-ary operator s independent of E, s_E = (E^n_s ∋ x↦ s(x)). These can be packed as one function

φ_E = ∐_s∈L s_E = ((s,x) ↦ s(x)) : L⋆E → E.

Such a class of algebras forms a concrete category with the following concept of morphism.

Morphisms of algebras. For any L-algebras E, F,

Mor_L(E,F) = {f∈F^E | ∀(s,x)∈L⋆E, s_F(f০x) = f(s_E(x))} = {f∈F^E| φ_F০f_L = f০φ_E}.

When c∈L is a constant (i.e. n_c =0), this condition on f says f(c_E)=c_F.

Such categories can be seen as particular categories of relational systems, as follows.

Let the relational language L' be a copy of L where to each s∈L corresponds s'∈L' with increased arity n_s' = n_s+1, so that L'⋆E ≡ ∐_s∈L E^n_s×E ≡ (L⋆E)×E also expressible as the set of triples (s,x,y) such that s∈L, x∈ E^n_s and y∈E.
Each n_s-ary operation s_E defines an n_s'-ary relation s'_E ≡ Gr s_E. These are packed as an L'-structure

E = Gr φ_E ≡ ∐_s∈L s'_E.

The conditions for an f ∈ F^E to be a morphism are equivalent : (∀(x,y)∈E, (f_L(x),f(y))∈F) ⇔ (∀x∈L⋆E, φ_F(f_L(x))= f(φ_E(x))).

Every injective morphism f between algebras is an embedding :

∀(s,x,y)∈L'⋆E, f(y) = s_F(f০x) = f(s_E(x)) ⇒ y=s_E(x).

Any embedding f ∈ Mor_L(E,F), is injective as soon as Im φ_E = E or some s_E is injective for one of its arguments.
Bijective morphisms of algebras are isomorphisms. This can be deduced from the fact they are embeddings, or by

φ_E০f_L^-1 = f ^-1০f০φ_E০f_L^-1 = f ^-1০φ_F০f_L০f_L^-1 = f ^-1০φ_F.

Singletons are the final objects in any category of algebras with a fixed language where they are admitted as objects.

Subalgebras. A subset A⊂E of an L-algebra E will be called an L-subalgebra of E, if φ_E[L⋆A]⊂A.
Then the restriction φ_A of φ_E to L⋆A gives it a structure of L-algebra.
The set of all L-subalgebras of E will be denoted Sub_L E. It is nonempty as E ∈ Sub_L E.

For any formula of the form (∀(variables), some formula without any binder), its truth in E implies its truth in each A∈Sub_L E.

Images of algebras. f ∈Mor_L(E,F) ⇒ Im f ∈ Sub_LF.

The proof uses the finite choice theorem with (AC 1)⇒(6):

∀(s,y)∈ L⋆Im f, ∃x∈E^n_s, f০x = y ∴ s_F(y) = f(s_E(x)) ∈ Im f ∎

Thus trying to exend this result to algebras with infinitary operations, would require the axiom of choice, otherwise it anyway still holds for injective morphisms.

Let us generalize the concept of algebra, to any L'-systems (E,E), where E ⊂ (L⋆E)×E needs not be functional. They form the same kind of categories previously defined, with a different notation (through the canonical bijection depending on the choice of distinguished argument) by which more concepts can be introduced.

Stable subsets. The concept of subalgebra is generalized as that of stability of a subset A of E by L : A ∈ Sub_L E ⇔ (E_*(L⋆A) ⊂A) ⇔ (∀(s,x,y)∈E, Im x⊂A ⇒ y∈A). Stability is no more preserved by direct images by morphisms, but is still preserved by inverse images:

Preimages of stable subsets. ∀f ∈ Mor_L(E,F), ∀B ∈ Sub_LF, f *(B)∈Sub_L E.

Let A=f *B. Proof for L-algebras:
∀(s,x)∈L⋆A, f০x ∈ B^n_s∴ f(s_E(x)) = s_F(f০x) ∈B ∴ s_E(x)∈A.
Proof for L'-systems:
∀(x,y)∈E, (f_L(x),f(y))∈F∴ (x∈L⋆A ⇒ f_L(x)∈L⋆B ⇒ f(y)∈B ⇒ y∈A).∎

Proposition. For any L'-system E and any L-algebra F,
∀f,g∈Mor_L(E,F), {x∈E|f(x)=g(x)}∈ Sub_LE.

Proof : ∀(s,x,y)∈E, f০x=g০x ⇒ f(y) = s(f০x) = g(y). ∎

Intersections of stable subsets. ∀X ⊂ Sub_LE, ∩X ∈ Sub_L E where ∩X ≝ {x∈E|∀B∈X, x∈B}.

Proof: ∀(x,y)∈E, x∈L⋆∩X ⇒ (∀B∈X, x∈L⋆B ∴ y∈B) ⇒ y∈∩X. ∎

Other way: E_*(L⋆∩X) = E_*(∩_B∈X L⋆B) ⊂∩_B∈X E_*(L⋆B) ⊂∩X.

Subalgebra generated by a subset. ∀A ⊂ E, the L-subalgebra of E generated by A, written 〈A〉_L,E or more simply 〈A〉_L, is the smallest L-subalgebra of E including A:

〈A〉_L= ∩{B∈Sub_LE|A⊂B}= {x∈E|∀B∈Sub_LE, A⊂B ⇒ x∈B}∈ Sub_LE.
For fixed E and L, this function of A is a closure with image Sub_LE.
We say that A generates E if 〈A〉_L=E.

Minimal subalgebra. The minimal subalgebra of an algebra E, is its subalgebra Min_LE =〈∅〉_L,E = ∩Sub_LE. (This can also be used on stable subsets of algebraic systems which are not algebras)

An L-algebra E is minimal when E = Min_L E, or equivalently Sub_LE = {E}. Then Min_LE is the only subalgebra of E that is a minimal algebra.
We can redefine generated subalgebras in terms of minimal subalgebra with a different language: 〈A〉_L,E= Min_L∪A E where A is seen as a set of constants.

Proposition. For any L-algebra E,

∀A⊂E, Imφ_E⊂A ⇒ A∈Sub_LE. In particular, Im φ_E∈Sub_LE.
∀A⊂E, A⊂〈A〉_L⊂A∪ Imφ_E. In particular, Min_LE ⊂ Imφ_E
∀A⊂E, 〈A〉_L=E ⇒ A∪Imφ_E = E.
∀f ∈Mor_L(E,F), f [Min_LE] = Min_LF (images of minimal algebras by morphisms are minimal)

Proofs:

φ_E[L⋆A] ⊂ Imφ_E⊂A
Imφ_E⊂ A∪Imφ_E ∴ A∪Imφ_E∈Sub_LE
∀B ∈ Sub_LF, f *(B)∈Sub_L E ∴ Min_LE ⊂ f*(B) ∴ f [Min_LE]⊂B.∎

Injective, surjective algebras. An L-algebra (E,φ_E) will be called injective if φ_E is injective, and surjective if Im φ_E = E.
Any minimal L-algebra is surjective. Thus, the minimal sub-algebra of any algebra is also a surjective algebra.

Proposition. If E is a surjective algebra and F is an injective one then
∀f ∈Mor_L(E,F), A= {x∈E | ∀y∈E, f(x) = f(y) ⇒ x=y} ∈ Sub_LE.

Proof: ∀(s,x)∈L⋆A, ∀y∈E,
f(s_E(x)) = f(y) ⇒ (∃(t,z)∈φ_E^•(y), s_F(f০x) = f(s_E(x)) = f(y) = f(t_E(z)) = t_F(f০z) ∴ (s=t ∧f০x=f০z) ∴ x=z) ⇒ s_E(x)=y.

Other view : under the same assumptions, for each uniqueness quantifier Q (either ∃! or !),
B = {y∈F | Qx∈E, y = f(x)} ∈ Sub_LF

Proof: as φ_F is injective, ∀y∈φ_F[L⋆B], ∃!: φ_F^•(y) ⊂ L⋆B ∴ Qz∈ L⋆E, φ_F(f_L(z)) = y.
As φ_F০f_L = f০φ_E and φ_E is surjective, we conclude Qx∈E, y = f(x). ∎

3.4. Algebraic terms and term algebras

Algebraic drafts

As the concept of term was only intuitively introduced in 1.5, let us now formalize the case of terms using a purely algebraic language (without logical symbols), called algebraic terms, as mathematical systems in a set theoretical framework.
For convenience, let us work with only one type (the generalization to many types is easy) and introduce a class of systems more general than terms, that we shall call drafts. Variables will have a special treatment, without adding them as constants in the language.

Given an algebraic language L, an L-draft will be an L'-system (D,D) where D⊂ (L⋆D)×D, such that:

The transpose ^tD of D is the graph of a function Ψ_D: O_D → L⋆D, whose domain O_D = Im D ⊂D is called the set of occurrences in D, and its complement V_D=D\O_D is called the set of variables of D;
〈V_D〉_L= D (well-foundedness condition).

Let us denote ∀x∈O_D, Ψ_D(x)=(s(x), l_x) ∈ L⋆D where s(x)∈L and l_x∈D^n_s^_(x). We can also denote s_D(x)=s(x) to let s_D be a function with domain O_D.
Well-foundedness can be equivalently written in any of these forms

∀A⊂O_D, (∀x∈O_D, Im l_x ⊂ A∪V_D ⇒ x∈A) ⇒ A=O_D
∀A⊂D, (V_D⊂A ∧ D_*(L⋆A) ⊂A) ⇒ A = D
∀A⊂D, V_D⊂A≠D ⇒∃x∈O_D\A, Im l_x⊂A
∀A⊂O_D, A≠∅ ⇒ ∃x∈A, A⋂ Im l_x = ∅

The set of used variables of (D,D), those which effectively occur, is V_D⋂⋃_{x∈O_D} Im l_x. Unused variables can be added or removed in D while keeping D fixed (by changing D⊃O_D∪(⋃_{x∈O_D} Im l_x)), so that their presence may be irrelevant.
A ground draft is a draft with no variable, i.e. V_D=∅. Thus, Ψ_D: D→ L⋆D and Sub_LD = {D}.
More generally a draft is ground-like if it has no used variable (Dom D⊂ L⋆O_D).

Sub-drafts and terms

Drafts do not have interesting stable subsets (by well-foundedness), but let us introduce another stability concept for them.
A subset A⊂D is a sub-draft of D (or a co-stable subset of D) if, denoting O_A = A⋂O_D and Ψ_A= Ψ_{D|O_A}, we have (Im Ψ_A⊂ L⋆A), i.e. ∀x∈O_A, Im l_x⊂A.
Indeed, it remains well-founded, as can be seen on the last formulation of well-foundedness.

Like with stable subsets, any intersection of sub-drafts is also a sub-draft; the sub-draft co-generated by a subset is the intersection of all sub-drafts that include it.
A term is a draft co-generated by a single element which is its root.
Moreover, any union of sub-drafts is also a sub-draft (which was not the case for sub-algebras because an operation with arity >1 whose arguments take values in different sub-algebras may give a result outside their union).

There is a natural order relation on each draft D defined by x ≤ y ⇔ x∈ (the term with root y). It is obviously a preorder. Its antisymmetry is less obvious; a proof for integers will be given in 3.6, while the general case will come from properties of well-founded relations in the study of Galois connections.

Categories of drafts

As particular relational systems, classes of L-drafts form concrete categories. Between two L-drafts D,E,

f ∈Mor_L(D,E) ⇔ (f[O_D]⊂O_E ∧ Ψ_E ০ f_{|O_D}= f_L০Ψ_D)

where the equality condition can be split as

s_E০f_{|O_D} = s_D
∀x∈O_D, l_f(x)=f০l_x

Another kind of category of drafts can be considered, with objects also L-drafts but with a common set of variables (V_D=V_E=V) and taking smaller sets of morphisms: the variables-preserving morphisms, i.e. moreover satisfying

f_|V = Id_V

But for any element t in any draft, the term T co-generated by {t} has as set of variables T⋂V (which is the set of used variables of T unless T={t}⊂V) generally smaller than V, so the admission of terms defined as subsets co-generated by singletons in such a category requires this loosening of the condition.

This naturally simplifies when reformulating such categories as those of ground (L∪V)-drafts: in each draft, variable symbols are replaced (reinterpreted) by constant symbols added to the language, so Ψ_E is extended by Id_V, to form a ground (L∪V)-draft.

Intepretations of drafts in algebras

For any L-draft D and any L-algebra E, an interpretation of D in E is a morphism f∈Mor_L(D,E), i.e. f_{|O_D}= φ_E০f_L০Ψ_D, which can also be written

∀x∈O_D, f(x) = s(x)_E(f০l_x)

Theorem. For any L-draft D with set of variables V and any L-algebra E, any v∈E^V is uniquely extensible to an interpretation of D:
∃!h∈Mor_L(D,E), h_|V = v, equivalently ∃!h∈E^O_D, v∪h ∈Mor_L(D,E).

The uniqueness comes from a previous proposition.

Proof of existence.
S = {A⊂D | V⊂A∧ Im Ψ_A⊂ L⋆A}
v ∈ K = ⋃_A∈_S {f∈Mor_L(A,E) | f_|V =v}
∀f,g ∈K, B = Dom f ⋂ Dom g ⇒ (f_|B∈K ∧ g_|B∈K) ⇒ f_|B=g_|B
⋃_f∈K Gr f = Gr h
C= Dom h = ⋃_f∈K Dom f ∈ S
h ∈ K
(C∪D_*(L⋆C) ∋ x↦ (x∈C ? h(x) : φ_E(h_L(Ψ_D(x))))) ∈ K (see conditional operator)
D_*(L⋆C) ⊂ C
C=D ∎

Term algebras

An L-algebra (E,φ_E) is called a term algebra if it is injective and 〈E\Im φ_E〉_L = E. Thus it is also an L-draft with Ψ_E = φ_E^-1. As such, it is ground if moreover E=Im φ_E.
So, a ground term algebra is an algebra both minimal and injective, and thus also bijective.
By the above interpretation theorem, in any variables-preserving category of L-drafts with a fixed set V of variables (category of ground (L∪V)-drafts), any term L-algebra F, when present, is a final object. Thus any 2 of them are isomorphic, by a unique variables-preserving isomorphism.
In particular, any ground term L-algebra is a final object in any category of ground L-drafts, and an initial object in any category of L-algebras.

Proposition. For any ground term L-algebra K and any injective L-algebra M, the unique f∈Mor_L(K,M) is injective.

Proof 1. By a previous result, {x∈K | ∀y∈K, f(x) = f(y) ⇒ x=y} ∈ Sub_LK, thus = K.

Proof 2. The subalgebra Im f of M is both injective (subalgebra of an injective algebra) and minimal, thus a ground term L-algebra, and the morphism f between initial L-algebras K and Im f is an isomorphism.

Any term algebra F plays the role of the "set of all terms" with its list V of variable symbols, for the following reason:

Each element of F bijectively defines a term in F as the sub-draft of F it co-generates, thus where it is the root.
For any L-term T with root t and variables ⊂V, the unique f∈Mor(T,F) such that f_|T⋂V = Id_T⋂V represents it in F as the term Imf with root f(t).
Then its interpretation in any L-algebra E extending any v∈E^V, is determined by the unique g∈Mor_L(F,E) extending v, as g০f∈Mor(T,E), with result g(f(t)).

The same for terms whose set of variables V' is interpreted in E by the composite of a function from V' to V, with one from V to E (instead of having V'⊂V).
For any subset A of an L-algebra E, any term algebra F_A whose set of variables is a copy of A, represents the set of all L-terms with variables interpreted in A. Then, the L-subalgebra 〈A〉_L of E is the image of the interpretation of F_A in E, i.e. the set of all values of these terms.

Sets of morphisms between L-algebras E,F remain unchanged when adding to L any operation symbol defined by an L-term T with root t and variables among V :

Mor_L(E,F) ⊂ Mor_T(E,F) ⊂ Mor_{t}(E,F)

Proof:

The interpretation of T as a set of V-ary algebraic symbols in each E is defined by ∀v∈E^V, ∀t∈T, t_E(v) = g_v(t) where g_v∈Mor_L(T,E) and g_|T⋂V=v.
Thus ∀f∈Mor_L(E,F), ∀v∈E^V, (f০g_v∈Mor_L(T,F) ∧ (f০g_v)_|T⋂V = f০v) ∴ t_F(f০v) = (f০g_v)(t) = f(t_E(v)). ∎

If L does not contain any constant then ∅ is a ground term L-algebra.
If L only contains constants, then ground term L-algebras are the copies of L.

3.5. Integers and recursion

The set ℕ

Any theory that aims to describe the system ℕ of natural numbers is called an Arithmetic. In details, there are several such formal theories, depending on the language (list of basic structures), and the logical framework (affecting the expressibility of axioms).

Our first "definition" of ℕ will characterize it in a set theoretical framework. This way of starting to formalize ℕ now may look circular, as we already used natural numbers as arities of operation symbols of algebras, of which arithmetic is a particular case. But it only uses operation symbols with arity 0, 1 or 2, for which previous definitions might as well be specially rewritten without any general reference to integers.

Definition. The set ℕ of natural numbers is a ground term algebra with a language of two symbols: one constant symbol 0 ("zero"), and one unary symbol S.

The interpretation of S there is called the successor, understood as adding one unit (Sn=n+1).

This concept of ground term algebra can be expanded as the following 3 axioms on this {0,S}-algebra :

∀n∈ℕ, Sn ≠ 0	(H0), i.e. 0 ∉ Im S
∀n,p∈ℕ, Sn = Sp ⇒ n = p	(Inj), i.e. S is injective
∀A⊂ℕ, (0∈A ∧ ∀n∈A,Sn∈A) ⇒ A=ℕ	(Ind) : induction axiom (ℕ is a minimal (0,S)-algebra).

We can define 1=S0, 2=SS0...

This insertion of arithmetic into set theory, adding to set theory the constant symbol ℕ and whatever chosen language for arithmetic interpreted in this ℕ, is the natural way set theory is completed to form the standard foundation of mathematics as practiced by most mathematicians. Its implicit choice of a fixed ℕ in the class of ground term {0,S}-algebras, does not introduce any arbritrariness, since, as an initial {0,S}-algebra, it is essentially unique (any two systems in this class are identifiable by an isomorphism which is unique inside the same universe). A set E is called countable if there exists an injection from E to ℕ.

Recursively defined sequences

A sequence of elements of a set E, is a function from ℕ to E (a family of elements of E indexed by ℕ).
In particular, a recursive sequence in E is a sequence defined as the only u ∈ E^ℕ such that u ∈ Mor(ℕ,(E,a,f)), where (E,a,f) is the {0,S}-algebra E interpreting 0 as a∈E and S as f∈ E^E :

u₀=a
∀n∈ℕ, u_Sn = f(u_n).

As u_n also depends on a and f, let us write it as fⁿ(a). This notation is motivated as follows.
As an element of a ground term {0,S}-algebra, each integer n represents a term with symbols 0 and S respectively interpreted in E as a and f. So, fⁿ(a) abbreviates the term with shape n using symbols a and f:

f⁰(a) = a
f¹(a) = f(a)
f²(a) = f(f(a))

In another curried view of this map from E×E^E×ℕ to E we can re-interpret 0 as a variable instead of a constant symbol. Then each integer n becomes a term Sⁿ with n occurrences of the function symbol S and one variable, interpreted in each {S}-algebra (E,f) as the function fⁿ∈ E^E recursively defined by

f⁰ = Id_E
∀n∈ℕ, f^Sn = f০fⁿ

In particular, f¹=f.
More generally, for any functions f∈ E^E, g∈E^X, the sequence of functions recursively defined by

h₀=g
∀n∈ℕ, h_Sn = f০h_n

is h_n=fⁿ০g.

Addition

The operation of addition in ℕ can be defined as n+p = S^p(n), i.e. by the recursive definition

n + 0 = n
∀p∈ℕ, n+S(p) = S(n+p).

Thus,

n+1 = n+S0 = S(n+0) = Sn

As ∀a∈E ,∀f∈ E^E, (p↦f^n+p(a))∈Mor(ℕ,(E,fⁿ(a),f)), i.e. fⁿ⁺⁰=fⁿ and ∀p∈ℕ, f^n+Sp = f^S(n+p) = f০f^n+p we have

∀n,p∈ℕ , f^n+p = f^p০fⁿ.

Addition is associative: (a+b)+n = Sⁿ(S^b(a)) = S^b⁺ⁿ(a) = a+(b+n).

Multiplication

Multiplication in ℕ can be defined as x⋅y = (S^x)^y(0), so that

∀x∈ℕ, x⋅0 = 0
∀x,y∈ℕ, x⋅(Sy) = (x⋅y)+x

More generally, for any a∈E and f∈ E^E, we have f^x^⋅^y(a) = (f^x)^y(a).

A more general form of recursion

Some useful sequences need recursive definitions where the term defining u_Sn uses not only u_n but also n itself. Somehow it would work all the same, but trying to directly adapt to this case the proof we gave would require to define some special generalizations of previous concepts, and specify their resulting properties. To simplify, let us proceed another way, similar to the argument in Halmos's Naive Set Theory, but generalized.
For any algebraic language L, let us introduce a general concept of "recursive condition" for functions u : E → F, where, instead of a draft, E is first assumed to be an L-algebra (then a ground term algebra to conclude).
The version we saw was formalized by giving the term in the recursive definition as an L-algebra structure on F, φ_F: L⋆F → F, then expressing the request for u to satisfy this condition as u∈Mor(E,F), namely

∀(s,x)∈L⋆E, u(s_E(x)) = φ_F(s,u০x).

Let us generalize this as u(s_E(x)) = φ(s,x,u০x) which by the canonical bijection Dom φ ≡ ∐_s∈L E^n_s×F^n_s ≡ ∐_s∈L (E×F)^n_s = L⋆(E×F) can be written using h : L⋆(E×F) → F such that ∀(s,x,y)∈ Dom φ, φ(s,x,y) = h(s,x×y), as

u(s_E(x)) = h(s,x×(u০x)).

As ∀u∈F^E, x×(u০x) = (Id_E×u)০x, this becomes the second component of the formula Id_E×u ∈ Mor(E, E×F) when giving E×F the structure φ_E×F = (φ_E০π_L)×h.
The first component (φ_E০π_L) we give to φ_E×F, makes π∈ Mor(E×F, E) and makes tautological the first component of the formula Id_E×u ∈ Mor(E, E×F), namely
Id_E(s_E(x)) = φ_E(s,x) = (φ_E০π_L)(s,x×(u০x)).
It is then possible to conclude by re-using the previous result of existence of interpretations:

If E is a closed term L-algebra then ∃!f ∈ Mor(E, E×F), which is of the form Id_E×u because π০f ∈ Mor(E, E) ∴ π০f = Id_E.

But one can do without it, based on the following property of this L-algebra E×F:

∀u∈F^E, Id_E×u ∈ Mor_L(E, E×F) ⇔ Gr u ∈ Sub_L(E×F)

The ⇒ is a case of image of an algebra by a morphism, Gr u = Im (Id_E×u).
For the converse, the inverse of the bijective morphism π_{|Gr u} ∈ Mor_L(Gr u, E) is a morphism Id_E×u ∈ Mor_L(E, Gr u) ⊂ Mor_L(E, E×F).
This reduces the issue to the search of subalgebras of E×F which are graphs of functions from E to F.
Now if E is a ground term L-algebra then M = Min_L(E×F) is one of them because π_|M∈ Mor_L(M, E) from a surjective algebra to a ground term algebra must be bijective.
Any other subalgebra of E×F must include M, thus to stay functional it must equal M. ∎

Commutativity, associativity and commutants

A binary operation # in a set E, is called

Commutative when ∀x,y∈E, x#y = y#x.
Associative when ∀x,y,z ∈E, x#(y#z) = (x#y)#z, so that we can write x#y#z to mean either of these terms.

The commutant of any subset A⊂E for a binary operation # in E, is defined as

C(A) = {x∈E|∀y∈A, x#y = y#x}.

This is a Galois connection: ∀A,B⊂E, B⊂C(A) ⇔ A⊂C(B).
C(E)= E expresses the commutativity of #.

Proposition. For any associative operation # on a set E, ∀A⊂E,

C(A) ∈ Sub_#F
If A⊂C(A) then # is commutative in 〈A〉_#

Proof:

∀x,y∈C(A), (∀z∈A, (x#y)#z = x#z#y = z#(x#y)) ∴ x#y∈C(A)
A⊂C(A)∈ Sub_#F ⇒〈A〉_#⊂C(A) ⇒ A⊂C(〈A〉_#)∈ Sub_#F ⇒ 〈A〉_#⊂C(〈A〉_#).

This concept will first be used for arithmetic, then generalized to monoids and clones.

3.6. Arithmetic with addition

First-order theories of arithmetic

Our first formalization of ℕ was based on the framework of set theory, where it used the powerset to characterize ℕ in an "essentially unique" way (specify its isomorphism class). It allowed recursion, from which we could define addition and multiplication.

Let us now consider formalizations in the framework of first-order logic, thus called first-order arithmetic. As first-order logic cannot fully express the powerset, the (∀A⊂ℕ) in the induction axiom must be replaced by a weaker version : it can only be expressed with A ranging over all classes of the theory, that is, the only subsets of ℕ defined by first-order formulas in the given language, with bound variables and parameters in ℕ. For this, it must take the form of a schema of axioms, one for each formula that can define a class.

But without the set theoretical framework we used to express and justify the definiteness of recursive definitions, the successor function no more suffices to define addition and multiplication. This leaves us with several non-equivalent versions of first-order arithmetic depending on the choice of the primitive language, thus non-equivalent versions of the axiom schema of induction whose range of expressible classes depends on this language:

Bare arithmetic, with the mere symbols 0 and S, is very poor.
Presburger arithmetic, with addition, starts to be interesting, but still cannot define multiplication.
Full arithmetic, with addition and multiplication, is finally able to express all recursive definitions.

Presburger arithmetic

Presburger arithmetic has been proven by experts to be a decidable theory, i.e. all its ground formulas are either provable or refutable from its axioms. Let us present its shortest equivalent formalization, describing the set ℕ* of nonzero natural numbers, with 2 primitive symbols: the constant 1 and the operation +. Axioms will be

∀x,y∈ℕ, x + (y+1) = (x+y*)+1	(A1) : + is associative on 1
∀A⊂ℕ,(1∈A ∧ ∀x,y∈A, x+y∈A) ⇒A=ℕ	(Min)
∀x,y∈ℕ*, x + y ≠ y	(F)

In set theory, (Min) would make ℕ* a minimal {1,+}-algebra. But let us regard our present use of set theoretical notations as mere abusive abbreviations of works in first-order logic, as long as we only consider subsets of ℕ* defined by first-order formulas in this arithmetical language. In particular, (Min) will be meant as abbreviating a schema of axioms with A ranging over all classes in this theory.
(A1) is a particular case of

∀x,y,z∈ℕ*, x + (y+z) = (x+y)+z

(As) : + is associative

Conversely, (A1 ∧ Min) ⇒ (As) :

Let A={a∈ℕ* |∀x,y ∈ℕ*, x+(y+a) = (x+y)+a}. ∀a,b∈A,
∀x,y ∈ℕ*, x + (y+(a+b)) = x + ((y+a)+b) = (x + (y+a))+b = ((x + y)+a)+b = (x+y)+(a+b)
∴ a+b ∈ A.
(A1) ⇔ 1∈A.
(A1 ∧ Min) ⇒ A=ℕ* ∎

(As ∧ Min) ⇒ (+ is commutative), as deduced from 1∈C({1}) by a previous proposition.

Now take ℕ = ℕ*∪{0} where 0∉ℕ*, to which + is extended as ∀n∈ℕ, 0+n = n+0 = n. This extension preserves its properties of commutativity and associativity.
Define S as ℕ∋x↦ x+1.
These definitions directly imply (H0).

(Ind) ⇒ (Min) :

∀A⊂ℕ*, the set A₀= A∪{0} satisfies 0∈A₀ and
(1∈A ∧ (∀x,y∈A, x+y ∈A)) ⇒ (S0∈A₀ ∧ (∀x∈A, Sx=x+1 ∈A⊂A₀)) ⇒ A₀=ℕ∎

(As ∧ Min) ⇒ (Ind)

Let A∈Sub_{0,S}ℕ, and B = {y∈ℕ* |∀x∈A, x+y∈A}.
∀y,z∈B, (∀x∈A, x+y∈A ∴ x+y+z∈A) ∴ y+z ∈B.
(∀x∈A, x+1 ∈A) ⇔ 1∈B ⇒ ((Min)⇒ B=ℕ*).
0∈A ⇒ (∀y∈B, 0+y∈A) ⇒ B⊂A∎

(F) ⇔ (∀x∈ℕ*, ∀y∈ℕ, x+y ≠ y) because x+0 = x ≠ 0.
(Inj ∧ Ind ∧ A1) ⇒ (F) : ∀x∈ℕ*,

x+0 ≠ 0
∀y∈ℕ, x+y ≠ y ⇒ x+y+1 ≠ y+1∎

For the converse, we need to use the order relation.

The order relation

From the operation of addition, let us define binary relations < and ≤ on ℕ and show that they form an order and its strict order (even in first order arithmetic); its equivalence with the definition of the order between terms in set theory in the common particular case of ℕ with full set theoretical induction, will be left here as an intuitive fact.

x<y ⇔ ∃z∈ℕ*, y = x+z
x≤y ⇔ ∃z∈ℕ, y = x+z

These are consequences of (Ind ∧ A1) :

< is transitive
x≤y ⇔ (x<y ∨ x=y)
x<y ⇔ x+1≤y
∀A⊂ℕ, A≠∅ ⇒ ∃x∈A, ∀y∈A, x≤y (this may also be interpreted either set theoretically or as a schema in first order logic)
∀x,y∈ℕ, x≤y ∨ y≤x
x<y ⇒ x+z < y+z

Proofs :

using (As), x < y < z ⇒ (∃n,p∈ℕ*, z = y+p = x+n+p) ⇒ x <z.
obvious from definitions;
thanks to (Ind), ℕ is a bijective {0,S}-algebra;
x≤y ⇒ (x+1≤y ∨ x=y)
0∈{x∈ℕ |∀y∈A, x≤y}=B
∀x∈B, x+1∈B ∨ x∈A
A⋂B=∅ ⇒ (B=ℕ ∴ A=∅)
from 4. with A={x,y}
y = x+n ⇒ y+z = x+z+n ∎

(for 5. it is also possible to more directly prove for A={x∈ℕ |∀y∈ℕ, x<y ∨ x=y ∨ y<x} that 0∈A and ∀x∈A, x+1∈A)

Now, (F) means that < is irreflexive, thus a strict total order thanks to 1. and 5.

Moreover it implies ∀x,y,z∈ℕ, (x<y ⇔ x+z < y+z) and (x = y ⇔ x+z = y+z), which gives (Inj) as a particular case.

Proof: the direct implications were shown above; the converses are deduced from there as < is a strict total order : one of (x<y, x = y, y<x) must be true while only one of (x+z<y+z, x+z=y+z, y+z<x+z) can.∎
Finally, ≤ is a total order with strict order < and every nonempty subset of ℕ has a smallest element, which is unique by antisymmetry.

Arithmetic with order

It is possible to express a first-order arithmetic with language {0,S, ≤}, stronger than {0,S} but weaker than Presburger arithmetic, in the sense that addition cannot be defined from ≤.
Namely, it can be based on the characteristion of the order by the property:
For all n ∈ℕ, the set {x∈ℕ | n ≤ x} is the unique A⊂ℕ such that

∀x∈ℕ, x∈A ⇔ (x = n ∨∃y∈A, Sy=x).

Its existence in ℘(ℕ) can be deduced by induction on n; its uniqueness for a fixed n is deduced by induction on x.