Note: This is mostly all stuff you’d probably learn in your first six months of a maths degree, I just like occasionally chewing on the fundamentals and seeing how they fit together. There’s nothing very deep here, but it might be pedagogically interesting.

You’re probably familiar with the natural numbers: They’re a set \(\mathbb{N}\) with a distinguished element \(0 \in \mathbb{N}\) and a successor function \(S: \mathbb{N} \to \mathbb{N}\) which is injective, has \(0 \ne S(n)\) for all \(n \in \mathbb{N}\) and has the property that if \(A \subseteq N\) with \(0 \in A\) and \(S(A) \subseteq A\) then \(A = \mathbb{N}\) (“the induction principle”).

The natural numbers are the most basic example of an infinite set, and are usually postulated to exist as an axiom of set theory. We then construct all other infinite sets starting from \(\mathbb{N}\) using e.g. power set operations.

But what is an infinite set?

There turn out to be more or less three ways of defining this, all of which are equivalent (under certain assumptions).

One way of defining it is any set which is not a finite set, but that reduces to the question “What is a finite set?”. To which the answer is any set which is in bijection (one to one correspondence) with one of the sets \(\{i \in \mathbb{N} : i < n\}\) for some \(n \in \mathbb{N}\) (you need to define the relation \(<\) on \(\mathbb{N}\) for this, but lets assume we’ve done that in some standard way e.g. by induction).

You can show inductively then that \(\mathbb{N}\) has no injection into a finite subset of itself (and thus is infinite) as follows: It’s certainly not in bijection with the empty set, which is \(\{i: i < 0\}\). Suppose you had \(f: \mathbb{N} \to \{i \in \mathbb{N} : i < n + 1\}\) a bijection. By composing it with a swap if necessary, you can assume that \(f(0) = n\). But then \(f \cdot S\) is a bijection with \(\{i \in \mathbb{N} : i < n \}\), which contradicts our induction hypothesis.

Another definition of a set being infinite is that it contains a bijective copy of \(\mathbb{N}\) (i.e. there is an injection from \(\mathbb{N}\) into it).

If there is such an injection then the set is certainly infinite in the sense of not being finite, as otherwise we’d be able to compose the injection from \(\mathbb{N}\) into it with its bijection to a finite set, which contradicts what we showed above about \(\mathbb{N}\) being infinite.

To show the opposite direction we need the axiom of choice (we actually only need a weak form of it, but we’ll ignore that detail): For any set \(X\) there is a choice function \(c : \mathcal{P}(X) \setminus \{\emptyset\} \to X\) with \(c(A) \in A\).

If \(X\) is infinite in the sense of not being finite, we can then construct a function \(f: \mathbb{N} \to X\) as follows:

- \(f(0) = c(X)\)
- \(f(S(n)) = c(X \setminus \{f(0), \ldots f(n)\})\)

This is well defined because the set we’re passing to \(c\) must be non-empty, or we’d have constructed a bijection with \(\{0, \ldots n\} = \{i : i < n + 1\}\).

This definition of infinity justifies the fundamental role of the natural numbers (and the title) to some degree: They’re literally the smallest infinite set, because a copy of them is contained in every other infinite set.

But finally, there is a third definition of infinity which has no reference to the natural numbers at all: \(X\) is infinite if there is some injective function \(f: X \to X\) whose range is a proper subset of \(X\). i.e. \(f\) is injective but not surjective. You can justify this by analogy with the natural numbers as our basic example of an infinite set: \(S\) is such a function, because it is injective but does not map anything to \(0\). You can also show inductively that all of the sets \(\{i: i < n\}\) do not have this property (more or less by the same argument we used to show that there was no injection from \(\mathbb{N}\) into them).

Lets see that this is equivalent to the previous definition.

First, suppose we’ve got some injection \(h: \mathbb{N} \to X\). We can construct such an \(f\) as follows:

- if \(x \neq h(n)\) for any \(n\), then \(f(x) = x\)
- Let \(f(h(n)) = h(S(n))\)

i.e. we define our function to be the successor function on our copy of \(\mathbb{N}\) and the identity elsewhere.

This is still injective, because it’s injective on each member of the partition into \(h(\mathbb{N})\) and \(X \setminus h(\mathbb{N})\), and it doesn’t map either part into the other. It is however not surjective, because it maps no values to \(h(0)\).

So a set which contains a copy of \(\mathbb{N}\) must necessarily be infinite in this sense.

Now, suppose we have such an \(f\). We can use this to construct an injection from \(\mathbb{N}\) into \(X\) as follows:

- Pick \(h(0)\) as an arbitrary element of \(X \setminus f(X)\)
- Let \(h(S(n)) = f(h(n))\)

We can prove this is injective by inductively showing that \(h(n) \not\in \{h(m): m < n\}\)

Proof: Trivially true for \(0\) because that’s the empty set. For \(S(n)\), note that again \(h(S(n)) \ne 0\) because by choice of \(h(0)\) we must have \(f(x) \ne h(0)\) for any \(x\).

Supposing \(h(S(n)) = h(m)\) for some \(0 < m \leq n\), we must have \(f(h(n)) = h(m)\). But we can write \(m = S(k)\) for some \(k\) because \(m \neq 0\), so we have \(f(h(n)) = f(h(k))\). \(f\) is an injection, so \(h(n) = h(k)\). By our inductive hypothesis, \(h\) is an injection on \(\{1, \ldots, n\}\), so we must have \(n = k\). Therefore \(h\) is an injection as desired.

So we have three notions of infinity, all equivalent (up to the axiom of choice). Why do we care?

Well, one interesting thing about the last definition is that it’s a notion of infinity that has nothing to do with the natural numbers. This has the nice benefit that we can run it backwards: If we have a set that is infinite in that sense, then we can construct the natural numbers out of it. This is a pretty strong justification of the natural numbers as our fundamental notion of an infinite set.

To see how to do this, suppose we’ve got a set \(X\) and an injective function \(S : X \to X\) which is not surjective.

Pick some distinguished element \(0 \in X \setminus S(X)\).

Now consider the family \(\mathcal{F}\) of sets \(U \subseteq X\) with the properties that \(0 \in U\) and \(S(U) \subseteq U\). This family is nonempty because trivially \(X \in \mathcal{F}\).

Now, define \(N = \bigcap \mathcal{F}\).

We must have \(N \in \mathcal{F}\), because \(0 \in N\) and if \(U \in \mathcal{F}\) then \(S(N) \subseteq S(U) \subseteq U\), so taking intersections over all \(U \in \mathcal{F}\) we have \(S(N) \subseteq \bigcap \mathcal{F} = N\).

Moreover, if \(A \subseteq N\) with \(0 \in A\) and \(S(A) \subseteq A\), then \(A \in \mathcal{F}\), and thus \(N \subseteq A\), and so \(N = A\).

But these are precisely the requirements for \(N\) to be a model of the natural numbers, \(\mathbb{N}\)! We have a distinguished element \(0\), an injective function \(S\) that doesn’t map anything to \(0\), and it satisfies the induction principle. And, importantly, we got here without assuming that anything that looked like the natural numbers existed: All we required was the apparently weaker claim about some form of infinite set existing.

So pretty much wherever you start in defining infinite sets, you’re going to end up with the natural numbers at some point.

In the absence of at least some weak form of the axiom of choice these equivalences break down a bit, but they’re still compelling enough and strong enough that they fairly firmly place the natural numbers as the foundational object of infinite set theory.