David R. MacIver's Blog: Direct proofs from the well ordering theorem

Direct proofs from the well ordering theorem

4 December 2015

In the course of writing yesterday’s post about Zorn’s lemma I noticed a different proof of the theorem that every vector space has a basis:

Let $V$ be a vector space and define some well-ordering $<$ on it. Let $B$ be the set of $x$ such that $x$ is not in the span of $\{y: y < x\}$.

Claim: B is a basis.

Proof: Certainly $B$ is linearly independent: If not, let $x_1 < \ldots < x_n \in B$ be linearly dependent. Then $x_n$ is in the span of $x_1, \ldots, x_{n - 1}$, which are $< x_n$, contradicting the definition of $B$.

Claim: For all $x$, $x$ is in the span of $\{y \in B: y \leq x\}$.

Proof: Transfinite induction. Either $x$ is not in the span of $\{y: y < x\}$, and thus $x$ is actually in this set and thus certainly in the span. Else, we can find $y_1, \ldots, y_n < x$ such that $x$ is in their span. But by induction assumption, each of these is in the span of $\{y \in B: y < x\}$, and thus so is $X$.

QED

To some extent this is just the normal proof in disguise, but I thought it was cute. I highly doubt it’s in any way original to me, but I hadn’t seen it before.

This made me wonder what other things you could prove this way. I’ve rarely seen the well ordering theorem used as a direct proof method except when you want to make some careful argument about cardinalities (e.g. one way the Continuum Hypothesis is often used is to put a well-ordering on $\mathbb{R}$ such that all the initial segments are countable).

For starters, the above proof can relatively easily be turned into a proof of the Teichmüller–Tukey lemma using essentially the same construction:

Let $X$ be some set, and let $L \subseteq \mathcal{P}(X)$ be some family of sets of finite character. Define the function $f : X \to \{0, 1\}$ as follows:

Let $T = \{y < x: f(y) = 1\}$.

Let $f(x) = 1$ if $T \cup \{x\} \in L$, else $f(x) = 0$.

Claim: $S = \{x : f(x) = 1\}$ is a maximal element of $L$.

Proof: First we show that it’s in $L$. If not, there is some finite subset $x_1 < \ldots \x_n \in S$ such that $\{x_1, \ldots, x_n\} \not\in L$. But then $f(x_n) = 1$, so with $T$ as in our definition of $f$, $T \cup \{x_n\} \in L$. Thus every finite subset of this is in $L$, and in particular $\{x_1, \ldots, x_n\} \in L$, contradicting our assumption.

Now we show it’s maximal: Suppose $S \cup \{x\} \in L$. Then we must have $f(x) = 1$, because $T \cup \{x\} \subseteq S \cup \{x\}$, and families of sets of finite character are closed under taking subsets. Thus $x \in S$.

QED

Slightly further adaptation allows us to prove Zorn’s lemma itself via this construction, and I think it actually is a slightly nicer proof than the one I presented before:

Let $T$ be some partially ordered set with relation $\prec$ such that every chain in $T$ has an upper bound.

Let $<$ be a well-order on $T$ (not necessarily compatible with $\prec$$.

Define $f : T \\to \\{0, 1\\}$ as $f(x) = 1$ if $y \\prec x$ for every $y \< x$ such that $f(y) = 1$, else $f(x) = 0$.

Let $C = \\{x: f(x) = 1\\}$.

Claim: $C$ is a chain for $\\prec$.

Proof: Let $x, y \\in C$ with $x \\neq y$. By reordering if necessary we can assume $x \< y$. Then by construction of $f$, we must have $x \\prec y$.

Thus, by assumption, there is some upper bound for $C$, call it $s$.

Claim: $s$ is maximal.

Proof: We will show that $s \\in C$. This proves that $s$ is maximal because if there were another element $t \> s$, this would also be an upper bound, which would mean it will also have to be in $C$, which contradicts that $s$ is an upper bound for $C$.

To show that $s$ is in $C$: We must have $f(s) = 1$, because $\{y: y < x, f(y) = 1 \} \subseteq C\}$, $y \\preceq s$ for all $y \\in C$, and $y \\neq s$ for $y \< s$. These are the conditions in our definition of $f$, so $f(s) = 1$ and thus $s \\in C$.

QED

Side note 1: Note that all we’ve really done in the above proof is inlined a proof of the Hausdorff maximal principle. This follows fairly straightforwardly from the Teichmüller–Tukey lemma, so having its own proof by this mechanism is not actually super useful.

Side note 2: You may have only seen the well-ordering theorem proven using Zorn’s lemma so this may seem circular, but you can prove it directly from Hartog’s lemma (which does not require the axiom of choice) as follows:

Let $X$ be some set, and let $c$ be a choice function for subsets of $X$. Let $W$ be some well ordered set that does not inject into $X$. Define $f : W \\to X \\cup \\{\\infty\\}$ as:

If $X \\subseteq \\{ f(s) : s \< t \\}$ then $f(t) = \\infty$. Else, $f(t) = c(X \\setminus \\{ f(s) : s \< t \\})$.

This must eventually hit $\\infty$ as otherwise it would be injective. Restricted to $\\{t: f(t) \\neq \\infty\\}$, f gives a bijection between $X$ and a well-ordered set, so you can use this to define a well-ordering on $X$.

QED

Is this style of proof really any better than the alternatives? I don’t know. You probably still want to use Zorn’s lemma and similar most of the time anyway. I like these proofs though, and they seem like a slightly less messy way to get a feel for transfinite induction.