Axiom of Choice implies Zorn's Lemma

Theorem
Acceptance of the Axiom of Choice implies the truth of Zorn's Lemma.

Statement of Zorn's Lemma
Let $\left({X, \preceq}\right)$ be a non-empty ordered set such that every non-empty chain in $X$ has an upper bound in $X$.

Then $X$ has at least one maximal element.

Proof
For each $x \in X$, consider the weak initial segment $\bar s \left({x}\right)$:
 * $\bar s \left({x}\right) = \left\{{y \in X: y \preceq x}\right\}$

Let $\mathbb S \subseteq P \left({X}\right)$ be the image of $\bar s$ considered as a mapping from $X$ to $P \left({X}\right)$, where $P \left({X}\right)$ is the power set of $X$.

From Ordering Equivalent to a Subset Relation:
 * $\forall x, y \in X: \bar s \left({x}\right) \subseteq \bar s \left({y}\right) \iff x \preceq y$

Thus the task of finding a maximal element of $X$ is equivalent to finding a maximal set in $\mathbb S$.
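The correspondence between the ordering and the subset relation can be checked by hand on a small finite ordered set. The following is a minimal sketch only, assuming a hypothetical example in which $X = \left\{{1, 2, 3, 4, 6, 9}\right\}$ is ordered by divisibility. The names `leq` and `weak_initial_segment` are illustrative and do not come from the proof.

```python
# Hypothetical finite example: X ordered by divisibility (an assumption for illustration).
X = [1, 2, 3, 4, 6, 9]

def leq(a, b):
    """a precedes b if and only if a divides b."""
    return b % a == 0

def weak_initial_segment(x):
    """The set of all y in X that precede x."""
    return frozenset(y for y in X if leq(y, x))

segments = {x: weak_initial_segment(x) for x in X}

# The segment of x is a subset of the segment of y exactly when x precedes y.
order_matches_inclusion = all(
    (segments[x] <= segments[y]) == leq(x, y) for x in X for y in X
)

# Maximal elements of X, computed directly from the ordering.
maximal_elements = [x for x in X if not any(x != y and leq(x, y) for y in X)]

# Elements whose segment is a maximal set in the image of the segment mapping.
maximal_segments = [x for x in X if not any(segments[x] < segments[y] for y in X)]
```

In this example the two lists agree, which is the finite counterpart of the equivalence stated above.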

Thus the statement of the result is equivalent to a statement about chains in $\mathbb S$:


 * Let $\mathbb S$ be a non-empty subset of $P \left({X}\right), X \ne \varnothing$ such that every non-empty chain in $\mathbb S$, ordered by $\subseteq$, has an upper bound in $\mathbb S$.


 * Then $\mathbb S$ has at least one maximal set.

Let $\mathbb X$ be the set of all chains in $\left({X, \preceq}\right)$.

Every element of $\mathbb X$ is included in $\bar s \left({x}\right)$ for some $x \in X$, as every chain in $X$ has an upper bound in $X$.

$\mathbb X$ is a non-empty set of sets which are ordered (perhaps partially) by subset.

If $\mathcal C$ is a chain in $\mathbb X$, then:
 * $\displaystyle \bigcup_{A \in \mathcal C} A \in \mathbb X$

Since each set in $\mathbb X$ is dominated by some set in $\mathbb S$, going from $\mathbb S$ to $\mathbb X$ cannot introduce any new maximal elements.

The main advantage of using $\mathbb X$ is that the chain hypothesis is in a slightly more specific form.

Instead of saying that each chain $\mathcal C$ in $\mathbb S$ has an upper bound in $\mathbb S$, we can explicitly state that the union of the sets of $\mathcal C$ is an element of $\mathbb X$.

This union of the sets of $\mathcal C$ is clearly an upper bound of $\mathcal C$.

Another advantage of $\mathbb X$ is that, from Subset of Toset is Toset, it contains all the subsets of each of its sets.

Thus we can enlarge non-maximal sets in $\mathbb X$ one element at a time.

So, from now on, we need consider only this non-empty collection $\mathbb X$ of subsets of a non-empty set $X$.

$\mathbb X$ is subject to two conditions:
 * $(1): \quad$ Every subset of each set in $\mathbb X$ is in $\mathbb X$.
 * $(2): \quad$ The union of each chain of sets in $\mathbb X$ is in $\mathbb X$.

It follows from $(1)$ that $\varnothing \in \mathbb X$.
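Conditions $(1)$ and $(2)$ can be verified exhaustively on a small example. The sketch below assumes the hypothetical ordered set $X = \left\{{1, 2, 3, 4, 6, 9}\right\}$ under divisibility, and all helper names are illustrative. For brevity it checks condition $(2)$ only for nested families of at most three chains.

```python
from itertools import combinations

# Hypothetical finite example: X ordered by divisibility.
X = [1, 2, 3, 4, 6, 9]

def is_chain(A):
    """True if every two elements of A are comparable under divisibility."""
    return all(a % b == 0 or b % a == 0 for a, b in combinations(A, 2))

def subsets(S):
    """All subsets of S, as frozensets."""
    S = list(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

# The collection of all chains in X, including the empty chain.
chains = [A for A in subsets(X) if is_chain(A)]
chain_set = set(chains)

# Condition (1): every subset of a chain is again a chain.
condition_1 = all(B in chain_set for A in chains for B in subsets(A))

def is_nested(family):
    """True if the family is a chain when ordered by the subset relation."""
    return all(A <= B or B <= A for A, B in combinations(family, 2))

# Condition (2), checked for nested families of one, two or three chains:
# the union of such a family is again a chain.
condition_2 = all(
    frozenset().union(*family) in chain_set
    for r in (1, 2, 3)
    for family in combinations(chains, r)
    if is_nested(family)
)
```

In a finite collection the union of a non-empty nested family is simply its largest member, so condition $(2)$ carries real content only when $X$ is infinite.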

We need to show that there exists a maximal set in $\mathbb X$.

Let $f$ be a choice function for $X$, that is, a mapping from the set of all non-empty subsets of $X$ to $X$ such that:
 * $\forall A \in P \left({X}\right) \setminus \left\{{\varnothing}\right\}: f \left({A}\right) \in A$

Such an $f$ exists by the Axiom of Choice.

For each $A \in \mathbb X$, let $\hat A$ be defined as:
 * $\hat A := \left\{{x \in X: A \cup \left\{{x}\right\} \in \mathbb X}\right\}$

That is, $\hat A$ consists of all the elements of $X$ which, when added to $A$, make a set which is also in $\mathbb X$.

From its definition, $A \subseteq \hat A \subseteq X$, because $A \cup \left\{{x}\right\} = A \in \mathbb X$ for each $x \in A$.

Hence, from Set Difference Subset, $\hat A \setminus A$ is a subset of $X$, and so is in the domain of $f$ whenever it is non-empty.

It follows that the mapping $g: \mathbb X \to \mathbb X$ may validly be defined as:
 * $\forall A \in \mathbb X: g \left({A}\right) = \begin{cases} A \cup \left\{{f \left({\hat A \setminus A}\right)}\right\} & : \hat A \setminus A \ne \varnothing \\ A & : \text{otherwise} \end{cases}$

From the definition of $\hat A$, it follows that $\hat A \setminus A = \varnothing$ iff $A$ is maximal.

Thus what we now have to prove is that:
 * $\exists A \in \mathbb X: g \left({A}\right) = A$

Note that from the definition of $g$:
 * $\forall A \in \mathbb X: A \subseteq g \left({A}\right)$

The property of $g$ that is crucial is the fact that $g \left({A}\right)$ contains at most one more element than $A$.
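The behaviour of $\hat A$ and $g$ can be illustrated on a finite example, where no appeal to the Axiom of Choice is needed. This is a sketch under stated assumptions: $X = \left\{{1, 2, 3, 4, 6, 9}\right\}$ ordered by divisibility, and `min` standing in for the choice function $f$. Both are hypothetical choices made only for illustration.

```python
from itertools import combinations

# Hypothetical finite example: X ordered by divisibility.
X = [1, 2, 3, 4, 6, 9]

def is_chain(A):
    """True if every two elements of A are comparable under divisibility."""
    return all(a % b == 0 or b % a == 0 for a, b in combinations(A, 2))

# The collection of all chains in X.
chains = {
    frozenset(c)
    for r in range(len(X) + 1)
    for c in combinations(X, r)
    if is_chain(c)
}

def hat(A):
    """All x in X whose adjunction to A still gives a chain."""
    return frozenset(x for x in X if (A | {x}) in chains)

def f(S):
    """Assumed choice function: the least element of a non-empty set of integers."""
    return min(S)

def g(A):
    """Add one chosen element to A if that is possible, otherwise return A."""
    extra = hat(A) - A
    return A | {f(extra)} if extra else A

# Sets fixed by g, and sets that are maximal in the collection of chains.
fixed_points = {A for A in chains if g(A) == A}
maximal_chains = {A for A in chains if not any(A < B for B in chains)}
```

In this example the fixed points of `g` coincide with the maximal chains, and `g(A)` never has more than one element beyond `A`.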

We (temporarily) define a tower as being a subset $\mathcal T$ of $\mathbb X$ such that:
 * $(1): \quad \varnothing \in \mathcal T$
 * $(2): \quad A \in \mathcal T \implies g \left({A}\right) \in \mathcal T$
 * $(3): \quad $ If $\mathcal C$ is a chain in $\mathcal T$, then $\displaystyle \bigcup_{A \in \mathcal C} A \in \mathcal T$

There is of course at least one tower in $\mathbb X$, as $\mathbb X$ itself is one.

It follows from its definition that the intersection of a collection of towers is itself a tower.

It follows in particular that if $\mathcal T_0$ is the intersection of all towers in $\mathbb X$, then $\mathcal T_0$ is the smallest tower in $\mathbb X$.
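For a very small ordered set, all towers can be enumerated and intersected directly. The sketch below assumes the hypothetical set $X = \left\{{1, 2, 3}\right\}$ ordered by divisibility, with `min` as an assumed choice function. The names `is_tower` and `powerset` are illustrative only.

```python
from itertools import combinations

# Hypothetical tiny example: X ordered by divisibility.
X = [1, 2, 3]

def is_chain(A):
    """True if every two elements of A are comparable under divisibility."""
    return all(a % b == 0 or b % a == 0 for a, b in combinations(A, 2))

def powerset(S):
    """All subsets of S, as frozensets."""
    S = list(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

# The collection of all chains in X.
chains = [A for A in powerset(X) if is_chain(A)]

def g(A):
    """Add the least admissible new element to A, or return A if there is none."""
    extra = [x for x in X if x not in A and is_chain(A | {x})]
    return A | {min(extra)} if extra else A

def is_nested(family):
    """True if the family is a chain when ordered by the subset relation."""
    return all(A <= B or B <= A for A, B in combinations(family, 2))

def is_tower(T):
    """Check the three tower conditions for a family T of chains."""
    return (
        frozenset() in T
        and all(g(A) in T for A in T)
        and all(frozenset().union(*fam) in T for fam in powerset(T) if is_nested(fam))
    )

# Every tower, and the intersection of all of them.
towers = [T for T in powerset(chains) if is_tower(T)]
smallest = frozenset.intersection(*towers)
```

Here `smallest` is again a tower, and it consists of exactly the sets obtained from $\varnothing$ by applying `g` repeatedly.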

Next we demonstrate that $\mathcal T_0$ is a chain.

We (temporarily) define a set $C \in \mathcal T_0$ as comparable if it is comparable with every element of $\mathcal T_0$.

That is, if $A \in \mathcal T_0$ then $C \subseteq A$ or $A \subseteq C$.

To say that $\mathcal T_0$ is a chain means that all sets of $\mathcal T_0$ are comparable.

There is at least one comparable set in $\mathcal T_0$, as $\varnothing$ is one of them.

So, suppose $C \in \mathcal T_0$ is comparable.

Let $A \in \mathcal T_0$ such that $A \subsetneq C$.

Consider $g \left({A}\right)$.

Because $C$ is comparable, either $C \subsetneq g \left({A}\right)$ or $g \left({A}\right) \subseteq C$.

In the former case $A$ is a proper subset of a proper subset of $g \left({A}\right)$.

This contradicts the fact that $g \left({A}\right) \setminus A$ can be no more than a singleton.

Thus if such an $A$ exists, we have that:
 * $(A): \quad g \left({A}\right) \subseteq C$.

Now let $\mathcal U$ be the set defined as:
 * $\mathcal U := \left\{{A \in \mathcal T_0: A \subseteq C \lor g \left({C}\right) \subseteq A}\right\}$

Let $\mathcal U'$ be the set defined as:
 * $\mathcal U' := \left\{{A \in \mathcal T_0: A \subseteq g \left({C}\right) \lor g \left({C}\right) \subseteq A}\right\}$

That is, $\mathcal U'$ is the set of all sets in $\mathcal T_0$ which are comparable with $g \left({C}\right)$.

If $A \in \mathcal U$, then as $C \subseteq g \left({C}\right)$, either $A \subseteq g \left({C}\right)$ or $g \left({C}\right) \subseteq A$.

So $\mathcal U \subseteq \mathcal U'$.

The aim now is to demonstrate that $\mathcal U$ is a tower.

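As $\mathcal T_0$ is a tower, $\varnothing \in \mathcal T_0$, and from Empty Set Subset of All, $\varnothing \subseteq C$.

Hence $\varnothing \in \mathcal U$, and so condition $(1)$ is satisfied.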

Now let $A \in \mathcal U$.

As $C$ is comparable, there are three possibilities:
 * $(1'): \quad A \subsetneq C$

Then from $(A)$ above, $g \left({A}\right) \subseteq C$.

Therefore $g \left({A}\right) \in \mathcal U$.


 * $(2'): \quad A = C$

Then $g \left({A}\right) = g \left({C}\right)$ and so $g \left({C}\right) \subseteq g \left({A}\right)$.

Therefore $g \left({A}\right) \in \mathcal U$.


 * $(3'): \quad g \left({C}\right) \subseteq A$

Then, as $A \subseteq g \left({A}\right)$, it follows that $g \left({C}\right) \subseteq g \left({A}\right)$.

Therefore $g \left({A}\right) \in \mathcal U$.

Hence condition $(2)$ is satisfied.

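Let $\mathcal C$ be a chain in $\mathcal U$.

If every set in $\mathcal C$ is a subset of $C$, then so is their union.

Otherwise some set in $\mathcal C$ has $g \left({C}\right)$ as a subset, and hence so does their union.

As $\mathcal T_0$ is a tower, the union is also in $\mathcal T_0$.

Thus the union of a chain in $\mathcal U$ is also in $\mathcal U$.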

Hence condition $(3)$ is satisfied.

The conclusion is that $\mathcal U$ is a tower such that $\mathcal U \subseteq \mathcal T_0$.

But as $\mathcal T_0$ is the smallest tower, $\mathcal T_0 \subseteq \mathcal U$.

It follows that $\mathcal U = \mathcal T_0$.

Now consider any comparable set $C$.

From that $C$ we can form $\mathcal U$, as above.

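But as $\mathcal U = \mathcal T_0$:
 * $A \in \mathcal T_0 \implies A \subseteq C \lor g \left({C}\right) \subseteq A$

In the first case, as $C \subseteq g \left({C}\right)$, it follows that $A \subseteq g \left({C}\right)$.

So every $A \in \mathcal T_0$ satisfies $A \subseteq g \left({C}\right)$ or $g \left({C}\right) \subseteq A$, and so $g \left({C}\right)$ is also comparable.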

We now know that:
 * $\varnothing$ is comparable
 * the mapping $g$ maps comparable sets to comparable sets.

Since the union of a chain of comparable sets is itself comparable, it follows that the comparable sets all form a tower $\mathcal T_C$.

But as $\mathcal T_0$ is the smallest tower, it follows that $\mathcal T_0 \subseteq \mathcal T_C$.

So the elements of $\mathcal T_0$ must all be comparable.

Since $\mathcal T_0$ is a chain, the union $M$ of all the sets in $\mathcal T_0$ is itself a set in $\mathcal T_0$.

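As $\mathcal T_0$ is a tower and $M \in \mathcal T_0$, it follows that $g \left({M}\right) \in \mathcal T_0$.

Since the union $M$ includes all the sets of $\mathcal T_0$, it follows that $g \left({M}\right) \subseteq M$.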

Since it is always the case that $M \subseteq g \left({M}\right)$, it follows that $M = g \left({M}\right)$.

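Thus $M \in \mathbb X$ satisfies $g \left({M}\right) = M$, and so $M$ is a maximal set in $\mathbb X$.

The result follows.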
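The whole construction can be traced on a finite example, where the smallest tower is obtained simply by iterating $g$ from $\varnothing$ until it stops growing. This is only a sketch under assumptions: $X = \left\{{1, 2, 3, 4, 6, 9}\right\}$ ordered by divisibility and `min` as the choice function are hypothetical choices. The shortcut of iterating `g` relies on finiteness and does not replace the argument above.

```python
from itertools import combinations

# Hypothetical finite example: X ordered by divisibility.
X = [1, 2, 3, 4, 6, 9]

def is_chain(A):
    """True if every two elements of A are comparable under divisibility."""
    return all(a % b == 0 or b % a == 0 for a, b in combinations(A, 2))

def g(A):
    """Add the least admissible new element to A, or return A if there is none."""
    candidates = [x for x in X if x not in A and is_chain(A | {x})]
    return A | {min(candidates)} if candidates else A

# In the finite case the smallest tower is the orbit of the empty set under g.
tower = [frozenset()]
while g(tower[-1]) != tower[-1]:
    tower.append(g(tower[-1]))

# The tower is a chain under inclusion, and M is the union of its sets.
tower_is_chain = all(A <= B or B <= A for A, B in combinations(tower, 2))
M = frozenset().union(*tower)

# For divisibility the greatest element of the chain M is its numeric maximum.
top = max(M)
top_is_maximal_in_X = not any(y != top and y % top == 0 for y in X)
```

With these assumptions the tower is $\varnothing, \left\{{1}\right\}, \left\{{1, 2}\right\}, \left\{{1, 2, 4}\right\}$, the set $M = \left\{{1, 2, 4}\right\}$ is fixed by `g`, and its greatest element $4$ is a maximal element of $X$.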