Skolem's Paradox

Theorem
Let $\mathcal{L}$ be a countable first-order language, and let $T$ be an $\mathcal{L}$-theory which axiomatizes some version of set theory (for example, ZFC).

There is a countable model of $T$.

Why is this called a paradox?
Skolem observed that despite this theorem, within versions of set theory such as ZFC, statements can be proven that are interpreted as asserting the existence of uncountable sets. Since the theorem above guarantees countable models of ZFC, this presents a seemingly impossible state of affairs:


 * Theorem 1) A model of ZFC exists whose universe is countable.
 * Theorem 2) It is a theorem in ZFC that there exists an uncountable set.
 * Conclusion) There is a model of ZFC which is countable but has an uncountable subset.

This argument is flawed however, which Skolem understood. The problem with this argument is that it commits something of an equivocation fallacy. The resolution of this apparent problem lies in paying careful attention to what is actually being said.

First, note that when we discuss first-order formalizations of set theory, we typically are using a language consisting of a single binary relation symbol written as $\in$. This is also the symbol we typically use in mathematics to denote set membership. To help make discussion easier, we will instead use the language $\mathcal{L} = \{\varepsilon\}$. This will allow us to continue using $\in$ without confusion when we mean the natural set membership relation. This is an important distinction: a large part of resolving Skolem's paradox is becoming aware of the different usages of the notion of membership occurring. As a side note, the particular choice of language is not important for understanding Skolem's Paradox and its resolution, so long as we stick to countable first-order ones.


 * Understanding what Theorem 1 (the theorem on this page) in the above argument states:

First, recall that a model of $\mathrm{ZFC}$ in the language $\mathcal{L}=\{\varepsilon\}$ is by definition a pair $\mathcal{M} = (M, \varepsilon_\mathcal{M})$, where $M$ is a set referred to as the universe of $\mathcal{M}$ and $\varepsilon_\mathcal{M}$ is a binary relation on $M$ referred to as the interpretation of the relation symbol $\varepsilon$, such that the axioms of $\mathrm{ZFC}$ are satisfied when interpreted with their variables ranging over $M$ and using $\in_\mathcal{M}$ for the relation symbol $\in$.

As an example, the axiom asserting the existence of an empty set would be written as an $\mathcal{L}$-sentence as $\exists x \forall y (\neg y \varepsilon x)$. For $\mathcal{M}$ to be a model of $\mathrm{ZFC}$, then in particular this axiom must be satisfied by $\mathcal{M}$. This is the case if there exists an $x\in M$ such that for all $y\in M$, it is false that $y \varepsilon_\mathcal{M} x$. Again, note the different usages of $\in$, $\varepsilon$, and $\varepsilon_\mathcal{M}$.

With this in mind, Theorem 1 says:


 * Theorem 1) With the axioms of $\mathrm{ZFC}$ written as sentences in the language $\mathcal{L} = \{\varepsilon\}$, there exists an $\mathcal{L}$-structure $\mathcal{M} = (M, \varepsilon_\mathcal{M})$ which models $\mathrm{ZFC}$ such that there exists a bijection between $\N$ and the universe $M$ of $\mathcal{M}$.


 * Understanding Theorem 2:

What we mean when we say there is a theorem in ZFC which says that there is an uncountable set is that there is an $\mathcal{L}=\{\varepsilon\}$ statement deducible from the axioms of ZFC such as
 * $\displaystyle \exists x \left(\neg\exists f \left(\mathrm{Function}(f) \land \mathrm{Bijection}(f) \land \mathrm{Domain}(f)=x \land \mathrm{Range}(f) = \N\right)\right)$

where several component formulas have been abbreviated with names.

We often read this in natural English language as "there exists a set $x$ such that there is no bijection between $x$ and $\N$".

By the soundness theorem for first-order logic, knowing that ZFC has a theorem means that all models of ZFC must satisfy that sentence. But, as mentioned above, this means that the sentence is true when variables range over the universe of the model and when the relation symbol $\varepsilon$ is interpreted as the relation provided by the model.

So, rather importantly, all this tells us is that since $\mathcal{M}=(M,\varepsilon_\mathcal{M})$ is a model of $\mathrm{ZFC}$, it must be the case that there is some $x\in M$ for which there is no $f \in M$ such that $x$ and $f$ satisfy the $\mathcal{L}$-formulas demanded of them.

The important point here is that $\mathcal{M}$ can be a model by failing to contain an $f$ with certain properties.

This observation is enough to overcome the apparent paradox, but also worth noting is that if the part of the sentence abbreviated as $\mathrm{Function}(f) \land \mathrm{Bijection}(f) \land \mathrm{Domain}(f)=x \land \mathrm{Range}(f) = \N$ were true in $\mathcal{M}$ about some $x, f \in M$, it would not necessarily mean that $f$ is "really" a bijective function with domain $x$ and range $\N$, since, as far as $\mathcal{M}$ is concerned, these are formulas which say things about its universe $M$ and its relation $\varepsilon_\mathcal{M}$. So, it is more correct to speak about these things in relative terms: for example $f$ would be a "function relative to $(M, \varepsilon_\mathcal{M})$".

In short, Theorem 2 in the fallacious argument above only tells us that a model's universe must not contain something it interprets as a bijective function with certain properties using its provided relation; it does not tell us that actual bijections can't exist outside of the universe of the model.

With this in mind (and applying the soundness theorem), Theorem 2 should be stated as:


 * Theorem 2) If $\mathcal{M}=(M, \varepsilon_\mathcal{M})$ is a model of $\mathrm{ZFC}$, then there exists an $x \in M$ for which there is no $f \in M$ such that ...

It should now be easier to see that while Theorem 1 makes a statement about the existence of a bijection between the universe $M$ of a model $\mathcal{M}=(M, \varepsilon_\mathcal{M})$, Theorem 2 makes a statement about elements of $M$ and how they act using $\varepsilon_\mathcal{M}$. Stated this way, the conclusion no longer appears to follow from these theorems. The issue is perhaps summarized in the short note that "statements true from the internal perspective of a model are not necessarily true outside the model".

Proof
This is a straightforward application of the downward Löwenheim-Skolem Theorem.