# Cook-Levin Theorem

## Proof

### The Boolean Satisfiability Problem is in NP

Given a Boolean Satisfiability Problem with a set of variables $X$ and clauses $L$, together with a proposed solution (a truth assignment to the variables in $X$), each clause in $L$ can be evaluated in time proportional to its length, so the solution can be verified in polynomial time.
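The verification step can be illustrated with a minimal Python sketch. It assumes that each clause is a disjunction of literals and that the formula is the conjunction of its clauses; the names `verify`, `Literal` and `Clause` and the data layout are illustrative choices, not part of the proof.

```python
from typing import Dict, List, Tuple

# A literal is (variable_name, is_positive); a clause is a list of literals
# read as a disjunction; a formula is a list of clauses read as a conjunction.
Literal = Tuple[str, bool]
Clause = List[Literal]


def verify(clauses: List[Clause], assignment: Dict[str, bool]) -> bool:
    """Return True iff every clause contains at least one satisfied literal."""
    return all(
        any(assignment[var] == positive for var, positive in clause)
        for clause in clauses
    )


# (x1 or not x2) and (x2 or x3)
example = [[("x1", True), ("x2", False)], [("x2", True), ("x3", True)]]
print(verify(example, {"x1": True, "x2": True, "x3": False}))   # True
print(verify(example, {"x1": False, "x2": True, "x3": False}))  # False
```

Each literal is inspected at most once, so the running time is linear in the total length of the clauses.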

### All NP problems are polynomially reducible to the Boolean Satisfiability Problem

The objective is, given a non-deterministic Turing Machine $M$ with $k$ internal states and an input $x$, to construct a set of variables and clauses that has a solution iff $M$ accepts the input $x$.

From the definition of NP we know that $M$ either accepts the input $x$ within $p \left({\left\vert{x}\right\vert}\right)$ steps, or it does not accept the input at all, where $p$ is some polynomial.

From Turing Machine cannot use More Squares of Memory than the Number of Steps that it Runs, we only need to concern ourselves with the first $p \left({\left\vert{x}\right\vert}\right)$ squares of memory.

Let $\Sigma$ denote the finite alphabet that the machine recognizes.

Let the variables for the Boolean Satisfiability Problem be given by:

| Variable | Interpretation | Number |
|---|---|---|
| $q_{j \ n}$ | $M$ is in state $j$ on step $n$ | $k * p \left({\left\vert{x}\right\vert}\right)$ |
| $T_{m \ \alpha \ n}$ | The $m$th square contains the symbol $\alpha \in \Sigma$ on step $n$ | $\left\vert{\Sigma}\right\vert * p \left({\left\vert{x}\right\vert}\right)^2$ |
| $H_{m \ n}$ | The $m$th square is being looked at during step $n$ | $p \left({\left\vert{x}\right\vert}\right)^2$ |
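The three families of variables can be enumerated mechanically. The following Python sketch assumes that steps and squares are both indexed from $0$ to $p \left({\left\vert{x}\right\vert}\right) - 1$, which matches the counts given above; the function name `variables` and the string encoding of the variable names are hypothetical.

```python
from itertools import product


def variables(k, sigma, p):
    """List variable names for k states, alphabet sigma and step/space bound p."""
    steps = range(p)    # assumption: steps are indexed 0 .. p - 1
    squares = range(p)  # assumption: squares are indexed 0 .. p - 1
    q = [f"q_{j}_{n}" for j, n in product(range(k), steps)]
    t = [f"T_{m}_{a}_{n}" for m, a, n in product(squares, sigma, steps)]
    h = [f"H_{m}_{n}" for m, n in product(squares, steps)]
    return q, t, h


q, t, h = variables(k=3, sigma="01_", p=4)
print(len(q), len(t), len(h))  # 12 48 16
```

With $k = 3$, $\left\vert{\Sigma}\right\vert = 3$ and a bound of $4$, the counts are $3 * 4$, $3 * 4^2$ and $4^2$ respectively.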

Let the clauses for the Boolean Satisfiability Problem be given by:

| Clause | Interpretation | Number | Length |
|---|---|---|---|
| $q_{0 \ 0}$ | The machine is in the initial state $\left({q_0}\right)$ on step $0$. | $1$ | Constant |
| For $m < \left\vert{x}\right\vert$: $T_{m \ x_m \ 0}$ | $x_m$ denotes the $m$th symbol of the input. These clauses fix the initial contents of the work tape. | $\left\vert{x}\right\vert$ | Constant |
| $H_{0 \ 0}$ | The tape is in the initial position on step $0$. | $1$ | Constant |
| For $a \ne b$: $q_{a \ n} \implies \neg q_{b \ n}$ | Only one internal state at a time. | $\left({k^2 - k}\right) * p \left({\left\vert{x}\right\vert}\right)$ | Constant |
| For $a \ne b$: $T_{m \ a \ n} \implies \neg T_{m \ b \ n}$ | Only one symbol per square at a time. | $\left({\left\vert{\Sigma}\right\vert^2 - \left\vert{\Sigma}\right\vert}\right) * p \left({\left\vert{x}\right\vert}\right)^2$ | Constant |
| For $a \ne b$: $H_{a \ n} \implies \neg H_{b \ n}$ | Only one square is being looked at during any step. | $p \left({\left\vert{x}\right\vert}\right)^2 - p \left({\left\vert{x}\right\vert}\right)$ | Constant |
| $\neg H_{m \ n} \implies \left({T_{m \ \alpha \ n} = T_{m \ \alpha \ n+1} }\right)$ | A square that was not looked at cannot change. | $\left\vert{\Sigma}\right\vert * p \left({\left\vert{x}\right\vert}\right)^2$ | Constant |
| $q_{\text{accept} \ p \left({\left\vert{x}\right\vert}\right)}$ | $M$ is in the accepting state at the end of the computation. | $1$ | Constant |
| $\left({q_{j \ n} \land T_{m \ \alpha \ n} \land H_{m \ n} }\right) \implies \bigvee \left({q_{l \ n + 1} \land T_{m \ \beta \ n + 1} \land H_{m \pm 1 \ n + 1} }\right)$ | These clauses encode the rules of $M$ as logical expressions. See the explanation below. | $k * \left\vert{\Sigma}\right\vert * p \left({\left\vert{x}\right\vert}\right)^2$ | Varies (see below) |
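The "only one at a time" constraints all follow the same pattern: an implication $A \implies \neg B$ is equivalent to the two-literal clause $\neg A \lor \neg B$. As a minimal sketch, the Python below generates the state-uniqueness clauses, assuming steps are indexed from $0$ to $p \left({\left\vert{x}\right\vert}\right) - 1$ and that a literal is a pair of a variable name and a polarity; the function name and the encoding are hypothetical.

```python
from itertools import permutations


def one_state_at_a_time(k, p):
    """Clauses q_{a n} => not q_{b n} for a != b, as (not q_{a n}) or (not q_{b n})."""
    return [
        [(f"q_{a}_{n}", False), (f"q_{b}_{n}", False)]
        for n in range(p)                      # one group of clauses per step
        for a, b in permutations(range(k), 2)  # ordered pairs with a != b
    ]


clauses = one_state_at_a_time(k=3, p=4)
print(len(clauses))  # 24, that is (3**2 - 3) * 4
```

Ordered pairs are used here so that the count agrees with the $k^2 - k$ factor; taking unordered pairs would halve the number of clauses without changing what they express.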

A production rule in a non-deterministic Turing machine can be written in the form:

$\left({q_a, \alpha}\right) \to \left({q_b, \beta, D_1}\right) \lor \left({q_c, \gamma, D_2}\right)$

meaning:

if the machine is in state $q_a$ and reading $\alpha$ on the tape, then:

- either:
  - replace $\alpha$ with $\beta$
  - move one square in direction $D_1$ (either left or right)
  - change the internal state to $q_b$
- or:
  - replace $\alpha$ with $\gamma$
  - move one square in direction $D_2$
  - change the internal state to $q_c$.

If $D_1$ is left and $D_2$ is right, then for each square $m$ and each step $n$ this rule translates to:

$\left({q_{a \ n} \land T_{m \ \alpha \ n} \land H_{m \ n} }\right) \implies \left({\left({q_{b \ n+1} \land T_{m \ \beta \ n+1} \land H_{m-1 \ n+1} }\right) \lor \left({q_{c \ n+1} \land T_{m \ \gamma \ n+1} \land H_{m+1 \ n+1}}\right)}\right)$

If the rule in the machine allows for more than two choices, then this rule can be modified by adding more triplets to the right-hand side of the implication.

The length of this clause is determined by the number of choices that $M$ gives for a given internal state and a given input.

Because this number of choices is bounded for any given machine, the total space used for this group of clauses is $O \left({p \left({\left\vert{x}\right\vert}\right)^2}\right)$.
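One way to turn such an implication into clauses is to distribute the disjunction over the conjunctions, picking one literal from each triplet. The Python sketch below does this for a single square $m$ and step $n$; the distribution step, the function name `rule_to_cnf` and the encoding of directions as $-1$ (left) and $+1$ (right) are assumptions made for illustration, not something the proof prescribes.

```python
from itertools import product


def rule_to_cnf(state, symbol, choices, m, n):
    """CNF clauses for (q and T and H) => OR over choices of (q' and T' and H').

    choices is a list of (next_state, written_symbol, direction) triples,
    where direction is -1 for left and +1 for right.
    """
    # Negated premise: not q_{state n} or not T_{m symbol n} or not H_{m n}.
    premise = [
        (f"q_{state}_{n}", False),
        (f"T_{m}_{symbol}_{n}", False),
        (f"H_{m}_{n}", False),
    ]
    # One triplet of positive literals per non-deterministic choice.
    triples = [
        [
            (f"q_{s}_{n + 1}", True),
            (f"T_{m}_{w}_{n + 1}", True),
            (f"H_{m + d}_{n + 1}", True),
        ]
        for s, w, d in choices
    ]
    # Distribute: each clause takes exactly one literal from every triplet.
    return [premise + list(pick) for pick in product(*triples)]


cnf = rule_to_cnf("a", "alpha", [("b", "beta", -1), ("c", "gamma", +1)], m=5, n=2)
print(len(cnf))     # 9
print(len(cnf[0]))  # 5
```

For a rule with $c$ choices this produces $3^c$ clauses of $3 + c$ literals each, which is a constant for a fixed machine $M$.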

In total, the size of the Boolean Satisfiability problem is $O \left({p \left({ \left\vert{x}\right\vert }\right)^2 }\right)$, with the constant depending on $M$.

The conversion from a description of $M$ to the Boolean Satisfiability problem is straightforward and can be done in polynomial time.
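As one small piece of that conversion, the single-literal clauses for the initial configuration and for acceptance can be read off directly from the input $x$. This is a sketch only: the function name `unit_clauses`, the state label `"accept"` and the string encoding of variables are hypothetical.

```python
def unit_clauses(x, p, accept="accept"):
    """Unit clauses: initial state, initial head position, input on tape, acceptance."""
    clauses = [[("q_0_0", True)], [("H_0_0", True)]]
    # Square m holds the m-th input symbol on step 0.
    clauses += [[(f"T_{m}_{sym}_0", True)] for m, sym in enumerate(x)]
    # M is in the accepting state on step p.
    clauses.append([(f"q_{accept}_{p}", True)])
    return clauses


units = unit_clauses("101", p=6)
print(len(units))  # 6
```

The loop runs once per input symbol, so this part of the construction takes time linear in $\left\vert{x}\right\vert$.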

The problem described has a solution if and only if $M$ accepts $x$ as its input.

Since $M$ was an arbitrary polynomial-time non-deterministic Turing machine, all NP problems are polynomially reducible to the Boolean Satisfiability Problem.

Therefore the Boolean Satisfiability Problem is NP-hard.

Since it is also in NP, the Boolean Satisfiability Problem is NP-complete.

$\blacksquare$

## Source of Name

This entry was named for Stephen Arthur Cook and Leonid Anatolievich Levin.

## Historical Note

The Boolean Satisfiability Problem was thus the first problem known to be NP-complete.