Primary Decomposition Theorem

Theorem
Let $T:V \to V$ be a linear operator on a vector space $V$ over a field $K$, and let $p(x) \in K[x]$ be a polynomial such that $\deg(p)\ge 1$ and $p(T)=0$, where $0$ is the zero operator on $V$. Because $K[x]$ is a unique factorization domain, we can write $p(x)=c p_1(x)^{a_1} p_2(x)^{a_2} \cdots p_r(x)^{a_r}$ for some constant $c \in K \setminus \{ 0 \}$, some integers $r, a_1, a_2, \ldots, a_r \in \Z_{\ge 1}$ and some distinct irreducible monic polynomials $p_1(x), p_2(x), \ldots, p_r(x)$. The primary decomposition theorem then states the following:

i) $\ker(p_i(T)^{a_i})$ is a $T$-invariant subspace of $V$ for all $i=1,2,\ldots,r$

ii) $V=\displaystyle \bigoplus_{i=1}^r \ker(p_i(T)^{a_i})$
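
Example (an illustration added here, with the matrix and polynomial chosen purely for convenience). Take $K=\R$, $V=\R^3$ with standard basis $e_1, e_2, e_3$, and let $T$ be multiplication by the matrix $A$ below.
 * $A=\begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}, \quad p(x)=(x-1)^2(x-2)$
 * $(A-I)^2=\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad A-2I=\begin{pmatrix} -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad p(A)=(A-I)^2(A-2I)=0$
 * $\ker((A-I)^2)=\operatorname{span}(e_1,e_2), \quad \ker(A-2I)=\operatorname{span}(e_3)$

Here $p_1(x)=x-1$, $a_1=2$, $p_2(x)=x-2$ and $a_2=1$. Both kernels are $T$-invariant, since $Ae_1=e_1$, $Ae_2=e_1+e_2$ and $Ae_3=2e_3$, and $\R^3=\operatorname{span}(e_1,e_2) \oplus \operatorname{span}(e_3)$, in agreement with i) and ii).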

Proof of i)
Let $v \in \ker(p_i(T)^{a_i})$. Then, since $T$ commutes with every polynomial in $T$,
 * $p_i(T)^{a_i}(T(v))=T(p_i(T)^{a_i}(v))=T(0)=0$

This shows that $T(v)\in \ker(p_i(T)^{a_i})$ for all $v \in \ker(p_i(T)^{a_i})$, because $v$ was arbitrary in $\ker(p_i(T)^{a_i})$. That is, $\ker(p_i(T)^{a_i})$ is a $T$-invariant subspace of $V$.

Proof of ii)
Proof by induction on r :

For all $r\in \Z_{\ge 1}$, let $P(r)$ be the proposition :
 * $V=\displaystyle \bigoplus_{i=1}^r \ker(p_i(T)^{a_i})$

In this proposition $V$ and $T$ are arbitrary and $p_i(x)^{a_i}$ ($i=1,2,\cdots,r$) must satisfy the hypotheses of the theorem.

Basis for the Induction
$P(1)$ is true, as $p(T)=c p_1(T)^{a_1}=0 \implies p_1(T)^{a_1}=0 \implies V=\ker(p_1(T)^{a_1})$

Induction Hypothesis

 * Now we need to show that $\forall k \in \Z_{\ge 2} : P(k-1) \implies P(k)$ by showing that under the assumption that $P(k-1)$ is true for some $k \in \Z_{\ge 2}$, $P(k)$ must also be true.

So this is our induction hypothesis :
 * $V=\displaystyle \bigoplus_{i=1}^{k-1} \ker(p_i(T)^{a_i})$ for some $k \in \Z_{\ge 2}$

In this induction hypothesis $V$ and $T$ are arbitrary and $p_i(x)^{a_i}$ ($i=1,2,\cdots,k-1$) must satisfy the hypotheses of the theorem.

Induction Step
Suppose the hypotheses of the theorem hold with $r=k$. Then $p(x) \in K[x]$ is a polynomial such that $\deg(p)\ge 1$, $p(T)=0$ and $p(x)=c p_1(x)^{a_1} p_2(x)^{a_2} \cdots p_k(x)^{a_k}$. Define $W:=\ker(p_2(T)^{a_2} \cdots p_k(T)^{a_k})$.


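 * First, $W$ is a $T$-invariant subspace of $V$. Indeed, let $w \in W$. Then, since $T$ commutes with every polynomial in $T$,
 * $p_2(T)^{a_2} \cdots p_k(T)^{a_k}(T(w))=T(p_2(T)^{a_2} \cdots p_k(T)^{a_k}(w))=T(0)=0$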

This shows that $T(w)\in W$ for all $w \in W$, because $w$ was arbitrary in $W$. That is, $W$ is a $T$-invariant subspace of $V$, so it makes sense to talk about the restriction $T \restriction_W$ of $T$ to $W$.


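 * Now, define $q(x):=p_2(x)^{a_2} \cdots p_k(x)^{a_k}$. Let $w \in W$. Then, because $T \restriction_W$ agrees with $T$ on $W$ and $W=\ker(q(T))$,
 * $q(T \restriction_W)(w)=q(T)(w)=p_2(T)^{a_2} \cdots p_k(T)^{a_k}(w)=0$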

Because $w$ was arbitrary in $W$, this shows that $q(T \restriction_W)=0$, where $0$ is the zero operator on $W$.

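The hypotheses of the theorem are therefore met by the operator $T \restriction_W$ on $W$ and the polynomial $q(x)$, which has $k-1$ distinct irreducible monic factors. By the induction hypothesis, we have that $\displaystyle W=\bigoplus_{i=2}^{k} \ker(p_i(T \restriction_W)^{a_i})$.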


 * Now we show that $\ker(p_i(T \restriction_W)^{a_i})=\ker(p_i(T)^{a_i})$ for all $i=2,\ldots,k$.

$(\subseteq)$ We have that $\ker(p_i(T \restriction_W)^{a_i})=\{ w \in W : p_i(T \restriction_W)^{a_i}(w)=0 \}=\{ w \in W : p_i(T)^{a_i}(w)=0 \} \subseteq \ker(p_i(T)^{a_i})=\{ v \in V : p_i(T)^{a_i}(v)=0 \}$ for all $i=2,\ldots,k$

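$(\supseteq)$ We have that $\ker(p_i(T)^{a_i}) \subseteq W$ for all $i=2,\ldots,k$. Indeed, if $v \in \ker(p_j(T)^{a_j})$ where $j \in \{ 2,\ldots,k \}$, then, since polynomials in $T$ commute with each other,
 * $\displaystyle p_2(T)^{a_2} \cdots p_k(T)^{a_k}(v)=\left( \prod_{i=2, i \ne j}^{k} p_i(T)^{a_i} \right) \left( p_j(T)^{a_j}(v) \right)=\left( \prod_{i=2, i \ne j}^{k} p_i(T)^{a_i} \right)(0)=0$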

So $v \in \ker(p_2(T)^{a_2} \cdots p_k(T)^{a_k})=W$, and hence $\ker(p_j(T)^{a_j}) \subseteq W$ because $v$ was arbitrary in $\ker(p_j(T)^{a_j})$. Consequently, for every $i=2,\ldots,k$ and every $v \in \ker(p_i(T)^{a_i})$ we have $v \in W$ and $p_i(T \restriction_W)^{a_i}(v)=p_i(T)^{a_i}(v)=0$, that is, $v \in \ker(p_i(T \restriction_W)^{a_i})$. Since $v$ was arbitrary in $\ker(p_i(T)^{a_i})$, this gives $\ker(p_i(T)^{a_i}) \subseteq \ker(p_i(T \restriction_W)^{a_i})$ for all $i=2,\ldots,k$.

This shows that $\ker(p_i(T \restriction_W)^{a_i})=\ker(p_i(T)^{a_i})$ for all $i=2,\ldots,k$, so we conclude that :
 * $\displaystyle W=\bigoplus_{i=2}^{k} \ker(p_i(T)^{a_i})$


 * In order to show that $V=\displaystyle \bigoplus_{i=1}^{k} \ker(p_i(T)^{a_i})$, we equivalently show that :

a) $V=W+\ker(p_1(T)^{a_1})$

b) $W \cap \ker(p_1(T)^{a_1})=0$, where $0=\{0\}\subseteq V$

Notice that b) implies that $\ker(p_1(T)^{a_1}) \cap \ker(p_i(T)^{a_i})=0$ for all $i=2,\ldots,k$, because $\ker(p_i(T)^{a_i}) \subseteq W$ for all $i=2,\ldots,k$. Together with the decomposition of $W$ obtained above, a) and b) give $\displaystyle V=\ker(p_1(T)^{a_1}) \oplus W=\bigoplus_{i=1}^{k} \ker(p_i(T)^{a_i})$.

a) We have that $g(x):=\gcd(p_1(x)^{a_1},p_2(x)^{a_2} \cdots p_k(x)^{a_k})=1 \in K$. Indeed, $p_1(x)$ being an irreducible monic polynomial, either $g(x)=1$ or $g(x)=p_1(x)^{z}$ for some $1 \le z \le a_1$. But the greatest common divisor of two polynomials divides both of them, so we must conclude that $g(x)=1$: otherwise $p_1(x)$ would divide $p_2(x)^{a_2} \cdots p_k(x)^{a_k}$ and we would have $p_1(x)=p_i(x)$ for some $i \in \{ 2,\ldots,k \}$, contradicting the hypothesis that the $p_i(x)$ ($i=1,2,\ldots,k$) are all distinct.

So, by Bézout's identity for polynomials, we know that $\exists h(x), h_1(x) \in K[x]$ such that
 * $1=h_1(x)p_1(x)^{a_1}+h(x)p_2(x)^{a_2} \cdots p_k(x)^{a_k}$

Let $v \in V$. Then, evaluating the last polynomial equality at $T$, and evaluating the resulting operator at $v$, we obtain
 * $v=h_1(T)p_1(T)^{a_1}(v)+h(T)p_2(T)^{a_2} \cdots p_k(T)^{a_k}(v)$

Notice that $I(v)=v$ on the left hand side.

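But, since $p(T)=c p_1(T)^{a_1} p_2(T)^{a_2} \cdots p_k(T)^{a_k}=0$ with $c \ne 0$ and polynomials in $T$ commute with each other,
 * $p_1(T)^{a_1}(h(T)p_2(T)^{a_2} \cdots p_k(T)^{a_k}(v))=h(T)(c^{-1}p(T)(v))=h(T)(0)=0$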

So, $h(T)p_2(T)^{a_2} \cdots p_k(T)^{a_k}(v) \in \ker(p_1(T)^{a_1})$.

Similarly, we find that $p_2(T)^{a_2}p_3(T)^{a_3} \cdots p_k(T)^{a_k}(h_1(T)p_1(T)^{a_1}(v))=0$ and so $h_1(T)p_1(T)^{a_1}(v) \in \ker(p_2(T)^{a_2}p_3(T)^{a_3} \cdots p_k(T)^{a_k})=W$.

Because $v$ was arbitrary in $V$, we may conclude that
 * $v=h_1(T)p_1(T)^{a_1}v+h(T)p_2(T)^{a_2} \cdots p_k(T)^{a_k}v \implies V=W+\ker(p_1(T)^{a_1})$
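
For a concrete instance of this step, assume (for illustration only) that $K=\R$, $V=\R^3$ and $T$ is multiplication by $A=\begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}$, with $p(x)=(x-1)^2(x-2)$, so that $k=2$, $p_1(x)^{a_1}=(x-1)^2$ and $p_2(x)^{a_2}=x-2$. One possible Bézout identity is
 * $1=1 \cdot (x-1)^2+(-x)(x-2)$

so we may take $h_1(x)=1$ and $h(x)=-x$. Evaluating at $A$ gives
 * $h_1(A)(A-I)^2=\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad h(A)(A-2I)=-A(A-2I)=\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$

These two matrices sum to $I$, so every $v=(v_1,v_2,v_3)$ splits as $(0,0,v_3)+(v_1,v_2,0)$, where the first term lies in $W=\ker(A-2I)=\operatorname{span}(e_3)$ and the second lies in $\ker((A-I)^2)=\operatorname{span}(e_1,e_2)$.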

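b) Define $S:=W \cap \ker(p_1(T)^{a_1})$, $M:=\{ f(x) \in K[x] : f(T)(s)=0 \ \forall s \in S \}$ and $q(x):=\gcd(M)$, the monic generator of the ideal $M$ of $K[x]$. Here $q(x)$ denotes a new polynomial, unrelated to the $q(x)$ used earlier in the induction step.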

Because $p_2(x)^{a_2} \cdots p_k(x)^{a_k} \in M$ (every element of $S$ is also an element of $W=\ker(p_2(T)^{a_2} \cdots p_k(T)^{a_k})$) and $p_1(x)^{a_1} \in M$ (every element of $S$ is also an element of $\ker(p_1(T)^{a_1})$), we see that
 * $q(x) \mid \gcd(p_1(x)^{a_1},p_2(x)^{a_2} \cdots p_k(x)^{a_k})=g(x)=1$

So, $q(x)=1$

A generalized version of Bézout's identity then assures us that $\displaystyle q(x)=1=\gcd(M)=\sum_{i=1}^{l} b_i(x)f_i(x)$ for some $l \in \Z_{\ge 1}$, $f_i(x) \in M$ and $b_i(x) \in K[x]$. Let $s \in S$. Evaluating this polynomial equality at $T$ and applying the resulting operator to $s$, the left-hand side becomes $I(s)=s$, while $f_i(T)(s)=0$ for every $i$ by definition of $M$, so we obtain
 * $\displaystyle s=\sum_{i=1}^{l} b_i(T)(f_i(T)(s))=0$

Because $s$ was arbitrary in $S$, it follows that $S=W \cap \ker(p_1(T)^{a_1})=0$.

Remarks

 * $V$ need not be finite-dimensional, but if it is, then we may identify $T$ with its matrix with respect to a chosen basis, and $T^j$ then corresponds to the $j$-th power of that matrix.


 * By definition, when evaluating $p(x)$ at $T$, the constant term of $p(x)$ must be replaced by the corresponding multiple of the identity: for example, if $p(x)=2+x$, then $p(T)=2I+T$, where $I$ is the identity map $I:V \to V$.


 * $p_1(x), p_2(x), \ldots, p_r(x)$ are not necessarily of degree 1: for example, if $K=\R$ and $p(x)=x^2+1$, then $p(x)$ is irreducible over $\R$ and $p_1(x)=p(x)$ is of degree 2.

 * $T^j \triangleq \begin{cases} \overbrace{T \circ T \circ \cdots \circ T}^{\text{$j$ times}} & : j\in \Z_{\ge 1}\\ I:V \to V & : j=0 \end{cases}$


 * By definition, the greatest common divisor of two polynomials is a monic polynomial.
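
To illustrate the remark on irreducible factors of degree greater than 1, here is a small example whose matrix and polynomial are assumptions chosen only for this purpose. Take $K=\R$, $V=\R^3$ with standard basis $e_1, e_2, e_3$, and let $T$ be multiplication by
 * $A=\begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad p(x)=(x^2+1)(x-1)$

Then
 * $A^2+I=\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{pmatrix}, \quad A-I=\begin{pmatrix} -1 & -1 & 0 \\ 1 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad p(A)=(A^2+I)(A-I)=0$
 * $\ker(A^2+I)=\operatorname{span}(e_1,e_2), \quad \ker(A-I)=\operatorname{span}(e_3)$

So $\R^3=\ker(A^2+I) \oplus \ker(A-I)$ with $p_1(x)=x^2+1$ of degree 2. Since $x^2+1$ is irreducible over $\R$, the theorem does not split the summand $\operatorname{span}(e_1,e_2)$ any further.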