Conditional Entropy Decreases if More Given

From ProofWiki
Jump to navigation Jump to search

Theorem

Let $\struct {\Omega, \Sigma, \Pr}$ be a probability space.

Let $\AA, \CC, \DD \subseteq \Sigma$ be finite sub-$\sigma$-algebras.


Then:

$\CC \subseteq \DD \implies \map H {\AA \mid \CC} \ge \map H {\AA \mid \DD}$

where:

$\map H {\cdot \mid \cdot}$ denotes the conditional entropy


Proof

Consider the generated finite partitions:

$\xi := \map \xi \AA$
$\eta := \map \xi \CC$
$\gamma := \map \xi \DD$

By Generating Partition Preserves Order, $\CC \subseteq \DD$ implies:

$\eta \le \gamma$

where $\le$ denotes the refinement.

By Definition of Conditional Entropy of Finite Sub-Sigma-Algebra, we shall show:

$\map H {\xi \mid \eta} \ge \map H {\xi \mid \gamma}$


Recall that $\eta \le \gamma$ implies:

$\forall B \in \eta, \forall D \in \gamma \implies D \subseteq B \; \text{or} \; B \cap D = \O$


In particular, we have:

$(1): \quad \forall B \in \eta, \forall D \in \gamma: \map \Pr {B \cap D} = \begin{cases} \map \Pr D &: D \subseteq B \\ 0 &: B \cap D = \O \end{cases}$


Thus, for all $A \in \xi$ and $B \in \gamma$ such that $\map \Pr B > 0$:

\(\ds \dfrac {\map \Pr {A \cap B} } {\map \Pr B}\) \(=\) \(\ds \dfrac 1 {\map \Pr B} \sum_{\substack {D \mathop \in \gamma \\ D \mathop \subseteq B} } \map \Pr {A \cap D}\)
\(\ds \) \(=\) \(\ds \dfrac 1 {\map \Pr B} \sum_{\substack {D \mathop \in \gamma \\ \map \Pr D > 0} } \dfrac {\map \Pr {B \cap D} } {\map \Pr D} \map \Pr {A \cap D}\) apply $(1)$
\(\ds \) \(=\) \(\ds \sum_{\substack {D \mathop \in \gamma \\ \map \Pr D > 0} } \dfrac {\map \Pr {B \cap D} } {\map \Pr B} \dfrac {\map \Pr {A \cap D} } {\map \Pr D}\)

The function $\phi$ used in Definition of Conditional Entropy of Finite Partitions is concave since the second derivative is negative:

$\forall x > 0 : \map {\phi ' '} x = -\dfrac 1 x < 0$

Note that the concavity holds including $x = 0$, since $\phi$ is continuous there.


So we obtain:

\(\text {(2)}: \quad\) \(\ds \map \phi {\dfrac {\map \Pr {A \cap B} } {\map \Pr B} }\) \(\ge\) \(\ds \sum_{\substack {D \mathop \in \gamma \\ \map \Pr D > 0} } \dfrac {\map \Pr {B \cap D} } {\map \Pr B} \map \phi {\dfrac {\map \Pr {A \cap D} } {\map \Pr D} }\) Jensen's inequality
\(\ds \) \(=\) \(\ds \sum_{\substack {D \mathop \in \gamma \\ D \mathop \subseteq B \\ \map\Pr D > 0} } \dfrac {\map \Pr D} {\map \Pr B} \map \phi {\dfrac {\map \Pr {A \cap D} } {\map \Pr D} }\) apply $(1)$

Therefore:

\(\ds \map H {\xi \mid \eta}\) \(=\) \(\ds \sum_{\substack {B \mathop \in \eta \\ \map \Pr B \mathop > 0} } \map \Pr B \sum_{A \mathop \in \xi} \map \phi {\dfrac {\map \Pr {A \cap B} } {\map \Pr B} }\) Definition of Conditional Entropy of Finite Partitions
\(\ds \) \(\ge\) \(\ds \sum_{\substack {B \mathop \in \eta \\ \map \Pr B \mathop > 0} } \map \Pr B \sum_{A \mathop \in \xi} \sum_{\substack {D \mathop \in \gamma \\ D \mathop \subseteq B \\ \map\Pr D > 0} } \dfrac {\map \Pr D} {\map \Pr B} \map \phi {\dfrac {\map \Pr {A \cap D} } {\map \Pr D} }\) substituting from $(2)$
\(\ds \) \(=\) \(\ds \sum_{\substack {B \mathop \in \eta \\ \map \Pr B \mathop > 0} } \sum_{\substack {D \mathop \in \gamma \\ D \mathop \subseteq B \\ \map\Pr D > 0} } \sum _{A \mathop \in \xi} \map \Pr D \map \phi {\dfrac {\map \Pr {A \cap D} } {\map \Pr D} }\)
\(\ds \) \(=\) \(\ds \sum_{\substack {D \mathop \in \gamma \\ \map\Pr D > 0} } \sum _{A \mathop \in \xi} \map \Pr D \map \phi {\dfrac {\map \Pr {A \cap D} } {\map \Pr D} }\) since $\eta \le \gamma$
\(\ds \) \(=\) \(\ds \map H {\xi \mid \gamma}\) Definition of Conditional Entropy of Finite Partitions

$\blacksquare$


Corollary

$\map H \AA \ge \map H {\AA \mid \DD} $


Sources