Variance as Expectation of Square minus Square of Expectation

Theorem

Let $X$ be a discrete random variable.

Then the variance of $X$ can be expressed as:

$\operatorname{var} \left({X}\right) = E \left({X^2}\right) - \left({E \left({X}\right)}\right)^2$

That is, it is the expectation of the square of $X$ minus the square of the expectation of $X$.
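
As an illustration, which is not part of the statement above, suppose $X$ is the score on a fair six-sided die, so that $\Pr \left({X = x}\right) = \frac 1 6$ for $x \in \left\{{1, 2, \ldots, 6}\right\}$. Under that assumption the two expectations and the resulting variance work out as:

$$\begin{aligned}
E \left({X}\right) &= \frac {1 + 2 + 3 + 4 + 5 + 6} 6 = \frac 7 2 \\
E \left({X^2}\right) &= \frac {1 + 4 + 9 + 16 + 25 + 36} 6 = \frac {91} 6 \\
\operatorname{var} \left({X}\right) &= \frac {91} 6 - \left({\frac 7 2}\right)^2 = \frac {182} {12} - \frac {147} {12} = \frac {35} {12}
\end{aligned}$$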

Proof

We let $\mu = E \left({X}\right)$, and take the expression for variance:

$\displaystyle \operatorname{var} \left({X}\right) := \sum_{x \mathop \in \operatorname{Im} \left({X}\right)} \left({x - \mu}\right)^2 \Pr \left({X = x}\right)$

Then:

$$\begin{aligned}
\operatorname{var} \left({X}\right) &= \sum_x \left({x^2 - 2 \mu x + \mu^2}\right) \Pr \left({X = x}\right) \\
&= \sum_x x^2 \Pr \left({X = x}\right) - 2 \mu \sum_x x \Pr \left({X = x}\right) + \mu^2 \sum_x \Pr \left({X = x}\right) \\
&= \sum_x x^2 \Pr \left({X = x}\right) - 2 \mu \sum_x x \Pr \left({X = x}\right) + \mu^2 && \text{Definition of Probability Mass Function: } \sum_x \Pr \left({X = x}\right) = 1 \\
&= \sum_x x^2 \Pr \left({X = x}\right) - 2 \mu^2 + \mu^2 && \text{Definition of Expectation: } \sum_x x \Pr \left({X = x}\right) = \mu \\
&= \sum_x x^2 \Pr \left({X = x}\right) - \mu^2
\end{aligned}$$

Hence the result, from $\mu = E \left({X}\right)$ and $E \left({X^2}\right) = \sum_x x^2 \Pr \left({X = x}\right)$.

$\blacksquare$

Comment

This is often a significantly more convenient way of computing the variance than the first-principles definition. In particular, it is easier to program a computer to calculate it: the sums $\sum_x x \Pr \left({X = x}\right)$ and $\sum_x x^2 \Pr \left({X = x}\right)$ can be accumulated in a single pass, with no need to find $\mu$ first and then keep a record of all the deviations $x - \mu$. For this reason it is the more commonly encountered of the two expressions for variance.
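
The comparison can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function names, the representation of the probability mass function as a dictionary mapping each value $x$ to $\Pr \left({X = x}\right)$, and the fair-die example are all assumptions made here. Exact fractions are used so that the two results can be compared without rounding error.

```python
from fractions import Fraction


def variance_by_definition(pmf):
    """First-principles version: find mu first, then sum the squared deviations."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


def variance_by_theorem(pmf):
    """E(X^2) - (E(X))^2, accumulating both expectations in a single pass."""
    mean = 0
    mean_of_square = 0
    for x, p in pmf.items():
        mean += x * p
        mean_of_square += x * x * p
    return mean_of_square - mean ** 2


# Hypothetical example: the score on a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

print(variance_by_definition(die))  # 35/12
print(variance_by_theorem(die))     # 35/12
```

The first function has to traverse the values twice, once to obtain $\mu$ and once to sum the squared deviations, whereas the second needs only two running totals.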