Definition:Likelihood Function
Definition
Let $\FF$ be a type of probability distribution which has a parameter $\theta$.
Let $X$ be a continuous random variable whose distribution is a member of $\FF$.
Let the frequency function of $X$ be expressed as $\map f {x, \theta}$, where $x$ is variable and $\theta$ given.
Let $x_1$ be an observation of a variable from $X$.
Then $\theta$ can be regarded as a variable that can be varied so as to specify individual members of $\FF$.
The likelihood function $\map L \theta$ is then defined as:
- $\map L \theta := \map f {x_1, \theta}$
regarded as a function of $\theta$ for a given $x_1$.
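As an illustrative sketch (not part of the formal definition), the following Python snippet evaluates $\map L \theta$ for a single observation, assuming for the sake of example that $\FF$ is the family of exponential distributions with rate parameter $\theta$, whose frequency function is $\map f {x, \theta} = \theta e^{-\theta x}$:

```python
import math

def f(x, theta):
    """Frequency function of an exponential distribution with rate theta."""
    return theta * math.exp(-theta * x)

def L(theta, x1=2.0):
    """Likelihood of theta given the single fixed observation x1."""
    return f(x1, theta)

# The same observation x1 = 2.0 yields different likelihoods
# as theta is varied over the members of the family:
print(L(0.5))  # 0.5 * e^{-1}
print(L(1.0))  # e^{-2}
```

Note that $x_1$ is held fixed while $\theta$ varies: $\map L \theta$ is a function of the parameter, not of the data.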
Symbol
The usual symbol used to denote the likelihood function of a parameter $\theta$ is $\map {\mathrm L} \theta$.
Examples
Arbitrary Independent Observations
Let $S$ be a sample of $n$ independent observations of a random variable with a given probability distribution.
The likelihood function $\map {\mathrm L} \theta$ is:
- $\map {\mathrm L} \theta := \map f {x_1, x_2, \ldots, x_n, \theta}$
Because of independence:
- $\map {\mathrm L} \theta = \map f {x_1, \theta} \map f {x_2, \theta} \cdots \map f {x_n, \theta}$
Suppose $\theta_1$ and $\theta_2$ are values of $\theta$.
Suppose that:
- $\map {\mathrm L} {\theta_2} < \map {\mathrm L} {\theta_1}$
This implies that the sample has a smaller value of the joint frequency function if the unknown parameter is $\theta_2$ rather than $\theta_1$.
This in turn means that the sample is less likely to have come from a population where $\theta = \theta_2$ than from one where $\theta = \theta_1$.
This line of reasoning leads to the concept of maximum likelihood estimation.
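The reasoning above can be sketched in Python. Assuming for illustration an exponential family with rate $\theta$ and a hypothetical sample, the likelihood of the sample is the product of the individual frequency functions, and comparing it at two candidate parameter values shows which value the sample favours:

```python
import math

def f(x, theta):
    """Frequency function of an exponential distribution with rate theta."""
    return theta * math.exp(-theta * x)

def likelihood(theta, sample):
    """L(theta): product of f(x_i, theta) over independent observations."""
    result = 1.0
    for x in sample:
        result *= f(x, theta)
    return result

sample = [1.2, 0.7, 2.3, 1.9, 0.4]  # hypothetical observations

L1 = likelihood(1.0, sample)  # theta_1 = 1.0
L2 = likelihood(3.0, sample)  # theta_2 = 3.0

# L2 < L1: the sample is less likely under theta = 3.0 than theta = 1.0,
# so theta = 1.0 is the better-supported of the two candidates.
print(L2 < L1)
```

Maximum likelihood estimation extends this comparison over all admissible $\theta$, choosing the value at which $\map {\mathrm L} \theta$ attains its maximum.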
Also see
- Results about likelihood functions can be found here.
Sources
- 1998: David Nelson: The Penguin Dictionary of Mathematics (2nd ed.) ... (previous) ... (next): likelihood function
- 2008: David Nelson: The Penguin Dictionary of Mathematics (4th ed.) ... (previous) ... (next): likelihood function