# Definition:Probability Mass Function

## Definition

Let $\struct {\Omega, \Sigma, \Pr}$ be a probability space.

Let $X: \Omega \to \R$ be a discrete random variable on $\struct {\Omega, \Sigma, \Pr}$.

Then the **probability mass function** of $X$ is the (real-valued) function $p_X: \R \to \closedint 0 1$ defined as:

$\quad \forall x \in \R: \map {p_X} x = \begin {cases} \map \Pr {\set {\omega \in \Omega: \map X \omega = x} } & : x \in \Omega_X \\ 0 & : x \notin \Omega_X \end {cases}$

where $\Omega_X$ is defined as $\Img X$, the image of $X$.

That is, $\map {p_X} x$ is the probability that the discrete random variable $X$ takes the value $x$.

$\map {p_X} x$ can also be written:

- $\map \Pr {X = x}$
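
The definition can be illustrated on a finite probability space. The following is a minimal sketch only: the sample space (two fair coin tosses), the random variable `count_heads` and the helper names `pmf` and `p_X` are hypothetical and chosen for illustration.

```python
from fractions import Fraction

def pmf(omega_probs, X):
    """Build a table x -> Pr({omega : X(omega) = x}) for each x in the image of X."""
    table = {}
    for omega, prob in omega_probs.items():
        x = X(omega)
        table[x] = table.get(x, Fraction(0)) + prob
    return table

def p_X(table, x):
    """Evaluate the probability mass function: 0 for any x outside the image of X."""
    return table.get(x, Fraction(0))

# Hypothetical probability space: two fair coin tosses, each outcome with probability 1/4.
omega_probs = {w: Fraction(1, 4) for w in ("HH", "HT", "TH", "TT")}

# Hypothetical discrete random variable: the number of heads.
def count_heads(omega):
    return omega.count("H")

table = pmf(omega_probs, count_heads)
print(p_X(table, 1))  # Pr(X = 1)
print(p_X(table, 5))  # 5 is not in the image of X
```

Here the table holds the values of $p_X$ on $\Omega_X$, and `p_X` extends it by $0$ to the rest of $\R$, mirroring the two cases of the definition.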

Note that for any discrete random variable $X$, the following applies:

$\quad \begin {aligned} \ds \sum_{x \mathop \in \Omega_X} \map {p_X} x & = \map \Pr {\bigcup_{x \mathop \in \Omega_X} \set {\omega \in \Omega: \map X \omega = x} } & & \text {Definition of Probability Measure} \\ & = \map \Pr \Omega \\ & = 1 \end {aligned}$

The latter is usually written:

- $\ds \sum_{x \mathop \in \R} \map {p_X} x = 1$
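
The normalization can be checked numerically for a small case. This is a sketch under assumed data: a single fair six-sided die and the random variable $X$ sending each outcome to its remainder modulo $3$ are both hypothetical choices.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical probability space: one fair six-sided die.
pr = {omega: Fraction(1, 6) for omega in range(1, 7)}

# Accumulate Pr({omega : X(omega) = x}) for X(omega) = omega mod 3.
# The events {X = x} are pairwise disjoint and together cover the sample space.
p_X = defaultdict(Fraction)
for omega, prob in pr.items():
    p_X[omega % 3] += prob

total = sum(p_X.values())
print(dict(p_X))
print(total)
```

Exact rational arithmetic is used so that the total is exactly $1$ rather than a floating-point approximation.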

Thus it can be seen by definition that a **probability mass function** is an example of a normalized weight function.

The set of **probability mass functions** on a finite set $Z$ is sometimes denoted $\map \Delta Z$.

### Joint Probability Mass Function

Let $X: \Omega \to \R$ and $Y: \Omega \to \R$ both be discrete random variables on $\struct {\Omega, \Sigma, \Pr}$.

Then the **joint (probability) mass function** of $X$ and $Y$ is the (real-valued) function $p_{X, Y}: \R^2 \to \closedint 0 1$ defined as:

$\quad \forall \tuple {x, y} \in \R^2: \map {p_{X, Y} } {x, y} = \begin {cases} \map \Pr {\set {\omega \in \Omega: \map X \omega = x \land \map Y \omega = y} } & : x \in \Omega_X \text { and } y \in \Omega_Y \\ 0 & : \text {otherwise} \end {cases}$

That is, $\map {p_{X, Y} } {x, y}$ is the probability that the discrete random variable $X$ takes the value $x$ at the same time that the discrete random variable $Y$ takes the value $y$.
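
A joint mass function can be tabulated in the same way as in the single-variable case. The sketch below assumes a hypothetical sample space of two fair coin tosses, with $X$ indicating whether the first toss is a head and $Y$ counting heads; the names `joint_pmf` and `p_XY` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Hypothetical probability space: two fair coin tosses.
omega_probs = {a + b: Fraction(1, 4) for a, b in product("HT", repeat=2)}

def X(omega):
    """1 if the first toss is a head, else 0."""
    return 1 if omega[0] == "H" else 0

def Y(omega):
    """Total number of heads."""
    return omega.count("H")

def joint_pmf(omega_probs, X, Y):
    """Table (x, y) -> Pr({omega : X(omega) = x and Y(omega) = y})."""
    table = {}
    for omega, prob in omega_probs.items():
        key = (X(omega), Y(omega))
        table[key] = table.get(key, Fraction(0)) + prob
    return table

joint = joint_pmf(omega_probs, X, Y)

def p_XY(x, y):
    """Joint mass function, equal to 0 wherever no outcome gives (x, y)."""
    return joint.get((x, y), Fraction(0))

print(p_XY(1, 2))  # first toss a head and two heads in total
print(p_XY(0, 2))  # 0 and 2 lie in the images of X and Y, but the event is empty
```

The pair $\tuple {0, 2}$ shows that $\map {p_{X, Y} } {x, y}$ may be $0$ even when $x \in \Omega_X$ and $y \in \Omega_Y$, because the corresponding event is empty.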

### General Definition

Let $X = \set {X_1, X_2, \ldots, X_n}$ be a set of discrete random variables on $\struct {\Omega, \Sigma, \Pr}$.

Then the **joint (probability) mass function** of $X$ is the (real-valued) function $p_X: \R^n \to \closedint 0 1$ defined as:

- $\forall x = \tuple {x_1, x_2, \ldots, x_n} \in \R^n: \map {p_X} x = \map \Pr {X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n}$

The properties of the joint probability mass function of two discrete random variables carry over to this general case.
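
For $n$ random variables the same tabulation applies, with tuples of length $n$ as keys. The following sketch assumes a hypothetical space of three fair coin tosses, where $X_i$ indicates a head on toss $i$; the helper `joint_pmf` is illustrative only.

```python
from fractions import Fraction
from itertools import product

def joint_pmf(omega_probs, variables):
    """Table (x_1, ..., x_n) -> Pr(X_1 = x_1, ..., X_n = x_n)."""
    table = {}
    for omega, prob in omega_probs.items():
        key = tuple(X_i(omega) for X_i in variables)
        table[key] = table.get(key, Fraction(0)) + prob
    return table

# Hypothetical probability space: three fair coin tosses, outcomes as 3-tuples.
omega_probs = {omega: Fraction(1, 8) for omega in product("HT", repeat=3)}

# X_i(omega) = 1 if toss i is a head, else 0 (i=i binds the index per lambda).
variables = [lambda omega, i=i: int(omega[i] == "H") for i in range(3)]

table = joint_pmf(omega_probs, variables)
print(table[(1, 0, 1)])     # Pr(X_1 = 1, X_2 = 0, X_3 = 1)
print(sum(table.values()))  # normalization over all tuples
```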

## Examples

### Arbitrary Example

Consider a population consisting of children the state of whose teeth is being monitored.

The following table records the number of teeth with dental caries found in each child of a group of $100$ schoolchildren:

$\quad \begin {array} {|l|l|} \hline \text {Number of Teeth} & \text {Number of Children} \\ \hline 0 & 53 \\ 1 & 29 \\ 2 & 14 \\ 3 & 1 \\ 4 & 3 \\ \hline \end {array}$

The values of the **probability mass function**, in this case better referred to as a **relative frequency function**, are:

$\quad \begin {aligned} \map f 0 & = 0 \cdotp 53 \\ \map f 1 & = 0 \cdotp 29 \\ \map f 2 & = 0 \cdotp 14 \\ \map f 3 & = 0 \cdotp 01 \\ \map f 4 & = 0 \cdotp 03 \end {aligned}$
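
These values are obtained by dividing each count by the total number of children. A short sketch of that arithmetic, using the counts from the table above (the variable names are illustrative):

```python
from fractions import Fraction

# Counts from the table: number of carious teeth -> number of children.
counts = {0: 53, 1: 29, 2: 14, 3: 1, 4: 3}

n = sum(counts.values())  # total number of children

# Relative frequency function: f(k) = (children with k carious teeth) / n.
f = {teeth: Fraction(c, n) for teeth, c in counts.items()}

for teeth in sorted(f):
    print(teeth, float(f[teeth]))

print(sum(f.values()))  # the relative frequencies sum to 1
```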

## Also known as

A **probability mass function** is often seen abbreviated **p.m.f.**, **pmf** or **PMF**.

Some sources refer to it just as a **mass function**, or a **probability function**.

It is also known as a **frequency function**, a term which is also used for a **probability density function**.

When used in the context of a set of raw data, it can also be called a **relative frequency function**.

## Also see

- Results about **probability mass functions** can be found **here**.

## Sources

- 1986: Geoffrey Grimmett and Dominic Welsh: *Probability: An Introduction*: $\S 2.1$: Probability mass functions
- 1991: Roger B. Myerson: *Game Theory*: $1.2$ Basic Concepts of Decision Theory
- 1998: David Nelson: *The Penguin Dictionary of Mathematics* (2nd ed.): **frequency function**
- 2008: David Nelson: *The Penguin Dictionary of Mathematics* (4th ed.): **frequency function**
- 2014: Christopher Clapham and James Nicholson: *The Concise Oxford Dictionary of Mathematics* (5th ed.): **probability mass function**