Definition:Class (Descriptive Statistics)

From ProofWiki

Definition

Let $D$ be a finite collection of $n$ data regarding some quantitative variable.


Integer Data

Let the data in $D$ be described by natural numbers or by integers.

Let $d_{\min}$ be the value of the smallest datum in $D$.

Let $d_{\max}$ be the value of the largest datum in $D$.

Let $P = \left\{{x_0, x_1, x_2, \ldots, x_{n-1}, x_n}\right\} \subseteq \Z$ be a subdivision of $\left[{a \,.\,.\, b}\right]$, where $a \le x_0 \le x_n \le b$.


The integer interval $\left[{a \,.\,.\, b}\right]$, where $a \le d_{\min} \le d_{\max} \le b$, is said to be divided into classes of integer intervals of the forms $\left[{x_i \,.\,.\, x_{i+1}}\right]$ or $\left[{x_i \,.\,.\, x_i}\right]$ if and only if:

Every datum is assigned to exactly one class
Every class is disjoint from every other
The union of all classes contains the entire integer interval $\left[{x_0 \,.\,.\, x_n}\right]$

By convention, the first and last classes are not empty classes.
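
As an illustration only, the following Python sketch divides an integer interval $\left[{a \,.\,.\, b}\right]$ into classes and assigns each datum to its class. The names `integer_classes` and `assign`, the fixed class width and the sample data are assumptions made for this example and are not part of the definition. Each class here is a closed integer interval whose successor starts one integer higher, which is one way (among others) of keeping the classes disjoint.

```python
def integer_classes(a, b, width):
    """Split the integer interval [a..b] into consecutive closed integer
    intervals (lo, hi), each holding at most `width` integers."""
    classes = []
    lo = a
    while lo <= b:
        hi = min(lo + width - 1, b)
        classes.append((lo, hi))
        lo = hi + 1  # next class starts after this one, so classes are disjoint
    return classes


def assign(data, classes):
    """Return a dict mapping each class to the list of data falling in it."""
    buckets = {c: [] for c in classes}
    for d in data:
        matches = [c for c in classes if c[0] <= d <= c[1]]
        # Disjoint classes covering [a..b] give exactly one match per datum.
        assert len(matches) == 1
        buckets[matches[0]].append(d)
    return buckets


data = [3, 7, 8, 12, 15, 15, 19]      # hypothetical sample: d_min = 3, d_max = 19
classes = integer_classes(0, 19, 5)   # a = 0 <= d_min and d_max <= b = 19
buckets = assign(data, classes)
```

With these assumed values the classes are $\left[{0 \,.\,.\, 4}\right]$, $\left[{5 \,.\,.\, 9}\right]$, $\left[{10 \,.\,.\, 14}\right]$ and $\left[{15 \,.\,.\, 19}\right]$, and neither the first nor the last class is empty.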


Real Data

Let the data in $D$ be described by rational numbers or by real numbers.

Let $d_{\min}$ be the value of the smallest datum in $D$.

Let $d_{\max}$ be the value of the largest datum in $D$.

Let $P = \left\{{x_0, x_1, x_2, \ldots, x_{n-1}, x_n}\right\} \subseteq \R$ be a subdivision of $\left[{a \,.\,.\, b}\right]$, where $a \le x_0 \le x_n \le b$.


The closed real interval $\left[{a \,.\,.\, b}\right]$, where $a \le d_{\min} \le d_{\max} \le b$, is said to be divided into classes of real intervals with endpoints $x_i$ and $x_{i+1}$ if and only if:

Every datum is assigned to exactly one class
Every class is disjoint from every other
The union of all classes contains the entire real interval $\left[{x_0 \,.\,.\, x_n}\right]$


The classes may be any combination of open, closed, or half-open intervals that fulfill the above criteria, but usually:

Every class except the last is of the form $\left[{x_i \,.\,.\, x_{i+1}}\right)$
The last class is of the form $\left[{x_{n-1} \,.\,.\, x_n}\right]$

By convention, the first and last classes are not empty classes.
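
The usual arrangement described above (half-open classes, with the last class closed) can be sketched in Python as follows. The function name `class_index`, the subdivision points and the data are hypothetical choices for this example; the sketch also assumes the subdivision points are strictly increasing.

```python
from bisect import bisect_right


def class_index(x, points):
    """Index i of the class containing x, for subdivision points
    x_0 < x_1 < ... < x_n. Classes are [x_i, x_{i+1}) except the last,
    which is the closed interval [x_{n-1}, x_n]."""
    if not points[0] <= x <= points[-1]:
        raise ValueError("datum lies outside [x_0, x_n]")
    if x == points[-1]:
        return len(points) - 2  # the right endpoint belongs to the last class
    # bisect_right counts the points <= x, so subtracting 1 gives i with
    # x_i <= x < x_{i+1}.
    return bisect_right(points, x) - 1


points = [0.0, 2.5, 5.0, 7.5, 10.0]   # hypothetical subdivision x_0, ..., x_4
data = [0.0, 2.5, 4.9, 7.5, 10.0]     # hypothetical data
indices = [class_index(d, points) for d in data]
```

A datum equal to an interior point $x_i$ falls in the class that starts at $x_i$, while the datum equal to $x_n$ is placed in the last class, so every datum lands in exactly one class.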


Class Mark

The midpoint of a class is called the class mark.
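
For a class with endpoints $x_i$ and $x_{i+1}$, the midpoint is $\dfrac {x_i + x_{i+1}} 2$. A minimal Python sketch, in which the name `class_mark` and the subdivision points are assumptions for illustration:

```python
def class_mark(lower, upper):
    """Midpoint of a class with endpoints lower and upper."""
    return (lower + upper) / 2


points = [0, 10, 20, 30]  # hypothetical subdivision points
# Pair each point with its successor to get the endpoints of each class.
marks = [class_mark(lo, hi) for lo, hi in zip(points, points[1:])]
```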


Empty Class

A class is empty if and only if its frequency is zero, that is, no datum of $D$ is assigned to it.
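
As a rough sketch, empty classes can be found by counting how many data fall in each class. The function name `class_frequencies` and the sample assignment of data to class indices are hypothetical; how each datum was assigned to a class is assumed to have been decided already.

```python
from collections import Counter


def class_frequencies(class_of_datum, class_count):
    """Frequency of each class, given the class index of every datum."""
    counts = Counter(class_of_datum)
    return [counts.get(i, 0) for i in range(class_count)]


class_of_datum = [0, 1, 1, 3, 3]  # hypothetical class index for each datum
freqs = class_frequencies(class_of_datum, 4)
# A class is empty exactly when its frequency is zero.
empty = [i for i, f in enumerate(freqs) if f == 0]
```

In this assumed sample only the class with index 2 is empty, and the first and last classes are not, in line with the convention stated above.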


Comment

It is often the case that rational data are presented in decimal notation with a small and uniform number of digits for each datum.

In such cases the data may be artificially treated as integer data by "ignoring" the decimal point when creating the classes.
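
One way to carry this out, sketched in Python under the assumption that the data arrive as decimal strings with at most a known number of decimal places. The name `as_integers` and the sample readings are hypothetical. The sketch uses `decimal.Decimal` rather than binary floating point so that the scaling is exact.

```python
from decimal import Decimal


def as_integers(readings, places):
    """Scale decimal strings with at most `places` digits after the
    decimal point to integers, i.e. 'ignore' the decimal point."""
    scale = Decimal(10) ** places
    scaled = []
    for r in readings:
        value = Decimal(r) * scale
        if value != value.to_integral_value():
            raise ValueError(f"{r} has more than {places} decimal places")
        scaled.append(int(value))
    return scaled


readings = ["1.25", "0.70", "3.05", "2.10"]  # hypothetical data, 2 decimal places
integers = as_integers(readings, 2)
```

The resulting integers can then be divided into classes as in the integer case, and class boundaries converted back by dividing by the same power of ten.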

