# Definition:Class (Descriptive Statistics)

## Definition

Let $D$ be a finite collection of data regarding some quantitative variable.

## Integer Data

Let the data in $D$ be described by natural numbers or by integers.

Let $d_{\min}$ be the value of the smallest datum in $D$.

Let $d_{\max}$ be the value of the largest datum in $D$.

Let $P = \left\{{x_0, x_1, x_2, \ldots, x_{n-1}, x_n}\right\} \subseteq \Z$ be a subdivision of $\left[{a \,.\,.\, b}\right]$, where $a \le x_0 \le x_n \le b$.

The integer interval $\left[{a \,.\,.\, b}\right]$, where $a \le d_{\min} \le d_{\max} \le b$, is said to be divided into **classes** of integer intervals of the forms $\left[{x_i \,.\,.\, x_{i+1}}\right]$ or $\left[{x_i \,.\,.\, x_i}\right]$ if and only if:

- Every datum is assigned to exactly one class

- Every class is disjoint from every other class

By convention, the first and last classes are not empty classes.
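The two criteria above can be checked mechanically. The following is a minimal sketch, not a canonical procedure: the function name `assign_integer_classes`, the sample data and the class boundaries are all hypothetical, and each class is represented as a pair `(lo, hi)` standing for the integer interval $\left[{lo \,.\,.\, hi}\right]$.

```python
def assign_integer_classes(data, classes):
    """Map each class (lo, hi), read as the integer interval [lo .. hi], to its data."""
    ordered = sorted(classes)
    # Criterion: every class is disjoint from every other class.
    for (lo1, hi1), (lo2, hi2) in zip(ordered, ordered[1:]):
        if lo2 <= hi1:
            raise ValueError(f"classes {(lo1, hi1)} and {(lo2, hi2)} overlap")
    assignment = {c: [] for c in ordered}
    # Criterion: every datum is assigned to exactly one class.
    for d in data:
        matches = [c for c in ordered if c[0] <= d <= c[1]]
        if len(matches) != 1:
            # The classes are disjoint, so this means no class contains d.
            raise ValueError(f"datum {d} is not covered by any class")
        assignment[matches[0]].append(d)
    return assignment


# Hypothetical example: seven integer data and four classes of width 5.
data = [3, 7, 8, 12, 15, 15, 19]
classes = [(1, 5), (6, 10), (11, 15), (16, 20)]
result = assign_integer_classes(data, classes)
```

Here `result` maps `(1, 5)` to `[3]`, `(6, 10)` to `[7, 8]`, `(11, 15)` to `[12, 15, 15]` and `(16, 20)` to `[19]`. The first and last classes are non-empty, in line with the convention above.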

## Real Data

Let the data in $D$ be described by rational numbers or by real numbers.

Let $d_{\min}$ be the value of the smallest datum in $D$.

Let $d_{\max}$ be the value of the largest datum in $D$.

Let $P = \left\{{x_0, x_1, x_2, \ldots, x_{n-1}, x_n}\right\} \subseteq \R$ be a subdivision of $\left[{a \,.\,.\, b}\right]$, where $a \le x_0 \le x_n \le b$.

The closed real interval $\left[{a \,.\,.\, b}\right]$, where $a \le d_{\min} \le d_{\max} \le b$, is said to be divided into **classes** of real intervals with endpoints $x_i$ and $x_{i+1}$ if and only if:

- Every datum is assigned to exactly one class

- Every class is disjoint from every other class

The classes may be any combination of open, closed, or half-open intervals that fulfill the above criteria, but usually:

- Every class except the last is of the form $\left[{x_i \,.\,.\, x_{i+1}}\right)$

- The last class is of the form $\left[{x_{n-1} \,.\,.\, x_n}\right]$

By convention, the first and last classes are not empty classes.
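The usual arrangement of half-open classes with a closed last class can be illustrated as follows. This is a sketch under stated assumptions: the helper `class_index` and the sample values are hypothetical, and the subdivision points are assumed to be strictly increasing.

```python
from bisect import bisect_right


def class_index(d, points):
    """Return i such that d lies in the class with endpoints points[i], points[i + 1].

    Every class is [x_i .. x_{i+1}) except the last, which is [x_{n-1} .. x_n].
    Assumes points is strictly increasing.
    """
    if not points[0] <= d <= points[-1]:
        raise ValueError(f"datum {d} lies outside the subdivision")
    if d == points[-1]:
        return len(points) - 2  # the last class is closed on the right
    # bisect_right counts the points that are <= d, so subtract 1 for the index.
    return bisect_right(points, d) - 1


# Hypothetical subdivision of [0 .. 10] into four classes of width 2.5.
points = [0.0, 2.5, 5.0, 7.5, 10.0]
data = [0.0, 2.5, 4.9, 7.5, 10.0]
indices = [class_index(d, points) for d in data]
```

With these values `indices` is `[0, 1, 1, 3, 3]`: the datum $2.5$ falls in the second class rather than the first, and the datum $10.0$ falls in the last class because that class includes its right endpoint.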

## Class Mark

The midpoint of a class is called the **class mark**.
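As a small illustration, the class mark can be computed as the arithmetic mean of the two endpoints of the class. The function name `class_mark` and the sample endpoints are hypothetical.

```python
def class_mark(lower, upper):
    """Midpoint of the class whose endpoints are lower and upper."""
    return (lower + upper) / 2


integer_mark = class_mark(10, 19)   # class [10 .. 19] of integer data
real_mark = class_mark(2.5, 5.0)    # class [2.5 .. 5.0) of real data
```

Here `integer_mark` is `14.5` and `real_mark` is `3.75`. Note that the class mark of a class of integer data need not itself be an integer.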

## Empty Class

A class is **empty** if its frequency is zero, that is, if no datum in $D$ is assigned to it.
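A sketch of how empty classes might be identified by counting the data in each class. The function name `empty_classes`, the data and the classes are hypothetical, and each class `(lo, hi)` is read as the closed integer interval $\left[{lo \,.\,.\, hi}\right]$.

```python
def empty_classes(data, classes):
    """Return the classes (lo, hi) that contain no datum."""
    # Frequency of a class: the number of data lying in [lo .. hi].
    frequency = {
        (lo, hi): sum(1 for d in data if lo <= d <= hi)
        for (lo, hi) in classes
    }
    return [c for c, f in frequency.items() if f == 0]


# Hypothetical example: no datum falls in the middle class.
data = [2, 3, 14, 15]
classes = [(1, 5), (6, 10), (11, 15)]
empties = empty_classes(data, classes)
```

In this example `empties` is `[(6, 10)]`: only the middle class has frequency zero.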

## Comment

It is often the case that rational data are presented in decimal notation with a small and uniform number of digits for each datum.

In such cases the data may be artificially treated as integer data by "ignoring" the decimal point when creating the classes.
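One way to carry out this conversion is to scale every datum by a power of ten. The sketch below assumes the data are given as decimal strings with at most `places` digits after the decimal point. The function name `as_integer_data` and the sample readings are hypothetical.

```python
from decimal import Decimal


def as_integer_data(readings, places):
    """Scale decimal strings by 10**places so that they become integers."""
    scale = 10 ** places
    result = []
    for r in readings:
        scaled = Decimal(r) * scale
        # Reject data carrying more decimal digits than expected.
        if scaled != scaled.to_integral_value():
            raise ValueError(f"{r} has more than {places} decimal places")
        result.append(int(scaled))
    return result


# Hypothetical readings, each recorded to one decimal place.
readings = ["1.2", "3.4", "0.7", "12.0"]
integers = as_integer_data(readings, 1)
```

Here `integers` is `[12, 34, 7, 120]`. `Decimal` is used instead of `float` so that the scaling is exact. Classes built on the scaled data correspond to classes on the original data after dividing their endpoints by the same power of ten.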

## Sources

- 2011: Charles Henry Brase and Corrinne Pellillo Brase: *Understandable Statistics* (10th ed.): $\S 2.1$