Definition:Class Interval
Definition
Let $D$ be a finite set of $n$ observations of a quantitative variable.
Integer Data
Let the data in $D$ be described by integers.
Let $d_{\min}$ be the value of the smallest datum in $D$.
Let $d_{\max}$ be the value of the largest datum in $D$.
Let $P = \set {x_0, x_1, x_2, \ldots, x_{n - 1}, x_n} \subseteq \Z$ be a subdivision of $\closedint a b$, where $a \le x_0 \le x_n \le b$.
The integer interval $\closedint a b$, where $a \le d_{\min} \le d_{\max} \le b$, is said to be divided into class intervals, each of which is an integer interval of the form $\closedint {x_i} {x_{i + 1} }$ or $\closedint {x_i} {x_i}$, if and only if:
- Every datum is assigned to exactly one class interval
- Every class interval is disjoint from every other class interval
- The union of all class intervals contains the entire integer interval $\closedint {x_0} {x_n}$
By convention, the first and last class intervals are not empty class intervals.
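The three conditions above can be checked mechanically. The following is a minimal sketch in Python, assuming each class interval is represented as a pair `(lo, hi)` standing for the closed integer interval from `lo` to `hi`. The function name `is_valid_division` and this representation are illustrative assumptions, not part of the definition.

```python
from typing import Iterable, List, Tuple

IntInterval = Tuple[int, int]  # (lo, hi) stands for the closed integer interval [lo .. hi]


def is_valid_division(data: Iterable[int], intervals: List[IntInterval]) -> bool:
    """Check the three conditions for a division into integer class intervals."""
    if not intervals or any(lo > hi for lo, hi in intervals):
        return False
    ordered = sorted(intervals)

    # Every class interval is disjoint from every other class interval.
    disjoint = all(
        prev_hi < next_lo
        for (_, prev_hi), (next_lo, _) in zip(ordered, ordered[1:])
    )

    # Every datum is assigned to exactly one class interval.
    exactly_one = all(
        sum(lo <= d <= hi for lo, hi in intervals) == 1 for d in data
    )

    # The union of the class intervals contains every integer from x_0 to x_n.
    x_0, x_n = ordered[0][0], ordered[-1][1]
    covered = all(
        any(lo <= k <= hi for lo, hi in intervals) for k in range(x_0, x_n + 1)
    )

    return disjoint and exactly_one and covered
```

For instance, with the hypothetical data `[3, 7, 7, 12]`, the intervals `[(0, 4), (5, 9), (10, 14)]` pass the check, whereas `[(0, 5), (5, 10), (10, 15)]` fail it because adjacent intervals share an endpoint.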
Real Data
Let the data in $D$ be described by rational numbers or by real numbers.
Let $d_{\min}$ be the value of the smallest datum in $D$.
Let $d_{\max}$ be the value of the largest datum in $D$.
Let $P = \set {x_0, x_1, x_2, \ldots, x_{n - 1}, x_n} \subseteq \R$ be a subdivision of $\closedint a b$, where $a \le x_0 \le x_n \le b$.
The closed real interval $\closedint a b$, where $a \le d_{\min} \le d_{\max} \le b$, is said to be divided into class intervals, each of which is a real interval with endpoints $x_i$ and $x_{i + 1}$, if and only if:
- Every datum is assigned to exactly one class interval
- Every class interval is disjoint from every other class interval
- The union of all class intervals contains the entire real interval $\closedint {x_0} {x_n}$
The class intervals may be any combination of open, closed, or half-open intervals that fulfill the above criteria, but usually:
- Every class interval except the last is of the half-open form $\hointr {x_i} {x_{i + 1} }$, closed on the left and open on the right
- The last class interval is of the form $\closedint {x_{n - 1} } {x_n}$
By convention, the first and last class intervals are not empty class intervals.
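As an illustration, the following Python sketch assigns real data to class intervals under one particular convention: every class is closed on the left and open on the right, except the last, which is closed at both ends, so that the classes are pairwise disjoint. The names `class_index` and `class_frequencies`, and the assumption that the subdivision points are supplied as a sorted sequence `points`, are hypothetical choices made for this example.

```python
from bisect import bisect_right
from typing import List, Sequence


def class_index(x: float, points: Sequence[float]) -> int:
    """Return i such that x lies in [x_i, x_{i+1}), or in the closed last class."""
    if not points[0] <= x <= points[-1]:
        raise ValueError("datum lies outside the subdivision")
    if x == points[-1]:
        # The right-hand endpoint x_n belongs to the last (closed) class.
        return len(points) - 2
    # bisect_right counts the subdivision points that are <= x.
    return bisect_right(points, x) - 1


def class_frequencies(data: Sequence[float], points: Sequence[float]) -> List[int]:
    """Count how many data fall into each class interval."""
    counts = [0] * (len(points) - 1)
    for x in data:
        counts[class_index(x, points)] += 1
    return counts
```

Because each datum yields exactly one index, every datum is assigned to exactly one class interval, as the definition requires.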
Boundary of Class Interval
The class boundaries of a class interval are the endpoints of the integer interval or real interval which defines the class interval.
Class Mark
A class mark is a value within a class interval used to identify that class interval uniquely.
It is usual to use the midpoint of the class interval.
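A minimal sketch of computing midpoint class marks in Python, assuming the class boundaries are given as a sorted sequence `points` (a hypothetical name) in which consecutive entries bound one class interval:

```python
from typing import List, Sequence


def class_marks(points: Sequence[float]) -> List[float]:
    """Return the midpoint of each pair of consecutive class boundaries."""
    return [(lo + hi) / 2 for lo, hi in zip(points, points[1:])]
```

For the illustrative boundaries `[0, 200, 300, 400]`, this gives the class marks `100.0`, `250.0` and `350.0`.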
Empty Class Interval
A class interval is empty if and only if its class frequency is zero, that is, if and only if no datum in $D$ lies in it.
Relative Sizes of Class Interval
For a given problem domain, it is not necessary for all class intervals to be the same length.
However, it is often the case that they are all the same length, as that can make analysis more convenient.
Also known as
A class interval is also known as a class, but as this has more than one meaning in mathematics, class interval is preferred on $\mathsf{Pr} \infty \mathsf{fWiki}$.
Comment
It is often the case that rational data are presented in decimal notation with a small and uniform number of digits for each datum.
In such cases the data may be artificially treated as integer data by "ignoring" the decimal point when creating the classes.
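One way to carry out this treatment is sketched below in Python, assuming the data arrive as decimal strings with at most `places` digits after the decimal point. The function name `as_scaled_integers` and its parameters are hypothetical. The standard `decimal` module is used so that the scaling is exact rather than subject to binary floating-point rounding.

```python
from decimal import Decimal
from typing import List, Sequence


def as_scaled_integers(values: Sequence[str], places: int) -> List[int]:
    """Multiply each decimal string by 10**places and return exact integers."""
    scale = Decimal(10) ** places
    scaled = [Decimal(v) * scale for v in values]
    # Reject any datum that carries more decimal digits than `places`.
    if any(s != s.to_integral_value() for s in scaled):
        raise ValueError("a datum has more decimal digits than expected")
    return [int(s) for s in scaled]
```

With two decimal places, the illustrative data `"1.25"`, `"0.50"` and `"12.00"` become the integers `125`, `50` and `1200`, which can then be grouped as integer data.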
Note, however, that integer data and real data are qualitatively different. Integer data represent a count, while real data represent a measurement. Even though a measurement may be rounded to an integer, it is not the same as data which genuinely answer a "how many" question. Where counts are extremely large (for example, human population counts), the distinction is blurred, as the classes tend to have boundaries in the 1000s or 1 000 000s, and so the distinction matters less in practice. It nevertheless remains.
Examples
Arbitrary Example
Consider the problem domain of employee wages.
We may count the number of employees earning a wage in each of the following ranges:
- between $\pounds 0.00$ and $\pounds 199.99$
- between $\pounds 200.00$ and $\pounds 299.99$
- between $\pounds 300.00$ and $\pounds 399.99$
Hence the class intervals in question are:
- $\closedint {0.00} {199.99}$
- $\closedint {200.00} {299.99}$
- $\closedint {300.00} {399.99}$
The number of employees in each class interval is then the class frequency of that interval.
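The counting step can be sketched in Python as follows. The wage figures used in the usage note are invented purely for illustration, and the names `CLASSES` and `wage_frequencies` are hypothetical. Wages are held as `Decimal` values so that amounts such as 199.99 are compared exactly.

```python
from decimal import Decimal
from typing import List, Sequence

# The three class intervals from the example, as closed intervals (lo, hi).
CLASSES = [
    (Decimal("0.00"), Decimal("199.99")),
    (Decimal("200.00"), Decimal("299.99")),
    (Decimal("300.00"), Decimal("399.99")),
]


def wage_frequencies(wages: Sequence[str]) -> List[int]:
    """Return the class frequency of each interval in CLASSES."""
    counts = [0] * len(CLASSES)
    for w in wages:
        amount = Decimal(w)
        for i, (lo, hi) in enumerate(CLASSES):
            if lo <= amount <= hi:
                counts[i] += 1
                break
        else:
            # No class interval contains this wage.
            raise ValueError(f"wage {w} falls outside every class interval")
    return counts
```

For the made-up wages `"150.00"`, `"199.99"`, `"200.00"`, `"275.50"`, `"310.00"`, `"399.99"` and `"85.25"`, the class frequencies are `[3, 2, 2]`.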
Also see
- Definition:Category (Descriptive Statistics): a similar concept for a qualitative variable
Sources
- 1998: David Nelson: The Penguin Dictionary of Mathematics (2nd ed.): class intervals
- 2008: David Nelson: The Penguin Dictionary of Mathematics (4th ed.): class intervals
- 2011: Charles Henry Brase and Corrinne Pellillo Brase: Understandable Statistics (10th ed.): $\S 2.1$
- 2014: Christopher Clapham and James Nicholson: The Concise Oxford Dictionary of Mathematics (5th ed.): class interval