Frequency Distribution - Statistics for Data Science & Analytics

A frequency table is a way of summarizing a set of data. It records how often each value (or set of values) of the variable in question occurs. In this post, we will learn how to construct frequency tables for discrete and continuous data.

A grouping of qualitative data into mutually exclusive classes, showing the number of observations in each class, is called a frequency table. The number of values falling in a particular category/class is called the frequency of that category/class, denoted by $f$.

If the data of a continuous variable are arranged into different classes with their frequencies, the result is known as a continuous frequency distribution. If the data of a discrete variable are arranged into different classes with their frequencies, the result is known as a discrete (or discontinuous) frequency distribution.

Discrete Frequency Distribution Table Example

Car Type	Number of Cars
Local	50
Foreign	30
Total Cars	80
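
As a minimal sketch, a discrete frequency table like the one above can be tallied in Python with `collections.Counter`. The list `cars` is made-up raw data, chosen only so that the counts match the table; it is not taken from any real data set.

```python
from collections import Counter

# Hypothetical raw observations: one entry per car (made up to match the table above).
cars = ["Local"] * 50 + ["Foreign"] * 30

# Count how often each category occurs; this count is the frequency f of each class.
freq = Counter(cars)

print("Car Type\tNumber of Cars")
for car_type, f in freq.items():
    print(f"{car_type}\t{f}")
print(f"Total Cars\t{sum(freq.values())}")
```

The total of the frequencies equals the number of observations, which is a quick check that no observation was missed or counted twice.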

Constructing Frequency Tables

Frequency tables (distributions) may be constructed for both discrete and continuous variables. A discrete frequency distribution can be converted back to the original values, but this is not possible for a continuous frequency distribution.
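
To illustrate why a discrete frequency distribution can be converted back, the following sketch expands a small table into its raw values. The table `freq_table` is hypothetical example data, and the original ordering of the observations is not recoverable because a frequency table does not record it.

```python
# Discrete frequency table: value -> frequency (hypothetical example data).
freq_table = {0: 3, 1: 5, 2: 2}

# Repeat each value as many times as its frequency to recover the raw values
# (up to ordering, which a frequency table does not store).
values = [value for value, f in freq_table.items() for _ in range(f)]

print(values)
```

For a continuous distribution the same trick fails, because a class such as 10 to 20 only says how many observations fell inside it, not what they were.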

Step-by-Step Procedure of Constructing a Frequency Table

The following steps are taken into account while constructing frequency tables for continuous data.

Calculate the range of the data. The range is the difference between the highest and lowest values of the given data.
\[Range = Highest\,\, Value - Lowest\,\, Value\]
Decide the number of Classes. A suitable number of classes may be determined by either of the following rules, where $n$ is the number of observations:
Choose the smallest number of classes $k$ such that $2^k \geq n$ OR use Number of classes $(C) = 1+3.3\, \log_{10} (n)$
Note that too many classes or too few classes might not reveal the basic shape of the data set.
Determine the Class Interval or Width
The classes, taken together, should cover at least the distance from the lowest value in the data up to the highest value. A suitable class interval can be found with this formula \[I=\frac{Highest\,\, Value - Lowest\,\, Value}{Number\,\, of \,\,Classes}=\frac{H-L}{K}\]
Where $I$ is the class interval, $H$ is the highest observed value, $L$ is the lowest observed value, and $K$ is the number of classes.
Generally, the class interval or width should be the same for all classes.
In practice, the interval size is usually rounded up to some convenient number, such as a multiple of 10 or 100. Unequal class intervals present problems in graphically portraying the distribution and in doing some of the computations. However, unequal class intervals may be necessary in certain situations, such as to avoid a large number of empty or almost empty classes.
Set the Individual Class Limits
Class limits are the endpoints of the class interval. State clear class limits so that you can put each of the observations into one and only one category, i.e., you must avoid overlapping or unclear class limits. Class intervals are usually rounded up to get a convenient class size and cover a larger-than-necessary range.
It is convenient to choose the endpoints of the class interval so that no observation falls on them. This can be achieved by expressing the endpoints to one more decimal place than the observations themselves, i.e., the limits are converted to class boundaries to achieve continuity in the data.
Tally the Observations into the Classes
Count the Number of Items in each Class
The number of observations in each class is called the class frequency. Note that the class frequencies must sum to the total number of observations. After following these steps, we have organized the data into a tabular form called a frequency distribution, which can be used to summarize the pattern in the observations, i.e., the concentration of the data.
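
The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `sturges_classes` and `frequency_table`, the data in `scores`, the reading of the $2^k$ rule as "smallest $k$ with $2^k \geq n$", the rounding of the width up to the next whole number, and the use of half-open classes (lower limit included, upper limit excluded) are all choices made here, not part of the procedure itself.

```python
import math

def sturges_classes(n):
    """Number of classes from C = 1 + 3.3 log10(n), rounded to a whole number (assumed rounding)."""
    return round(1 + 3.3 * math.log10(n))

def frequency_table(data, num_classes=None):
    """Group numeric data into equal-width classes and count each class."""
    n = len(data)
    low, high = min(data), max(data)

    # Step 1: range = highest value - lowest value.
    data_range = high - low

    # Step 2: number of classes; default is the smallest k with 2^k >= n.
    if num_classes is None:
        num_classes = 1
        while 2 ** num_classes < n:
            num_classes += 1

    # Step 3: class width, rounded up to the next whole number so that the
    # classes together cover slightly more than the range.
    width = math.floor(data_range / num_classes) + 1

    # Step 4: class limits as half-open intervals [lower, upper), so every
    # observation belongs to exactly one class.
    limits = [(low + i * width, low + (i + 1) * width) for i in range(num_classes)]

    # Steps 5-6: tally the observations and count the items in each class.
    counts = [0] * num_classes
    for x in data:
        counts[int((x - low) // width)] += 1

    return list(zip(limits, counts))

# Hypothetical data: 20 made-up exam scores.
scores = [52, 67, 71, 58, 90, 84, 76, 63, 69, 95,
          73, 88, 61, 79, 55, 82, 70, 66, 93, 77]

table = frequency_table(scores)
for (lower, upper), f in table:
    print(f"{lower} to under {upper}\t{f}")
print(f"Total\t{sum(f for _, f in table)}")
```

Passing `num_classes=sturges_classes(len(scores))` would use the second rule instead; for these 20 scores both rules happen to suggest 5 classes.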

Note: Arranging/organizing the data into a tabulation or frequency distribution results in a loss of detailed information, as the individuality of the observations vanishes, i.e., in a frequency distribution, we cannot pinpoint the exact value of any observation, and we cannot tell the actual lowest and highest values of the data. However, the lower limit of the first class and the upper limit of the last class convey essentially the same meaning. So, in constructing frequency tables, the advantages of condensing the data into a more understandable and organized form more than offset this disadvantage.
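
The loss of detail can be seen in a small sketch: two different data sets can produce exactly the same frequency table. The data sets `a` and `b` and the helper `class_counts` are hypothetical, and the helper assumes every observation lies inside the stated classes.

```python
def class_counts(data, lower, width, num_classes):
    """Count observations in classes [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * num_classes
    for x in data:
        counts[int((x - lower) // width)] += 1
    return counts

# Two different made-up data sets with different lowest and highest values.
a = [12, 15, 18, 21, 27, 29]
b = [10, 11, 19, 25, 25, 26]

# Same two classes for both: 10 to under 20, and 20 to under 30.
counts_a = class_counts(a, lower=10, width=10, num_classes=2)
counts_b = class_counts(b, lower=10, width=10, num_classes=2)

print(counts_a, counts_b, counts_a == counts_b)
```

From the table alone there is no way to tell whether the smallest observation was 10 or 12; only the class limits 10 and 30 remain as bounds.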
