Creating Frequency Distribution Table

Descriptive statistics lets us organize data to reveal its general pattern, see where the values tend to concentrate, and expose extreme or unusual values. Let us start by learning about the Frequency Distribution Table and its construction.

Frequency and Frequency Distribution

A frequency distribution is a compact tabular summary of data that groups observations into categories according to their magnitudes and shows how often each category occurs, so that similar or identical numerical values are grouped together. The categories are also known as groups, class intervals, or simply classes. The classes must be mutually exclusive, so that each observation falls into exactly one class. The number of values falling in a particular class is called the frequency of that class, denoted by $f$.

A Frequency Distribution Table summarizes data by dividing it into mutually exclusive classes and showing the number of occurrences in each class. It is a way of turning raw (ungrouped or unorganized) data into grouped or organized data, for example to summarize sales, production, income, loans, death rates, heights, weights, temperatures, etc.
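As a minimal sketch of the idea, the following Python snippet counts how often each distinct value occurs in a small discrete data set. The data (`raw_data`, a made-up count of siblings for 15 students) and the variable names are assumptions for illustration only; `collections.Counter` is from the Python standard library.

```python
from collections import Counter

# Hypothetical raw (ungrouped) data: number of siblings of 15 students.
raw_data = [2, 0, 1, 3, 2, 1, 0, 2, 4, 1, 2, 3, 1, 2, 0]

# Each distinct value acts as a class; Counter maps class -> frequency f.
frequency = Counter(raw_data)

print("Value  f")
for value in sorted(frequency):
    print(f"{value:>5}  {frequency[value]}")

# The frequencies always add up to the total number of observations.
total = sum(frequency.values())
print("Total ", total)
```

Because every observation is counted in exactly one class, the classes here are mutually exclusive by construction.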

Relative Frequency

The relative frequency of a class is the ratio of its observed frequency to the total frequency, that is, $\text{r.f.} = f / \sum f$, and is denoted by r.f. The r.f. column should sum to one, apart from rounding errors. Multiplying the relative frequency of a class by 100 gives the percentage of observations in that class. A relative frequency thus captures the relationship between a class total and the total number of observations.
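A short sketch of this calculation in Python is given below. The class labels and frequencies in `frequencies` are hypothetical values chosen for illustration, not data from this post.

```python
import math

# Hypothetical class frequencies (class label -> f).
frequencies = {"A": 8, "B": 12, "C": 15, "D": 5}

total = sum(frequencies.values())

# Relative frequency: observed frequency divided by total frequency.
relative = {label: f / total for label, f in frequencies.items()}

# Percentage occurrence: relative frequency multiplied by 100.
percent = {label: 100 * rf for label, rf in relative.items()}

print("Class   f   r.f.     %")
for label, f in frequencies.items():
    print(f"{label:<5} {f:>3}  {relative[label]:.3f}  {percent[label]:5.1f}")

# The r.f. column should sum to one (up to rounding error).
rf_sum = sum(relative.values())
print("Sum of r.f.:", round(rf_sum, 10))
```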

The Frequency Distribution Table may be made for continuous, discrete, and categorical data (that is, for both quantitative and qualitative data). It can also be used to draw graphs such as histograms, line charts, bar charts, pie charts, frequency polygons, Pareto charts, scatter diagrams, stem-and-leaf displays, etc.

Steps of Creating a Frequency Distribution Table

  1. Decide on the number of classes. The number of classes is usually between 5 and 20. Too many or too few classes might not reveal the basic shape of the data set, and such a frequency distribution will be difficult to interpret. An approximate number of classes may be determined by the formula:
    \[\text{Number of Classes} = C = 1 + 3.3 \log_{10}(n)\]
    \[\text{or} \quad C = \sqrt{n} \quad \text{(approximately)}\]
    where $n$ is the total number of observations in the data.
  2. Calculate the range of the data ($\text{Range} = \text{Max} - \text{Min}$) by finding the minimum and maximum data values. The range will be used to determine the class interval or class width.
  3. Decide on the width of the class, denoted by $h$ and obtained by
    \[h = \frac{\text{Range}}{\text{Number of Classes}}= \frac{R}{C} \]
    Generally, the class interval or class width is the same for all classes. Taken together, the classes must cover at least the distance from the lowest value (minimum) in the data set to the highest (maximum) value. Equal class intervals are preferred in a frequency distribution, although unequal class intervals may be necessary in certain situations to avoid a large number of empty or almost empty classes.
  4. Decide the individual class limits and select a suitable starting point for the first class. The starting point is arbitrary; it may be less than or equal to the minimum value. Usually, it is placed a little before the minimum value in such a way that the midpoint (the average of the lower and upper class limits of the first class) is properly placed.
  5. Take an observation and mark a vertical bar (|) for the class it belongs to. A running tally is kept till the last observation. Tally marks are usually grouped in fives, with every fifth mark drawn across the previous four, so that they are easy to count.
  6. Find the frequencies, relative frequencies, cumulative frequencies, etc., as required.
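The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `data` values are hypothetical marks of 30 students, the class count and class width are rounded up to whole numbers, the first class starts at the minimum value, and each class is treated as a half-open interval that includes its lower limit but not its upper limit. Other conventions are equally valid.

```python
import math

# Hypothetical raw data: marks of 30 students.
data = [
    52, 67, 45, 89, 73, 31, 58, 64, 77, 49,
    55, 62, 70, 38, 81, 66, 59, 47, 74, 53,
    61, 68, 42, 85, 57, 63, 71, 50, 79, 65,
]
n = len(data)

# Step 1: number of classes, C = 1 + 3.3 log10(n), rounded up (an assumption).
num_classes = math.ceil(1 + 3.3 * math.log10(n))

# Step 2: range = max - min.
data_min, data_max = min(data), max(data)
data_range = data_max - data_min

# Step 3: class width h = range / C, rounded up to a whole number.
width = math.ceil(data_range / num_classes)

# Step 4: start the first class at the minimum (an arbitrary choice).
start = data_min
# Add a class if the maximum would otherwise fall outside the last class.
while start + num_classes * width <= data_max:
    num_classes += 1

# Step 5: tally each observation into its class [lower, lower + width).
freq = [0] * num_classes
for x in data:
    freq[(x - start) // width] += 1

# Step 6: frequencies, relative frequencies, cumulative frequencies.
rows = []
cumulative = 0
for i, f in enumerate(freq):
    lower = start + i * width
    cumulative += f
    rows.append((lower, lower + width, f, f / n, cumulative))

print("Class        f   r.f.   c.f.")
for lower, upper, f, rf, cf in rows:
    print(f"{lower:>3} - <{upper:<4} {f:>3}  {rf:.3f}  {cf:>3}")
```

The last cumulative frequency equals $n$, which is a quick check that every observation was tallied exactly once.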

A frequency distribution is said to be skewed when it is not symmetric; in a skewed distribution the mean and median generally differ. The kurtosis of a frequency distribution describes the concentration of scores around the mean, or how peaked the distribution appears when depicted graphically, for example, in a histogram. If the distribution is more peaked than the normal distribution, it is said to be leptokurtic; if less peaked, it is said to be platykurtic.
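One common way to put numbers on these two ideas is through moment-based coefficients, sketched below in plain Python. The function name `moment_skew_kurtosis` and the `sample` data are hypothetical, and the moment coefficients are only one of several measures in use; this post does not prescribe a particular formula.

```python
import statistics


def moment_skew_kurtosis(values):
    """Return (skewness, excess kurtosis) based on central moments."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments of order 2, 3 and 4 (population form, divide by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    # Subtract 3 so that a normal distribution has excess kurtosis 0.
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis


# Hypothetical sample with one large value in the right tail.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 12]

skew, kurt = moment_skew_kurtosis(sample)
print("mean    :", statistics.fmean(sample))
print("median  :", statistics.median(sample))
print("skewness:", round(skew, 3))
print("excess kurtosis:", round(kurt, 3))
```

With this measure, a positive skewness indicates a longer right tail, a positive excess kurtosis corresponds to a leptokurtic distribution, and a negative one to a platykurtic distribution.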


Further Reading: Frequency Distribution Table

Frequently Asked Questions

  • What is a frequency distribution table?
  • What is meant by mutually exclusive classes?
  • What is relative frequency?
  • What are the steps used for creating a frequency distribution table?
