Frequency Distribution Table
Using descriptive statistics, we can organize data to reveal its general pattern, see where the data values tend to concentrate, and expose extreme or unusual values.
A frequency distribution is a compact tabular form of data that displays the categories of observations according to their magnitudes and frequencies, so that similar or identical numerical values are grouped together. The categories are also known as groups, class intervals, or simply classes. The classes must be mutually exclusive, so that each observation falls in exactly one class. The number of values falling in a particular category is called the frequency of that category and is denoted by f.
A frequency distribution therefore gives a summarized grouping of the data: the mutually exclusive classes and the number of occurrences in each class. It is a way of turning raw (ungrouped or unorganized) data into grouped or organized data, for example to summarize sales, production, income, loans, death rates, height, weight, temperature, etc.
The relative frequency of a category, denoted by r.f., is the ratio of its observed frequency to the total frequency. The r.f. column should sum to one, apart from rounding error. Multiplying the relative frequency of each class by 100 gives the percentage occurrence of that class. A relative frequency thus captures the relationship between a class total and the total number of observations.
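The calculation can be sketched in a few lines of Python. This is only an illustration: the `observations` list is made-up sample data and `relative_frequency_table` is a hypothetical helper name, neither of which comes from the text above.

```python
from collections import Counter

# Hypothetical sample: blood groups of 10 donors (illustrative data only).
observations = ["A", "B", "O", "A", "O", "O", "AB", "A", "O", "B"]


def relative_frequency_table(values):
    """Return rows of (category, f, r.f., percentage), sorted by category."""
    counts = Counter(values)          # frequency f of each category
    total = sum(counts.values())      # total frequency
    return [
        (category, f, f / total, 100 * f / total)
        for category, f in sorted(counts.items())
    ]


table = relative_frequency_table(observations)
for category, f, rf, pct in table:
    print(f"{category:<4}{f:>3}{rf:>8.2f}{pct:>8.1f}%")
```

Summing the third column of `table` should give one, up to floating-point rounding, which is a quick check that no observation was missed.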
A frequency distribution may be made for continuous data, discrete data, and categorical data (that is, for both qualitative and quantitative data). It can also be used to draw graphs such as the histogram, line chart, bar chart, pie chart, frequency polygon, etc.
The steps to make a frequency distribution of data are:
- Decide on the number of classes. The number of classes is usually between 5 and 20. Too many or too few classes might not reveal the basic shape of the data set, and such a frequency distribution will also be difficult to interpret. The maximum number of classes may be determined by the formula:
\[\text{Number of Classes} = C = 1 + 3.3 \log_{10}(n)\]
\[\text{or} \quad C = \sqrt{n} \quad \text{(approximately)}\]
where $n$ is the total number of observations in the data.
- Calculate the range of the data (Range = Max – Min) by finding the minimum and maximum data values. The range is used to determine the class interval, or class width.
- Decide on the width of the class, denoted by h and obtained by
\[h = \frac{\text{Range}}{\text{Number of Classes}}= \frac{R}{C} \]
Generally the class interval, or class width, is the same for all classes. Taken together, the classes must cover at least the distance from the lowest value (minimum) in the data set up to the highest value (maximum). Equal class intervals are preferred in a frequency distribution, although unequal class intervals may be necessary in certain situations to avoid a large number of empty, or almost empty, classes.
- Decide the individual class limits and select a suitable starting point for the first class. The starting point is arbitrary; it may be less than or equal to the minimum value. Usually the first class starts a little below the minimum value, chosen so that the midpoint (the average of the lower and upper class limits of the first class) is conveniently placed.
- Take each observation in turn and mark a vertical bar (|) against the class it belongs to. A running tally is kept until the last observation. Tally marks are conventionally grouped in fives, with every fifth mark drawn across the previous four.
- Find the frequencies, relative frequency, cumulative frequency etc. as required.
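The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive procedure: the `marks` data are made up, `frequency_distribution` is a hypothetical helper, the number of classes from $1 + 3.3 \log_{10}(n)$ is rounded to the nearest integer, the width $h = R/C$ is rounded up to a whole number (which suits integer-valued data), and each class is treated as the half-open interval from its lower limit up to, but not including, its upper limit so that the classes stay mutually exclusive.

```python
import math


def frequency_distribution(data, start=None):
    """Build a grouped frequency table from a non-empty list of numbers."""
    n = len(data)

    # Step 1: number of classes, C = 1 + 3.3 log10(n), rounded (assumption).
    num_classes = round(1 + 3.3 * math.log10(n))

    # Step 2: range = max - min.
    low, high = min(data), max(data)

    # Step 3: class width h = R / C, rounded up to a whole number (assumption);
    # the max(1, ...) guard avoids a zero width when all values are equal.
    width = max(1, math.ceil((high - low) / num_classes))

    # Step 4: class limits; the starting point may be at or below the minimum.
    lower = low if start is None else start
    limits = []
    while not limits or limits[-1][1] <= high:  # extend until max is covered
        limits.append((lower, lower + width))
        lower += width

    # Steps 5 and 6: count (tally) observations per class, then derive
    # relative and cumulative frequencies.
    rows, cumulative = [], 0
    for lo, hi in limits:
        f = sum(1 for x in data if lo <= x < hi)
        cumulative += f
        rows.append({
            "class": (lo, hi),
            "midpoint": (lo + hi) / 2,
            "f": f,
            "rf": f / n,
            "cf": cumulative,
        })
    return rows


# Hypothetical marks of 20 students (illustrative data only).
marks = [12, 15, 17, 21, 22, 24, 25, 28, 29, 31,
         33, 34, 36, 38, 41, 42, 45, 47, 52, 58]

rows = frequency_distribution(marks, start=10)
for row in rows:
    lo, hi = row["class"]
    print(f"{lo:>3} to <{hi:<3} {'|' * row['f']:<8} "
          f"f={row['f']:<3} rf={row['rf']:.2f} cf={row['cf']}")
```

Starting the first class at 10 rather than at the minimum of 12 is the kind of arbitrary but convenient choice the steps describe: it places the class limits on round numbers and the midpoints at 15, 25, and so on.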
A frequency distribution is said to be skewed when it is not symmetric, which typically shows up as a difference between its mean and median. The kurtosis of a frequency distribution is the concentration of scores at the mean, or how peaked the distribution appears when depicted graphically, for example in a histogram. If the distribution is more peaked than the normal distribution, it is said to be leptokurtic; if it is less peaked, it is said to be platykurtic.
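As a rough numerical illustration, skewness and kurtosis can be estimated from raw data with moment-based coefficients. The sketch below is an assumption layered on the description above, which does not name a formula: it uses the common moment coefficients $m_3 / m_2^{3/2}$ for skewness and $m_4 / m_2^2 - 3$ for excess kurtosis (positive suggests leptokurtic, negative suggests platykurtic). The function name `shape_summary` and both sample lists are hypothetical.

```python
from statistics import mean, median


def shape_summary(values):
    """Return mean, median, moment skewness, and excess kurtosis.

    Assumes at least two distinct values, so the variance is non-zero.
    """
    n = len(values)
    m = mean(values)

    def central_moment(k):
        # k-th central moment, dividing by n (population form).
        return sum((x - m) ** k for x in values) / n

    m2, m3, m4 = central_moment(2), central_moment(3), central_moment(4)
    return {
        "mean": m,
        "median": median(values),
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3,  # 0 for a normal distribution
    }


# Illustrative data only: one symmetric sample, one with a long right tail.
symmetric = shape_summary([1, 2, 3, 4, 5])
right_skewed = shape_summary([1, 1, 2, 2, 3, 10])

print(symmetric)
print(right_skewed)
```

In the symmetric sample the mean and median coincide and the skewness is zero, while in the second sample the single large value pulls the mean above the median and makes the skewness positive.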