Using **descriptive statistics**, we can organize data to reveal its general pattern, see where the data values tend to concentrate, and expose extreme or unusual values. Let us start by learning about the Frequency Distribution Table and its construction.

A frequency distribution is a compact form of data in a table that displays the categories of observations according to their magnitudes and frequencies, such that similar or identical numerical values are grouped. The *categories* are also known as *groups*, *class intervals*, or simply *classes*. The classes must be *mutually exclusive*, so that each observation falls into exactly one class, and the table shows the number of observations in each class. The number of values falling in a particular category is called the frequency of that category, denoted by $f$.

A **Frequency Distribution Table** shows a summarized grouping of data divided into *mutually exclusive classes* and the number of occurrences in each class.

A **frequency distribution** is a way of turning raw (ungrouped or unorganized) data into grouped or organized data, for example to show results of sales, production, income, loans, death rates, height, weight, temperature, etc.
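
As a small illustration of turning raw data into frequencies, here is a minimal Python sketch for discrete values. The data set (`raw_data`) is made up for the example; `collections.Counter` from the standard library does the counting.

```python
from collections import Counter

# Hypothetical raw (ungrouped) data: number of items sold per order.
raw_data = [2, 3, 1, 2, 2, 4, 3, 2, 1, 3]

# Counter records how often each distinct value occurs (its frequency f).
frequency = Counter(raw_data)

# Print a simple two-column frequency table, ordered by value.
print("value  f")
for value in sorted(frequency):
    print(f"{value:>5}  {frequency[value]}")
```

Each distinct value acts as its own class here, so the classes are mutually exclusive by construction.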

The *relative frequency* of a category is the *proportion* of the observed frequency to the total frequency. It is obtained by dividing the *observed frequency* by the *total frequency* and is denoted by $r.f.$. The sum of the r.f. column should be one, except for rounding errors. Multiplying the **relative frequency** of each class by 100 gives the percentage occurrence of that class. A **relative frequency** *captures the relationship between a class total and the total number of observations.*
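
The relative frequency calculation can be sketched in a few lines of Python. The class frequencies below are invented for illustration, and the variable names are not taken from any particular library.

```python
# Hypothetical class frequencies (f) for four classes.
frequencies = [4, 10, 5, 1]

total = sum(frequencies)  # total frequency

# r.f. = observed frequency / total frequency
relative = [f / total for f in frequencies]

# Multiplying each r.f. by 100 gives the percentage occurrence.
percentage = [100 * rf for rf in relative]

print("  f  r.f.    %")
for f, rf, pct in zip(frequencies, relative, percentage):
    print(f"{f:>3}  {rf:.2f}  {pct:>3.0f}")

# The r.f. column should sum to one, up to rounding error.
print("sum of r.f. =", round(sum(relative), 10))
```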

The **Frequency Distribution Table** may be made for *continuous* data, *discrete* data, and categorical data (for both qualitative and quantitative data). It can also be used to draw graphs such as histograms, line charts, bar charts, pie charts, frequency polygons, Pareto charts, scatter diagrams, stem-and-leaf displays, etc.

### Steps for Creating a Frequency Distribution Table

- Decide on the number of classes. The number of classes is usually between 5 and 20. Too many or too few classes might not reveal the basic shape of the data set, and such a frequency distribution is also difficult to interpret. As a rule of thumb, the number of classes may be determined by the formula:

\[\text{Number of Classes} = C = 1 + 3.3 \log_{10}(n)\]

\[\text{or} \quad C \approx \sqrt{n}\]

where $n$ is the total number of observations in the data.
- Calculate the range of the data ($\text{Range} = \text{Max} - \text{Min}$) by finding the minimum and maximum data values. The range will be used to determine the class interval or class width.
- Decide on the width of the class, denoted by $h$ and obtained by

\[h = \frac{\text{Range}}{\text{Number of Classes}} = \frac{R}{C}\]

Generally, the class interval or class width is the same for all classes. Taken together, the classes must cover at least the distance from the lowest value (minimum) in the data set up to the highest (maximum) value. Equal class intervals are preferred in a frequency distribution, although unequal class intervals may be necessary in certain situations to avoid a large number of empty, or almost empty, classes.
- Decide the individual class limits and select a suitable starting point for the first class. The starting point is arbitrary; it may be less than or equal to the minimum value. Usually, the first class starts before the minimum value in such a way that the midpoint (the average of the lower and upper class limits of the first class) is properly placed.
- Take each observation in turn and mark a vertical bar (|) against the class to which it belongs. A running tally is kept until the last observation. Tally marks are conventionally grouped in fives.
- Find the frequencies, relative frequency, cumulative frequency, etc. as required.
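
The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the data set is made up, the number of classes and the class width are both rounded up to whole numbers, the first class starts exactly at the minimum, and each class is treated as the half-open interval `[lower, upper)` so that the classes stay mutually exclusive.

```python
import math
from itertools import accumulate

# Hypothetical raw data: weights (kg) of 20 people.
data = [52, 61, 48, 70, 66, 55, 59, 63, 74, 50,
        57, 68, 62, 45, 71, 64, 58, 53, 67, 60]
n = len(data)

# Step 1: number of classes, C = 1 + 3.3 log10(n), rounded up (assumption).
num_classes = math.ceil(1 + 3.3 * math.log10(n))

# Step 2: range of the data.
data_min, data_max = min(data), max(data)
data_range = data_max - data_min

# Step 3: class width h = Range / C, rounded up so the limits are tidy.
width = math.ceil(data_range / num_classes)

# Step 4: starting point of the first class (any value <= minimum works).
start = data_min

# Step 5: tally each observation into its class [lower, upper).
# The min(...) guard keeps the maximum in the last class if it lands
# exactly on the final upper limit.
frequencies = [0] * num_classes
for x in data:
    index = min((x - start) // width, num_classes - 1)
    frequencies[index] += 1

# Step 6: relative and cumulative frequencies.
relative = [f / n for f in frequencies]
cumulative = list(accumulate(frequencies))

print("class     f  r.f.  c.f.")
for i, f in enumerate(frequencies):
    lower = start + i * width
    upper = lower + width
    print(f"{lower}-{upper}  {f:>3}  {relative[i]:.2f}  {cumulative[i]:>4}")
```

For these 20 values the sketch produces 6 classes of width 5, starting at 45. Choosing a different starting point or rounding rule would shift the class limits, which is why the text calls the starting point arbitrary.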

A **frequency distribution** is said to be *skewed* when it is asymmetric, so that its mean and median generally differ. The *kurtosis* of a *frequency distribution* is the concentration of scores at the mean, or how peaked the distribution appears if depicted graphically, for example, in a histogram. If the distribution is more peaked than the *normal* distribution, it is said to be *leptokurtic*; if less peaked, it is said to be *platykurtic*.
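
Skewness and kurtosis can be quantified in several ways. The sketch below uses moment-based coefficients, which is an assumption on our part since no formula is named above. The helper functions `moment`, `skewness`, and `excess_kurtosis` and both data sets are hypothetical and written only for this example.

```python
def moment(values, k):
    """k-th central moment: the mean of (x - mean)**k."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)

def skewness(values):
    """Moment coefficient of skewness; 0 for a symmetric data set."""
    return moment(values, 3) / moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Kurtosis minus 3, so a normal distribution scores about 0.
    Positive: leptokurtic; negative: platykurtic."""
    return moment(values, 4) / moment(values, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]          # hypothetical symmetric data
right_skewed = [1, 1, 2, 2, 3, 10]   # hypothetical data with a long right tail

print("skewness (symmetric):    ", skewness(symmetric))
print("skewness (right-skewed): ", skewness(right_skewed))
print("excess kurtosis (symmetric):", excess_kurtosis(symmetric))
```

The symmetric data set has zero skewness and a negative excess kurtosis (platykurtic), while the second data set has positive skewness because of its long right tail.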

### Further Reading: Frequency Distribution Table

- http://mathworld.wolfram.com/FrequencyDistribution.html
- https://en.wikipedia.org/wiki/Frequency_distribution
- https://www150.statcan.gc.ca/n1/edu/power-pouvoir/ch8/5214814-eng.htm
