Basic Statistics - Statistics for Data Science & Analytics

Absolute Measure of Dispersion

Jul 6, 2025 / May 25, 2013 by Muhammad Imdad Ullah

An absolute measure of dispersion gives an idea about the amount of dispersion (spread) in a set of observations. These quantities measure the dispersion in the same units as the original data. Because it carries the units of the data, an absolute measure of dispersion cannot be used directly to compare the variation of two or more series (data sets) measured in different units. An absolute measure of dispersion also does not in itself tell whether the variation is large or small.

The absolute measures of dispersion are:

Range
Quartile Deviation
Mean Deviation
Variance or Standard Deviation

Range

The Range is the difference between the largest value and the smallest value in the data set. For ungrouped data, let $X_0$ be the smallest value and $X_n$ be the largest value in a data set; then the range ($R$) is defined as
$R=X_n-X_0$.

For grouped data, the range can be calculated in three different ways:
R = Midpoint of the highest class - Midpoint of the lowest class
R = Upper class limit of the highest class - Lower class limit of the lowest class
R = Upper class boundary of the highest class - Lower class boundary of the lowest class
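The range definitions above can be sketched in a few lines of Python. The sample values and the class limits below are hypothetical, and the class boundaries are assumed to lie 0.5 below and above the class limits.

```python
# Ungrouped data: range = largest value - smallest value.
heights = [91, 89, 88, 87, 92, 90, 98, 95]  # hypothetical sample values
ungrouped_range = max(heights) - min(heights)

# Grouped data: hypothetical classes given by their (lower, upper) class limits.
# Assumption: class boundaries lie 0.5 below/above the class limits.
classes = [(86, 90), (91, 95), (96, 100)]
lowest, highest = classes[0], classes[-1]

range_midpoints = sum(highest) / 2 - sum(lowest) / 2
range_limits = highest[1] - lowest[0]
range_boundaries = (highest[1] + 0.5) - (lowest[0] - 0.5)

print(ungrouped_range, range_midpoints, range_limits, range_boundaries)
```

The three grouped-data versions generally give different values, so it helps to state which one is being reported.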

Quartile Deviation (Semi-Interquartile Range)

The difference between the third and first quartiles is called the interquartile range, and half of this range is called the semi-interquartile range (SIQR) or simply the quartile deviation (QD), an absolute measure of dispersion. $$QD=\frac{Q_3-Q_1}{2}$$

The Quartile Deviation is superior to the range as it is not affected by extremely large or small observations. However, it does not give any information about the position of observations lying outside the two quartiles. It is not amenable to mathematical treatment and is greatly affected by sampling variability. Although the Quartile Deviation is not widely used as a measure of dispersion, it is used in situations in which extreme observations are thought to be unrepresentative or misleading. The Quartile Deviation is not based on all observations; it uses only the middle 50% of the data, which is why extreme observations do not affect it.

Note: For a symmetric distribution, the range “Median ± QD” contains approximately 50% of the data.
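A minimal sketch of how the quartile deviation behaves next to the range when one extreme value is present. It uses `statistics.quantiles` from the Python standard library with its default ("exclusive") method, which is only one of several quartile conventions; the data values are hypothetical.

```python
import statistics

def quartile_deviation(values):
    # statistics.quantiles with n=4 returns the three cut points [Q1, Q2, Q3].
    # Other quartile conventions may give slightly different Q1 and Q3.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return (q3 - q1) / 2

data = [1, 2, 3, 4, 5, 6, 7]            # hypothetical data
with_outlier = [1, 2, 3, 4, 5, 6, 700]  # same data, largest value replaced

qd_data = quartile_deviation(data)
qd_outlier = quartile_deviation(with_outlier)
range_data = max(data) - min(data)
range_outlier = max(with_outlier) - min(with_outlier)

print(qd_data, qd_outlier)        # QD is unchanged by the extreme value
print(range_data, range_outlier)  # the range changes drastically
```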

Mean Deviation (Average Deviation)

The Mean Deviation is another absolute measure of dispersion and is defined as the arithmetic mean of the absolute deviations measured either from the mean or from the median. All these deviations are counted as positive to avoid the difficulty arising from the property that the sum of deviations of observations from their mean is zero.

$MD=\frac{\sum|X-\overline{X}|}{n}\quad$ for ungrouped data for mean
$MD=\frac{\sum f|X-\overline{X}|}{\sum f}\quad$ for grouped data for mean
$MD=\frac{\sum|X-\tilde{X}|}{n}\quad$ for ungrouped data for median
$MD=\frac{\sum f|X-\tilde{X}|}{\sum f}\quad$ for grouped data for median
Mean Deviation can be calculated about other measures of central tendency, but it is least when the deviations are taken from the median.

The Mean Deviation gives more information than the range or the Quartile Deviation, as it is based on all the observed values. The Mean Deviation does not give undue weight to occasional large deviations, so it is a suitable choice in situations where such deviations are likely to occur.
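The following sketch computes the mean deviation about the mean and about the median for a small hypothetical data set and illustrates that the value about the median is the smaller of the two.

```python
import statistics

def mean_deviation(values, center):
    # Arithmetic mean of the absolute deviations from the chosen center.
    return sum(abs(x - center) for x in values) / len(values)

data = [2, 4, 6, 8, 30]  # hypothetical data with one large value

md_mean = mean_deviation(data, statistics.mean(data))      # center = 10
md_median = mean_deviation(data, statistics.median(data))  # center = 6

print(md_mean, md_median)
```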

Variance and Standard Deviation

This absolute measure of dispersion is defined as the mean of the squares of deviations of all the observations from their mean. Traditionally, population variance is denoted by $\sigma^2$ (sigma square) and for sample data is denoted by $S^2$ or $s^2$.

Symbolically
$\sigma^2=\frac{\sum(X_i-\mu)^2}{N}\quad$ Population Variance for ungrouped data
$S^2=\frac{\sum(X_i-\overline{X})^2}{n}\quad$ sample Variance for ungrouped data
$\sigma^2=\frac{\sum f(X_i-\mu)^2}{\sum f}\quad$ Population Variance for grouped data
$S^2=\frac{\sum f (X_i-\overline{X})^2}{\sum f}\quad$ Sample Variance for grouped data

The variance is denoted by $Var(X)$ for a random variable $X$. The term variance was introduced by R. A. Fisher (1890-1962) in 1918. The variance is expressed in squared units, so its value can be large compared to the observations themselves.
Note that there are alternative formulas to compute Variance or Standard deviation.
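As a rough illustration, the variance formulas with divisor $N$ (or $\sum f$) can be written directly in Python. The function names and the data are hypothetical; `statistics.pvariance` from the standard library is used only as a cross-check for the ungrouped case.

```python
import statistics

def population_variance(values):
    # Mean of the squared deviations from the mean (divisor N).
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def grouped_variance(midpoints, freqs):
    # Same idea for grouped data: each squared deviation is weighted by f.
    total = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / total
    return sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / total

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical ungrouped data
var_ungrouped = population_variance(data)
var_check = statistics.pvariance(data)

# Hypothetical grouped data: class midpoints and their frequencies.
var_grouped = grouped_variance([1, 2, 3], [1, 2, 1])

print(var_ungrouped, var_check, var_grouped)
```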

The positive square root of the variance is called the standard Deviation (SD) to express the deviation in the same units as the original observation. It is a measure of the average spread about the mean and is symbolically defined as

$\sigma=\sqrt{\frac{\sum(X_i-\mu)^2}{N}}\quad$ Population Standard Deviation for ungrouped data
$S=\sqrt{\frac{\sum(X_i-\overline{X})^2}{n}}\quad$ Sample Standard Deviation for ungrouped data
$\sigma=\sqrt{\frac{\sum f(X_i-\mu)^2}{\sum f}}\quad$ Population Standard Deviation for grouped data
$S=\sqrt{\frac{\sum f (X_i-\overline{X})^2}{\sum f}}\quad$ Sample Standard Deviation for grouped data
Standard Deviation is the most useful measure of dispersion, and the name Standard Deviation is credited to Karl Pearson (1857-1936).

In some texts, the sample variance is defined as $S^2=\frac{\sum (X_i-\overline{X})^2}{n-1}$, based on the argument that knowledge of any $n-1$ deviations determines the remaining deviation, as the sum of the $n$ deviations must be zero. Defined this way, $S^2$ is an unbiased estimator of the population variance $\sigma^2$. The Standard Deviation has a definite mathematical meaning; it utilizes all the observed values and is amenable to mathematical treatment, but it is affected by extreme values.
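A short sketch comparing the two divisors for the standard deviation on hypothetical data. In the Python standard library, `statistics.pstdev` uses the divisor $n$ and `statistics.stdev` uses $n-1$; they are shown here only as a cross-check of the hand computation.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
n = len(data)
mean = sum(data) / n
ss = sum((x - mean) ** 2 for x in data)  # sum of squared deviations

sd_n = math.sqrt(ss / n)                 # divisor n
sd_n_minus_1 = math.sqrt(ss / (n - 1))   # divisor n - 1

print(sd_n, statistics.pstdev(data))
print(sd_n_minus_1, statistics.stdev(data))
```

The divisor $n-1$ always gives the slightly larger value, and the difference shrinks as $n$ grows.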

Percentiles: Relative Standing

Jul 6, 2025 / Mar 10, 2013 by Muhammad Imdad Ullah

Percentiles are a measure of the relative standing of an observation within a dataset. Percentiles divide a set of observations into 100 equal parts, and percentile scores are frequently used to report results from national standardized tests such as the NAT, GAT, and GRE.

The $p$th percentile is the value $Y_{(p)}$ in the order statistics such that $p$ percent of the values are less than $Y_{(p)}$ and $(100-p)$ percent of the values are greater than $Y_{(p)}$. The 5th percentile is denoted by $P_5$, the 10th by $P_{10}$, and the 95th by $P_{95}$.

Percentiles for the Ungrouped Data

To calculate percentiles (a measure of the relative standing of an observation) for the ungrouped data, adopt the following procedure:

Order the observations.
For the $m$th percentile, determine the product $\frac{m \times n}{100}$. If $\frac{m \times n}{100}$ is not an integer, round it up and take the corresponding ordered value. If $\frac{m \times n}{100}$ is an integer, say $k$, then calculate the mean of the $k$th and $(k+1)$th ordered observations.
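The two-step procedure above can be sketched as a small Python function. The function name and the data are hypothetical, and the sketch assumes that $m$ is a whole number with $0 < m < 100$.

```python
def percentile_ungrouped(values, m):
    """m-th percentile (integer m, 0 < m < 100) using the two-case rule."""
    ordered = sorted(values)
    n = len(ordered)
    product = m * n  # numerator of m*n/100, kept as an integer
    if product % 100 == 0:
        k = product // 100  # m*n/100 is an integer k
        # Mean of the k-th and (k+1)-th ordered values (1-based positions).
        return (ordered[k - 1] + ordered[k]) / 2
    k = product // 100 + 1  # not an integer: round up
    return ordered[k - 1]

scores = [15, 20, 35, 40, 50]  # hypothetical data, n = 5
p40 = percentile_ungrouped(scores, 40)  # 40*5/100 = 2   -> mean of 2nd and 3rd
p50 = percentile_ungrouped(scores, 50)  # 50*5/100 = 2.5 -> 3rd value
print(p40, p50)
```

Statistical software often uses other interpolation rules, so results from a library routine may differ slightly from this textbook rule.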

Ungrouped Data Example

For the following height data collected from students, find the 10th and 95th percentiles. 91, 89, 88, 87, 89, 91, 87, 92, 90, 98, 95, 97, 96, 100, 101, 96, 98, 99, 98, 100, 102, 99, 101, 105, 103, 107, 105, 106, 107, 112.

Solution: The ordered observations of the data are 87, 87, 88, 89, 89, 90, 91, 91, 92, 95, 96, 96, 97, 98, 98, 98, 99, 99, 100, 100, 101, 101, 102, 103, 105, 105, 106, 107, 107, 112.

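For the 10th percentile,
\[\frac{m \times n}{100}= \frac{10 \times 30}{100}=3\]

Since 3 is an integer, the 10th percentile is the mean of the 3rd and 4th ordered observations, i.e., $P_{10}=\frac{88+89}{2}=88.5$, which means that 10 percent of the observations in the data set are less than 88.5.

For the 95th percentile,
\[\frac{m \times n}{100}=\frac{95 \times 30}{100}=28.5\]

Since 28.5 is not an integer, it is rounded up to 29, so the 29th ordered observation is the 95th percentile, i.e., $P_{95}=107$.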

Percentiles for the Frequency Distribution Table (Grouped data)

The $m$th percentile (a measure of the relative standing of an observation) for the Frequency Distribution Table (grouped data) is

\[P_m=l+\frac{h}{f}\left(\frac{m \times n}{100}-c\right)\]

Like the median, $\frac{m \times n}{100}$ is used to locate the class containing the $m$th percentile. Here

$l$    is the lower class boundary of the class containing $P_m$
$h$   is the width of the class containing $P_m$
$f$    is the frequency of the class containing $P_m$
$n$   is the total number of observations (the sum of the frequencies)
$c$    is the cumulative frequency of the class immediately preceding the class containing $P_m$

Note that the 50th percentile is the median by definition, as half of the values in the data are smaller than the median and half of the values are larger than the median. Similarly, the 25th and 75th percentiles are the lower ($Q_1$) and upper quartiles ($Q_3$), respectively. The quartiles, deciles, and percentiles are also called quantiles or fractiles.

Grouped Data Example

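For the following grouped data (the height data above, grouped into classes of width 5), compute $P_{10}$, $P_{25}$, $P_{50}$, and $P_{95}$.

\[\begin{array}{ccc}
\text{Class boundaries} & \text{Frequency } (f) & \text{Cumulative frequency}\\
\hline
85.5 - 90.5 & 6 & 6\\
90.5 - 95.5 & 4 & 10\\
95.5 - 100.5 & 10 & 20\\
100.5 - 105.5 & 6 & 26\\
105.5 - 110.5 & 3 & 29\\
110.5 - 115.5 & 1 & 30
\end{array}\]

Solution: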

Locate the 10th percentile (the first decile, i.e., $D_1$) by $\frac{10 \times n}{100}=\frac{10 \times 30}{100}=3$rd observation.
So, the $P_{10}$ group is 85.5–90.5, containing the 3rd observation.
\begin{align*}
P_{10}&=l+\frac{h}{f}\left(\frac{10 n}{100}-c\right)\\
&=85.5+\frac{5}{6}(3-0)\\
&=85.5+2.5=88
\end{align*}
Locate the 25th percentile (the lower quartile, i.e., $Q_1$) by $\frac{25 \times n}{100}=\frac{25 \times 30}{100}=7.5$th observation.
So, the $P_{25}$ group is 90.5–95.5, containing the 7.5th observation.
\begin{align*}
P_{25}&=l+\frac{h}{f}\left(\frac{25 n}{100}-c\right)\\
&=90.5+\frac{5}{4}(7.5-6)\\
&=90.5+1.875=92.375
\end{align*}
Locate the 50th percentile (the median, i.e., the 2nd quartile or 5th decile) by $\frac{50 \times n}{100}=\frac{50 \times 30}{100}=15$th observation.
So, the $P_{50}$ group is 95.5–100.5, containing the 15th observation.
\begin{align*}
P_{50}&=l+\frac{h}{f}\left(\frac{50 n}{100}-c\right)\\
&=95.5+\frac{5}{10}(15-10)\\
&=95.5+2.5=98
\end{align*}
Locate the 95th percentile by $\frac{95 \times n}{100}=\frac{95 \times 30}{100}=28.5$th observation.
So, the $P_{95}$ group is 105.5–110.5, containing the 28.5th observation.
\begin{align*}
P_{95}&=l+\frac{h}{f}\left(\frac{95 n}{100}-c\right)\\
&=105.5+\frac{5}{3}(28.5-26)\\
&=105.5+4.1667=109.6667
\end{align*}
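The worked values above can be checked with a short sketch of the grouped-data formula. The function name is hypothetical, and the class boundaries and frequencies are an assumption: they are inferred from the $l$, $f$, and $c$ values used in the solution together with the height data of the ungrouped example.

```python
def grouped_percentile(m, boundaries, freqs):
    """m-th percentile P_m = l + (h / f) * (m*n/100 - c) for grouped data."""
    n = sum(freqs)
    target = m * n / 100
    c = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and c + f >= target:
            h = upper - lower  # class width
            return lower + h * (target - c) / f
        c += f
    raise ValueError("m must lie between 0 and 100")

# Assumed frequency distribution (inferred, see the note above).
boundaries = [(85.5, 90.5), (90.5, 95.5), (95.5, 100.5),
              (100.5, 105.5), (105.5, 110.5), (110.5, 115.5)]
freqs = [6, 4, 10, 6, 3, 1]

for m in (10, 25, 50, 95):
    print(m, round(grouped_percentile(m, boundaries, freqs), 4))
```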

The percentiles and quartiles may be read directly from the graphs of the cumulative frequency function.

Further Reading: https://en.wikipedia.org/wiki/Percentile

Measure of Dispersion or Variability (2012)

Aug 3, 2024 / Sep 1, 2012 by Muhammad Imdad Ullah

Introduction to Measure of Dispersion

A measure of location (an average or measure of central tendency) is not sufficient to describe the characteristics of a distribution, because two or more distributions may have exactly the same average even though they are dissimilar in other respects. A measure of central tendency represents only the typical value of the data set. To give a sensible description of data, we also need a numerical quantity, called a measure of dispersion (variability or scatter), that describes the spread of the values in a set of data. There are two types of measures of dispersion or variability:

Absolute Measures
Relative Measures

A measure of central tendency together with a measure of dispersion gives a more adequate description of data than a measure of location alone. Averages or measures of central tendency describe only the balancing point of the data set; they do not provide any information about the degree to which the data tend to spread or scatter about the average value. The measure of dispersion therefore indicates how well the measure of central tendency represents the data: the smaller the variability of a given data set, the better the average represents it.
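As a small illustration of why an average alone is not enough, the sketch below compares two hypothetical data sets that share the same mean but differ greatly in spread.

```python
import statistics

# Two hypothetical data sets with the same mean.
set_a = [48, 49, 50, 51, 52]
set_b = [10, 30, 50, 70, 90]

mean_a, mean_b = statistics.mean(set_a), statistics.mean(set_b)
sd_a, sd_b = statistics.pstdev(set_a), statistics.pstdev(set_b)

print(mean_a, mean_b)  # identical averages
print(sd_a, sd_b)      # very different standard deviations
```

The mean of 50 describes the first set well, since every value lies within 2 units of it, but it says much less about the second set.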

Absolute Measure of Dispersion

Absolute measures are defined in such a way that they have the same units (meters, grams, etc.) as the original measurements. Absolute measures cannot be used to compare the variation/spread of two or more data sets measured in different units.
The most common absolute measures of variability are:

Range
Semi-Interquartile Range or Quartile Deviation
Mean Deviation
Variance
Standard Deviation

Relative Measures of Dispersion

The relative measures have no units, as these are ratios, coefficients, or percentages. Relative measures are independent of the units of measurement and are useful for comparing data of different natures. Common relative measures of dispersion are:

Coefficient of Variation
Coefficient of Mean Deviation
Coefficient of Quartile Deviation
Coefficient of Standard Deviation
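A minimal sketch of the unit-independence of a relative measure, using the coefficient of variation. The formula used here, $CV=\frac{SD}{\text{mean}}\times 100$, is the standard definition and is not spelled out in the text above; the data and the function name are hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    # Standard deviation expressed as a percentage of the mean.
    return statistics.pstdev(values) / statistics.mean(values) * 100

heights_cm = [150, 160, 170, 180, 190]     # hypothetical heights in centimeters
heights_m = [x / 100 for x in heights_cm]  # the same heights in meters

sd_cm = statistics.pstdev(heights_cm)
sd_m = statistics.pstdev(heights_m)
cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

print(sd_cm, sd_m)  # the absolute measure depends on the unit
print(cv_cm, cv_m)  # the relative measure is the same (up to rounding)
```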

Different terms are used for the measure of dispersion, such as variability, spread, scatter, measure of uncertainty, and deviation.

References:
http://www2.le.ac.uk/offices/careers/ld/resources/numeracy/variability
