Quantiles or Fractiles Uncovered (2020)

When the number of observations is sufficiently large, the principle by which a distribution is divided into two equal parts may be extended to divide the distribution into four, five, eight, ten, or a hundred equal parts. The median, quartile, decile, and percentile values are collectively called quantiles or fractiles. Let us start learning about Quantiles or Fractiles.

Quartiles

These are the values that divide a distribution into four equal parts. There are three quartiles denoted by $Q_1, Q_2$, and $Q_3$. If $x_1,x_2,\cdots,x_n$ are $n$ observations on a variable $X$, and $x_{(1)}, x_{(2)}, \cdots, x_{(n)}$ is their ordered array, then the $r$th quartile $Q_r$ is the value of $X$ such that $\frac{r}{4}$ of the observations are less than that value and $\frac{4-r}{4}$ of the observations are greater.

For example, $Q_1$ is the value of $X$ such that $\frac{1}{4}$ of the observations are less than it and $\frac{4-1}{4}=\frac{3}{4}$ of the observations are greater, while $Q_3$ is the value of $X$ such that $\frac{3}{4}$ of the observations are less than it and $\frac{4-3}{4}=\frac{1}{4}$ of the observations are greater.

Deciles

These are the values that divide a distribution into ten equal parts. There are 9 deciles $D_1, D_2, \cdots, D_9$.

Percentiles

These are the values that divide a distribution into a hundred equal parts. There are 99 percentiles denoted as $P_1,P_2,\cdots, P_{99}$.

The median, quartiles, deciles, percentiles, and other partition values are collectively called quantiles or fractiles. Every quantile can be expressed as a percentile. For example, $P_{50}$, $Q_2$, and $D_5$ are all equal to the median.

\begin{align*}
Q_2 &= D_5 = P_{50}\\
Q_1 &= P_{25} = D_{2.5}\\
Q_3 &= P_{75}=D_{7.5}
\end{align*}
The $r$th quartile, $k$th decile, and $j$th percentile are located in the ordered array by the following relations:

For ungrouped data
\begin{align}
Q_r &=\frac{r(n+1)}{4}\text{th value in the distribution and } r=1,2,3\\
D_k &=\frac{k(n+1)}{10}\text{th value in the distribution and } k=1,2,\cdots, 9\\
P_j &=\frac{j(n+1)}{100}\text{th value in the distribution and } j=1,2,\cdots, 99
\end{align}

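For grouped data
\begin{align}
Q_r&= l+\frac{h}{f}\left(\frac{rn}{4}-c\right)\\
D_k&= l+\frac{h}{f}\left(\frac{kn}{10}-c\right)\\
P_j&= l+\frac{h}{f}\left(\frac{jn}{100}-c\right)
\end{align}
where $l$ is the lower class boundary of the class containing the quantile, $h$ is the width of that class, $f$ is its frequency, $c$ is the cumulative frequency of the preceding class, and $n$ is the total number of observations.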
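The grouped-data formulas can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `grouped_quantile`, its arguments, and the frequency table are hypothetical and chosen only for the example. It assumes that `boundaries` holds the class boundaries in ascending order and that the quantile class is the first class whose cumulative frequency reaches $\frac{rn}{4}$, $\frac{kn}{10}$, or $\frac{jn}{100}$.

```python
def grouped_quantile(boundaries, freqs, r, parts):
    """Return the r-th of `parts` quantiles: l + (h / f) * (r * n / parts - c)."""
    if not 0 < r < parts:
        raise ValueError("r must satisfy 0 < r < parts")
    n = sum(freqs)                 # total number of observations
    target = r * n / parts         # rn/4, kn/10 or jn/100
    c = 0                          # cumulative frequency of preceding classes
    for k, f in enumerate(freqs):
        if f > 0 and c + f >= target:
            l = boundaries[k]                       # lower class boundary
            h = boundaries[k + 1] - boundaries[k]   # class width
            return l + (h / f) * (target - c)
        c += f
    raise ValueError("frequencies must contain at least one observation")


# Hypothetical frequency table: classes 0-10, 10-20, 20-30, 30-40
boundaries = [0, 10, 20, 30, 40]
freqs = [5, 10, 20, 5]

q1 = grouped_quantile(boundaries, freqs, 1, 4)       # first quartile
median = grouped_quantile(boundaries, freqs, 2, 4)   # Q2
d5 = grouped_quantile(boundaries, freqs, 5, 10)      # fifth decile
p50 = grouped_quantile(boundaries, freqs, 50, 100)   # fiftieth percentile
print(q1, median, d5, p50)
```

For this made-up table, $Q_2$, $D_5$, and $P_{50}$ come out identical, as the equivalences above require.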

Procedure for obtaining Percentile

A procedure for obtaining percentile (quartiles, deciles) of a data set of size $n$ is as follows:

Step 1: Arrange the data in ascending order.
Step 2: Compute an index $i$ as follows: $i=\frac{p}{100} (n+1)$ (in the case of an odd number of observations).

  • If $i$ is an integer, the $p$th percentile is the $i$th data value.
  • If $i$ is not an integer, either round $i$ up to the next integer and take the value at that position, or interpolate linearly between the values at the positions just below and just above $i$.

Percentile Example

Consider the following (sorted) data values: 380, 600, 690, 890, 1050, 1100, 1200, 1900, 890000.

For the $p=10$th percentile, $i=\frac{p}{100} (n+1) =\frac{10}{100} (9+1)= 1$. So the 10th percentile is the first sorted value or 380.

For the $p=75$th percentile, $i=\frac{p}{100} (n+1)= \frac{75}{100}(9+1) = 7.5$.

Since $i$ is not an integer, the percentile lies halfway between the 7th and 8th values, so we compute 7th value + (8th value - 7th value) $\times 0.5$. That is, $1200 + (1900-1200)\times 0.5 = 1200+350 = 1550$.
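The procedure and the example above can be reproduced with a short Python sketch. The helper name `percentile` is hypothetical. It assumes the $(n+1)$ position rule with linear interpolation for non-integer positions, and it clamps the position to the range of the data, which is an added assumption for very small or very large $p$. The cross-check uses `statistics.quantiles` from the Python standard library (Python 3.8 or later), whose `"exclusive"` method is based on the same $(n+1)$ rule.

```python
import math
import statistics


def percentile(values, p):
    """p-th percentile using the (n + 1) position rule with interpolation."""
    data = sorted(values)
    n = len(data)
    i = p * (n + 1) / 100          # 1-based position in the sorted array
    i = min(max(i, 1), n)          # assumption: keep the position inside the data
    lower = math.floor(i)          # integer part of the position
    frac = i - lower               # fractional part drives the interpolation
    if frac == 0:
        return data[lower - 1]     # integer position: take that value directly
    return data[lower - 1] + frac * (data[lower] - data[lower - 1])


data = [380, 600, 690, 890, 1050, 1100, 1200, 1900, 890000]

p10 = percentile(data, 10)   # position 1   -> first sorted value
p75 = percentile(data, 75)   # position 7.5 -> halfway between 7th and 8th
quartiles = statistics.quantiles(data, n=4, method="exclusive")

print(p10, p75, quartiles)
```

The third element of `quartiles` should agree with `p75`, since $Q_3 = P_{75}$.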

Frequently Asked Questions about Fractiles

  1. What is meant by quartiles, deciles, and percentiles?
  2. Describe the procedure for obtaining percentiles (quartiles and deciles).
  3. What is the interquartile range?
  4. Why do we need to sort the data first when computing quartiles, deciles, and percentiles?

Seasonal Variations: Estimation (2020)

We have to find a way of isolating and measuring the seasonal variations. There are two reasons for isolating and measuring the effect of seasonal variations.

  • To study the changes brought by seasons in the values of the given variable in a time series
  • To remove them from the time series in order to determine the value of the variable free of seasonal effects

When the values of a particular season are summed over several years, the irregular variations tend to cancel each other out, because they arise from independent random disturbances. If we also eliminate the effect of trend and cyclical variations, what remains are the seasonal variations, which are expressed as a percentage of their average.

A study of seasonal variation leads to more realistic planning of production, purchases, and similar activities.

Seasonal Index Method

When the effect of the trend has been eliminated, we can calculate a measure of seasonal variation known as the seasonal index. A seasonal index is simply the average of the values for a given month or quarter over different years, expressed as a percentage of the average of all the monthly or quarterly values of the year.

The following methods are used to estimate seasonal variations.

  • Average percentage method (simple average method)
  • Link relative method
  • Ratio to the trend of short-time values
  • Ratio to the trend of long-time averages projected to short times
  • Ratio to moving average

The Simple Average Method

Assume the series is expressed as

$$Y=TSCI$$

Consider the long-time averages as trend values and eliminate the trend element by expressing a short-time observed value as a percentage of the corresponding long-time average. In the multiplicative model, we obtain

\begin{align*}
\frac{\text{short time observed value}}{\text{long time average}}\times 100 &= \frac{TSCI}{T}\times 100\\
&=SCI\times 100
\end{align*}

This percentage of the long-time average represents the seasonal (S), the cyclical (C), and the irregular (I) components.

Once $SCI$ is obtained, we try to remove $CI$ as much as possible from $SCI$. This is done by arranging these percentages season-wise for all the long times (say years) and taking the modified arithmetic mean for each season by ignoring both the smallest and the largest percentages. These would be seasonal indices.

If the average of these indices is not 100, an adjustment can be made by expressing the seasonal indices as percentages of their arithmetic mean. The adjustment factor would be

\begin{align*}
\frac{100}{\text{mean of seasonal indices}} = \frac{400}{\text{sum of quarterly indices}} \,\, \text{ or } \frac{1200}{\text{sum of monthly indices}}
\end{align*}

Example of Seasonal Variations

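Question: The following data give the number of automobiles sold.

| Year | Quarter 1 | Quarter 2 | Quarter 3 | Quarter 4 |
|------|-----------|-----------|-----------|-----------|
| 1981 | 250 | 278 | 315 | 288 |
| 1982 | 247 | 265 | 301 | 285 |
| 1983 | 261 | 285 | 353 | 373 |
| 1984 | 300 | 325 | 370 | 343 |
| 1985 | 281 | 317 | 381 | 374 |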

Calculate the seasonal indices by the average percentage method.

Solution:

First, we obtain the yearly (long-term) averages

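| Year | 1981 | 1982 | 1983 | 1984 | 1985 |
|------|------|------|------|------|------|
| Year Total | 1131 | 1098 | 1272 | 1338 | 1353 |
| Yearly Average | 1131/4 = 282.75 | 274.50 | 318.00 | 334.50 | 338.25 |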

Next, we divide each quarterly value by the corresponding yearly average and express the results as percentages. That is,

| Year | Quarter 1 | Quarter 2 | Quarter 3 | Quarter 4 | Total |
|------|-----------|-----------|-----------|-----------|-------|
| 1981 | $\frac{250}{282.75}\times 100=88.42$ | $\frac{278}{282.75}\times 100=98.32^*$ | $\frac{315}{282.75}\times 100=111.41$ | $\frac{288}{282.75}\times 100=101.86^*$ | |
| 1982 | $\frac{247}{274.50}\times 100=89.98^*$ | $\frac{265}{274.50}\times 100=96.54$ | $\frac{301}{274.50}\times 100=109.65^*$ | $\frac{285}{274.50}\times 100=103.83$ | |
| 1983 | $\frac{261}{318.00}\times 100=82.08^*$ | $\frac{285}{318.00}\times 100=89.62^*$ | $\frac{353}{318.00}\times 100=111.01$ | $\frac{373}{318.00}\times 100=117.30^*$ | |
| 1984 | $\frac{300}{334.50}\times 100=89.69$ | $\frac{325}{334.50}\times 100=97.16$ | $\frac{370}{334.50}\times 100=110.61$ | $\frac{343}{334.50}\times 100=102.54$ | |
| 1985 | $\frac{281}{338.25}\times 100=83.07$ | $\frac{317}{338.25}\times 100=93.72$ | $\frac{381}{338.25}\times 100=112.64^*$ | $\frac{374}{338.25}\times 100=110.57$ | |
| Total (modified) | 261.18 | 287.42 | 333.03 | 316.94 | |
| Mean (modified) | $\frac{261.18}{3}=87.06$ | $\frac{287.42}{3}=95.81$ | $\frac{333.03}{3}=111.01$ | $\frac{316.94}{3}=105.65$ | 399.52 |

An asterisk (*) marks the smallest and largest values in each quarter; these are excluded from the modified total.
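The calculation above can be checked with a short Python sketch using only the data from the question. Variable names are illustrative. The last step applies the adjustment factor $\frac{400}{\text{sum of quarterly indices}}$ described earlier, so that the final indices sum to 400; the worked table stops before that step.

```python
# Quarterly automobile sales from the question
sales = {
    1981: [250, 278, 315, 288],
    1982: [247, 265, 301, 285],
    1983: [261, 285, 353, 373],
    1984: [300, 325, 370, 343],
    1985: [281, 317, 381, 374],
}

# Step 1: express each quarter as a percentage of its yearly average
percentages = []
for year, quarters in sales.items():
    yearly_avg = sum(quarters) / len(quarters)
    percentages.append([q / yearly_avg * 100 for q in quarters])

# Step 2: modified mean per quarter (drop the smallest and largest value)
modified_means = []
for column in zip(*percentages):
    trimmed = sorted(column)[1:-1]
    modified_means.append(sum(trimmed) / len(trimmed))

# Step 3: rescale so that the four quarterly indices sum to 400
factor = 400 / sum(modified_means)
seasonal_indices = [m * factor for m in modified_means]

for quarter, (mean, index) in enumerate(zip(modified_means, seasonal_indices), 1):
    print(f"Quarter {quarter}: modified mean {mean:.2f}, adjusted index {index:.2f}")
```

Because the modified means add up to slightly less than 400, the adjustment factor is slightly above 1 and each index is nudged upward.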

Statistical Software for Seasonal Variation

Several statistical software packages can automate these calculations for you. Popular options include:

  • Python libraries like Pandas and Statsmodels
  • R statistical computing environment
  • Excel with add-in tools like Data Analysis ToolPak

Detrending Time Series (2020)

Detrending a time series is the process of eliminating the trend component from it, where a trend refers to a change in the mean over time (a sustained decrease or increase). When data are detrended, an aspect of the data that is thought to distort or obscure the features of interest has been removed.

Assuming the multiplicative model:

$$\text{Detrended value} = \frac{Y}{T} = \frac{TSCI}{T}=SCI $$

Assuming the additive model:

$$\text{Detrended value} = Y-T=T+S+C+I-T = S+C+I$$

Detrending Time Series (Stationary Time Series)

Detrending is the process of removing the trend from a non-stationary time series. A time series with a trend is non-stationary, while a successfully detrended series oscillates about a horizontal line. If a series does not have a trend, or we remove the trend successfully, the series is said to be trend stationary.

Eliminating the trend component may be thought of as rotating the trend line to a horizontal position. The trend component can be eliminated from the observed time series by computing either the ratios to the trend if the multiplicative model is assumed or the deviations from the trend if the additive model is assumed.
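Both forms of detrending can be sketched in Python. The example below is illustrative only: the series `y` is made up, the helper `linear_trend` is hypothetical, and the trend is assumed to be a least-squares straight line. Any other trend estimate, such as a moving average, could be substituted for `trend`.

```python
def linear_trend(y):
    """Fit a least-squares straight line to y against t = 0, 1, ..., n - 1."""
    n = len(y)
    t = list(range(n))
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [intercept + slope * ti for ti in t]


# Hypothetical observations with an upward trend
y = [112, 118, 121, 130, 133, 141, 146, 152]
trend = linear_trend(y)

# Additive model: deviations from the trend, Y - T
additive = [yi - ti for yi, ti in zip(y, trend)]

# Multiplicative model: ratios to the trend, Y / T
multiplicative = [yi / ti for yi, ti in zip(y, trend)]

print([round(d, 2) for d in additive])
print([round(r, 3) for r in multiplicative])
```

The additive deviations fluctuate around 0 and the multiplicative ratios fluctuate around 1, which is the "horizontal line" described above.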

Note that the best detrending method depends on the nature of your trend:

  • Use differencing for roughly linear trends (constant increase/decrease).
  • Use model fitting for more complex trends (curves, changing slopes).
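Differencing, the first option in the list above, can be sketched as follows. The helper `difference` and both series are hypothetical and deliberately noise-free, so that the effect is easy to see: one round of differencing turns a straight-line trend into a constant, and a quadratic trend needs two rounds.

```python
def difference(y, lag=1):
    """Return the lagged differences y[t] - y[t - lag]."""
    return [y[i] - y[i - lag] for i in range(lag, len(y))]


# Hypothetical series with a constant increase of 3 per period
linear = [5 + 3 * t for t in range(8)]
first_diff = difference(linear)

# Hypothetical series with a curved (quadratic) trend
quadratic = [t ** 2 for t in range(8)]
second_diff = difference(difference(quadratic))

print(first_diff)    # constant: the linear trend is gone
print(second_diff)   # constant only after differencing twice
```

Each round of differencing shortens the series by `lag` observations, which is worth keeping in mind for short series.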

Detrending is often a preparatory step for further analysis such as forecasting and identifying seasonal patterns. On the other hand, detrending might not be necessary if the trend is already incorporated into your analysis. Some methods, like deseasonalizing, can involve both detrending and removing seasonal effects.
