Introduction - Statistics for Data Science & Analytics

Seasonal Variations: Estimation (2020)

May 17, 2024 / Nov 25, 2020 by Muhammad Imdad Ullah

We have to find a way of isolating and measuring the seasonal variations. There are two reasons for isolating and measuring the effect of seasonal variations.

To study the changes brought by seasons in the values of the given variable in a time series
To remove them from the time series in order to determine what the value of the variable would be without seasonal effects

When the values of a particular season are summed over several years, the irregular variations tend to cancel each other out because they arise from independent random disturbances. If we also eliminate the effect of trend and cyclical variations, only the seasonal variations remain, and these are expressed as a percentage of their average.

Seasonal Variations

A study of seasonal variation leads to more realistic planning of production and purchases etc.

Seasonal Index Method

When the effect of the trend has been eliminated, we can calculate a measure of seasonal variation known as the seasonal index. A seasonal index is simply the average of the monthly or quarterly values of different years, expressed as a percentage of the average of all the monthly or quarterly values of the year.

The following methods are used to estimate seasonal variations.

Average percentage method (simple average method)
Link relative method
Ratio to the trend of short-time values
Ratio to the trend of long-time averages projected to short times
Ratio to moving average

The Simple Average Method

Assume the series is expressed as

$$Y=TSCI$$

Consider the long-time averages as trend values and eliminate the trend element by expressing a short-time observed value as a percentage of the corresponding long-time average. In the multiplicative model, we obtain

\begin{align*}
\frac{\text{short time observed value}}{\text{long time average}}\times 100 &= \frac{TSCI}{T}\times 100\\
&=SCI\times 100
\end{align*}

This percentage of the long-time average represents the seasonal (S), the cyclical (C), and the irregular (I) component.
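As a small illustration of this step, the sketch below expresses each quarterly value of one year as a percentage of that year's average. The figures and the function name `percent_of_average` are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def percent_of_average(values):
    """Express each short-time value as a percentage of the long-time average."""
    long_time_average = sum(values) / len(values)
    return [100 * v / long_time_average for v in values]

# Hypothetical quarterly figures for a single year (not the article's data).
quarters = [80, 100, 120, 100]
percentages = percent_of_average(quarters)
print(percentages)  # [80.0, 100.0, 120.0, 100.0]
```

Because every value is divided by the same yearly average, the percentages of one year always sum to 100 times the number of seasons (400 for quarters).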

Once $SCI$ is obtained, we try to remove $CI$ as much as possible from $SCI$. This is done by arranging these percentages season-wise for all the long times (say years) and taking the modified arithmetic mean for each season by ignoring both the smallest and the largest percentages. These would be seasonal indices.
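The modified arithmetic mean described above can be sketched as follows. The helper name `modified_mean` and the sample percentages are assumptions made for illustration; the function drops exactly one smallest and one largest value before averaging.

```python
def modified_mean(percentages):
    """Arithmetic mean after dropping one smallest and one largest value."""
    if len(percentages) < 3:
        raise ValueError("need at least three values to drop the extremes")
    trimmed = sorted(percentages)[1:-1]
    return sum(trimmed) / len(trimmed)

# Hypothetical first-quarter percentages for five years.
q1_percentages = [90.0, 95.0, 100.0, 105.0, 130.0]
q1_index = modified_mean(q1_percentages)
print(q1_index)  # 100.0
```

Dropping the extremes makes the index less sensitive to a single unusual year, which is the sense in which $CI$ is removed "as much as possible".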

If the average of these indices is not 100, then the adjustment can be made, by expressing these seasonal indices as the percentage of their arithmetic mean. The adjustment factor would be

\begin{align*}
\frac{100}{\text{mean of seasonal indices}} = \frac{400}{\text{sum of quarterly indices}} \,\, \text{ or } \frac{1200}{\text{sum of monthly indices}}
\end{align*}
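A minimal sketch of this adjustment, assuming a hypothetical set of unadjusted quarterly indices whose sum is 398 rather than 400 (the function name `adjust_indices` is also an assumption):

```python
def adjust_indices(indices):
    """Rescale seasonal indices so that their mean is exactly 100."""
    # 100 / mean equals 400 / sum for quarters and 1200 / sum for months.
    factor = 100 * len(indices) / sum(indices)
    return [i * factor for i in indices]

# Hypothetical unadjusted quarterly indices (sum = 398).
raw = [88.0, 96.0, 110.0, 104.0]
adjusted = adjust_indices(raw)
print([round(a, 2) for a in adjusted], round(sum(adjusted), 2))
```

Each index is multiplied by the same factor, so the relative pattern between seasons is unchanged; only the overall level is shifted so that the indices average 100.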

Seasonal Variations: Objective of Time Series

Example of Seasonal Variations

Question: The following data give the number of automobiles sold in each quarter.

Year	Quarter 1	Quarter 2	Quarter 3	Quarter 4
1981	250	278	315	288
1982	247	265	301	285
1983	261	285	353	373
1984	300	325	370	343
1985	281	317	381	374

Calculate the seasonal indices by the average percentage method.

Solution:

First, we obtain the yearly (long-term) averages

Year	1981	1982	1983	1984	1985
Year Total	1131	1098	1272	1338	1353
Yearly Average	1131/4=282.75	274.50	318.00	334.50	338.25

Next, we divide each quarterly value by the corresponding yearly average and express the results as percentages. That is,

Year	Quarter 1	Quarter 2	Quarter 3	Quarter 4
1981	$\frac{250}{282.75}\times 100=88.42$	$\frac{278}{282.75}\times 100=98.32^*$	$\frac{315}{282.75}\times 100=111.41$	$\frac{288}{282.75}\times 100=101.86^*$
1982	$\frac{247}{274.50}\times 100=89.98^*$	$\frac{265}{274.50}\times 100=96.54$	$\frac{301}{274.50}\times 100=109.65^*$	$\frac{285}{274.50}\times 100=103.83$
1983	$\frac{261}{318.00}\times 100=82.08^*$	$\frac{285}{318.00}\times 100=89.62^*$	$\frac{353}{318.00}\times 100=111.01$	$\frac{373}{318.00}\times 100=117.30^*$
1984	$\frac{300}{334.50}\times 100=89.69$	$\frac{325}{334.50}\times 100=97.16$	$\frac{370}{334.50}\times 100=110.61$	$\frac{343}{334.50}\times 100=102.54$
1985	$\frac{281}{338.25}\times 100=83.07$	$\frac{317}{338.25}\times 100=93.72$	$\frac{381}{338.25}\times 100=112.64^*$	$\frac{374}{338.25}\times 100=110.57$
Total (modified)	261.18	287.42	333.03	316.94	Total
Mean (modified)	$\frac{261.18}{3}=87.06$	$\frac{287.42}{3}=95.81$	$\frac{333.03}{3}=111.01$	$\frac{316.94}{3}=105.65$	399.52

Values marked with * are the smallest and largest percentages in each quarter; they are excluded from the modified total.
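The whole calculation can be checked with a short script. The sketch below uses the automobile data from the question and should reproduce the modified means in the table to rounding; the final rescaling step, which makes the four indices sum to 400, follows the adjustment factor given earlier. Variable names are illustrative.

```python
sales = {
    1981: [250, 278, 315, 288],
    1982: [247, 265, 301, 285],
    1983: [261, 285, 353, 373],
    1984: [300, 325, 370, 343],
    1985: [281, 317, 381, 374],
}

# Step 1: each quarter as a percentage of its yearly average.
percentages = {
    year: [100 * v / (sum(row) / len(row)) for v in row]
    for year, row in sales.items()
}

# Step 2: modified mean per quarter (drop the smallest and largest percentage).
modified_means = []
for q in range(4):
    column = sorted(p[q] for p in percentages.values())
    kept = column[1:-1]
    modified_means.append(sum(kept) / len(kept))

# Step 3: rescale so that the four indices sum to 400.
factor = 400 / sum(modified_means)
seasonal_indices = [m * factor for m in modified_means]

for q, (m, s) in enumerate(zip(modified_means, seasonal_indices), start=1):
    print(f"Quarter {q}: modified mean {m:.2f}, adjusted index {s:.2f}")
```

Since the modified means sum to slightly less than 400, the factor is slightly greater than 1 and each adjusted index is marginally larger than its modified mean.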

Statistical Software for Seasonal Variation

Several statistical software packages can automate these calculations for you. Popular options include:

Python libraries like Pandas and Statsmodels
R statistical computing environment
Excel with add-in tools like Data Analysis ToolPak


Coding Time Variable (2020)

Apr 29, 2024 / Sep 2, 2020 by Muhammad Imdad Ullah

Coding Time Variable by Taking Origin at the Beginning

Suppose we have time-series data for the years 1990, 1991, 1992, 1993, and 1994.

We can take the origin of a time series at the beginning and assign $x = 0$ to the first period and $1, 2, 3, …$ to the other periods. The codes for the years 1990, 1991, 1992, 1993, and 1994 will then be $x = 0, 1, 2, 3, 4$.
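A minimal sketch of this coding, assuming equally spaced yearly data for 1990 to 1994:

```python
years = [1990, 1991, 1992, 1993, 1994]

# Origin at the beginning: the first period is coded 0, the next 1, and so on.
origin = years[0]
coded = [t - origin for t in years]
print(dict(zip(years, coded)))
# {1990: 0, 1991: 1, 1992: 2, 1993: 3, 1994: 4}
```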

Coding Time Variable by Taking Middle Years as Zero

To simplify the trend calculations, the time variable $t$ (year variable) is coded by taking deviations $t-\overline{t}$, where $\overline{t}$ is the average number computed as $\overline{t}=\frac{First\, Period + Last\, Period}{2}$. Taking $x=t-\overline{t}$ we get
$$\sum x = 0 = \sum x^3 = \sum x^5 = \cdots$$
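To see why this coding simplifies trend calculations, the sketch below checks that the odd-power sums vanish and then fits a straight-line trend $y = a + bx$ by least squares. With $\sum x = 0$, the normal equations reduce to $a = \overline{y}$ and $b = \sum xy / \sum x^2$. The observations `y` are hypothetical.

```python
years = [1990, 1991, 1992, 1993, 1994]
t_bar = (years[0] + years[-1]) / 2
x = [t - t_bar for t in years]

# Odd-power sums vanish because the coded values are symmetric about zero.
odd_sums = {p: sum(v ** p for v in x) for p in (1, 3, 5)}
print(odd_sums)  # {1: 0.0, 3: 0.0, 5: 0.0}

# Hypothetical observations, one per year.
y = [10.0, 12.0, 15.0, 19.0, 24.0]

# With sum(x) == 0 the least-squares line needs no simultaneous equations.
a = sum(y) / len(y)
b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
print(a, b)  # 16.0 3.5
```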

There are two cases when coding a Time Variable (when taking zero in the Middle):

When there is an odd number of years:
For an odd number of years (as in the period 1990 to 1994), $\overline{t}$ is the middle year: $\overline{t} = (1990+1994)/2=1992$. The code for the year $t$ is $x=t-\overline{t}$. For $t=1992$, we have $x=1992-1992=0$, so the coded year is zero at $\overline{t}$. After taking $x=0$ at the middle of an odd number of years, we assign $-1, -2, …$ to the years before the middle year and $1, 2, …$ to the years after the middle year.

Year (t)	$x=t-\overline{t}$
1990	-2
1991	-1
1992	0
1993	1
1994	2

When there is an even number of years:
Suppose we have time-series data for the years 1990, 1991, 1992, 1993, 1994, and 1995. The middle point is $\overline{t} = (1990+1995)/2 = 1992.5$, so $x=0$ falls halfway between the years 1992 and 1993. For $t=1992$, we have $x=t-\overline{t}=1992-1992.5=-0.5$. Thus, coding the middle of an even number of years as $x=0$, we assign $-0.5, -1.5, -2.5, …$ to the years before the middle point and $0.5, 1.5, 2.5, …$ to the years after it, as shown below.

Year(t)	$x=t-\overline{t}$	$x=\frac{t-\overline{t}}{1/2}$
1990	-2.5	-5
1991	-1.5	-3
1992	-0.5	-1
1993	0.5	1
1994	1.5	3
1995	2.5	5

To avoid decimals in the coded year, we can take the unit of measurement as $\frac{1}{2}$ year. Therefore, after coding $x=0$ in the middle of an even number of years, we assign $-1, -3, -5, …$ to the years before the middle point and $1, 3, 5, …$ to the years after it, as shown above.
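Both cases can be handled by one small helper. This is a sketch under the assumption that the years are consecutive and sorted; the function name `code_years` is hypothetical.

```python
def code_years(years):
    """Code consecutive years with the origin at the middle of the series."""
    t_bar = (years[0] + years[-1]) / 2
    deviations = [t - t_bar for t in years]
    if len(years) % 2 == 0:
        # Even number of years: use half-year units to avoid decimals.
        return [int(2 * d) for d in deviations]
    return [int(d) for d in deviations]

odd_codes = code_years([1990, 1991, 1992, 1993, 1994])
even_codes = code_years([1990, 1991, 1992, 1993, 1994, 1995])
print(odd_codes)   # [-2, -1, 0, 1, 2]
print(even_codes)  # [-5, -3, -1, 1, 3, 5]
```

In both cases the codes sum to zero, which is what simplifies the trend calculations.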


Multiplicative Models and Additive Models (2020)

Jun 1, 2025 / Aug 29, 2020 by Muhammad Imdad Ullah

Here we will discuss the multiplicative and additive models.

The analysis of a time series is the decomposition of a time series into its different components for their separate study. The process of analyzing a time series is to isolate and measure its various components. We try to answer the following questions when we analyze a time series.

What would have been the value of the variable at different points in time if it were influenced only by long-time movements?
What changes occur in the value of the variable due to seasonal variations?
To what extent and in what direction has the variable been affected by cyclical fluctuations?
What has been the effect of irregular variations?

The study of a time series is mainly required for estimation and forecasting. An ideal forecast should be based on forecasts of the various types of fluctuations: separate forecasts should be made of the trend, seasonal, and cyclical variations. Forecasts of irregular movements, however, are doubtful. Therefore, it is necessary to separate and measure the various types of fluctuations present in a time series.

A value of a time series variable is considered as the result of the combined impact of its components. The components of a time series follow either the multiplicative or the additive model.

For both multiplicative and additive models, let $Y$ = original observation, $T$ = trend component, $S$ = seasonal component, $C$ = cyclical component, and $I$ = irregular component.

Multiplicative Models

It is assumed that the value $Y$ of a composite series is the product of the four components. That is

$$Y = T \times S \times C \times I,$$

where $T$ is given in the original units of $Y$, while $S$, $C$, and $I$ are expressed as unit-less index numbers (percentages).

Additive Models

It is assumed that the value of $Y$ of a composite series is the sum of the four components. That is

$$Y = T + S + C + I,$$

where $T$, $S$, $C$, and $I$ all are given in the original units of $Y$.
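The difference in units between the two models is easiest to see with numbers. The component values below are hypothetical and describe a single period; in the multiplicative case the indices are written as ratios (index divided by 100).

```python
# Hypothetical trend value for one period, in the original units of Y.
trend = 200.0

# Multiplicative model: S, C, and I are unit-less ratios (index / 100).
s_ratio, c_ratio, i_ratio = 1.10, 0.95, 1.02
y_multiplicative = trend * s_ratio * c_ratio * i_ratio

# Additive model: S, C, and I are in the original units of Y.
s_units, c_units, i_units = 20.0, -10.0, 4.0
y_additive = trend + s_units + c_units + i_units

print(round(y_multiplicative, 2), y_additive)  # 213.18 214.0
```

Removing the seasonal effect therefore means dividing by $S$ in the multiplicative model and subtracting $S$ in the additive model.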

Time series analysis is the analysis of a series of data points over time, allowing one to answer a question such as what is the causal effect on a variable $Y$ of a change in variable $X$ over time? An important difference between time series and cross-section data is that the ordering of cases does matter in time series.

Multiplicative Models and Additive Model — Component of Time Series Data

Rather than dealing with individuals as units, the unit of interest is time: the value of $Y$ at time $t$ is $Y_t$. The unit of time can be anything from days to election years. The value of $Y$ in the previous period is called the first lag value, $Y_{t-1}$, and the $j$th lag is denoted $Y_{t-j}$. Similarly, $Y_{t+1}$ is the value of $Y$ in the next period. A simple bivariate regression equation for time series data therefore looks like \[Y_t = \beta_0 + \beta X_t + u_t\]
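The lag notation and the bivariate regression can be illustrated with a short sketch. The two series are hypothetical (constructed so that $Y_t = 3 + 2X_t$ exactly), and the helper name `lag` is an assumption; the slope and intercept use the usual least-squares formulas.

```python
y = [5.0, 7.0, 9.0, 11.0, 13.0]  # hypothetical Y_t
x = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical X_t

def lag(series, j):
    """Return the j-th lag; the first j periods have no lagged value."""
    return [None] * j + series[:len(series) - j]

y_lag1 = lag(y, 1)
print(y_lag1)  # [None, 5.0, 7.0, 9.0, 11.0]

# Least-squares estimates for Y_t = beta_0 + beta * X_t + u_t.
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
beta = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        / sum((xi - x_bar) ** 2 for xi in x))
beta_0 = y_bar - beta * x_bar
print(beta_0, beta)  # 3.0 2.0
```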

$Y_t$ is treated as a random variable. If $Y_t$ is generated by a regression model for time series, i.e. $Y_t=x_t\beta +\varepsilon_t$ with $E(\varepsilon_t|x_t)=0$, then ordinary least squares (OLS) provides a consistent estimate of $\beta$.


Selection between Multiplicative and Additive Models

A natural question is how to choose between multiplicative and additive models. The additive model is useful when the seasonal variation is relatively constant over time; the multiplicative model is useful when the seasonal variation increases over time.
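One informal way to apply this rule is to compare, year by year, the size of the seasonal swing with the level of the series. The sketch below is only a rough heuristic and not a standard test: the function name `suggest_model`, the use of the within-year range as the "swing", and the comparison of coefficients of variation are all assumptions made for illustration, and the data are hypothetical.

```python
from statistics import mean, pstdev

def suggest_model(series_by_year):
    """Rough heuristic: is the seasonal swing steadier in absolute or relative terms?"""
    levels = [mean(v) for v in series_by_year]
    swings = [max(v) - min(v) for v in series_by_year]
    ratios = [s / l for s, l in zip(swings, levels)]
    # Coefficient of variation of absolute swings vs. swings relative to level.
    cv_swings = pstdev(swings) / mean(swings)
    cv_ratios = pstdev(ratios) / mean(ratios)
    return "additive" if cv_swings <= cv_ratios else "multiplicative"

# Hypothetical series: same swing every year although the level rises.
constant_swing = [[90, 100, 110, 100], [190, 200, 210, 200], [290, 300, 310, 300]]
# Hypothetical series: swing grows in proportion to the level.
growing_swing = [[90, 100, 110, 100], [180, 200, 220, 200], [270, 300, 330, 300]]

print(suggest_model(constant_swing))  # additive
print(suggest_model(growing_swing))   # multiplicative
```

In practice a time plot of the series is usually inspected as well; a heuristic like this only summarizes what such a plot would show.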
