Secular Trend — Nonlinear Trends
When a straight line does not accurately describe the long-term movement of a time series, the data may show some curvature, and a curve can be fitted instead of a straight line.
The curves most commonly used to describe a nonlinear secular trend in a time series are:
- Exponential curve, and
- Second-degree parabola
Exponential (Nonlinear) Curve
The exponential curve describes a nonlinear trend in a time series that changes by a constant percentage rate. The equation of the curve is $\hat{y} = ab^x$.
Taking logarithms, we get the linear form $log\, \hat{y}=log\, a + (log\,b)x$
The method of least squares gives the normal equations:
\begin{align*}
\sum log\, y & = n\, log\, a + log\, b \sum x\\
\sum x\, log\, y & = log\, a \sum x + log\, b \sum x^2
\end{align*}
However, if $\sum x=0$, the normal equations become
\begin{align*}
\sum log\,y & = n\, log a\\
\sum x log\, y &= log\, b \sum x^2
\end{align*}
The values of $log\, a$ and $log\, b$ are
\begin{align*}
log\, a &=\frac{\sum log\, y}{n}\\
log\, b&= \frac{\sum x log\, y}{\sum x^2}
\end{align*}
Taking the $antilog$ of $log\, a$ and $log\, b$, we get the values of $a$ and $b$.
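When $\sum x \neq 0$, the two normal equations have to be solved simultaneously. The following is a minimal Python sketch of that general case, assuming base-10 logarithms and strictly positive observations; the function name `fit_exponential_trend` is illustrative and does not come from any library.

```python
import math

def fit_exponential_trend(x, y):
    """Fit y = a * b**x by least squares on log10(y).

    Solves the two normal equations directly, so sum(x) need not be zero.
    Assumes every y value is positive (otherwise log10 is undefined).
    """
    n = len(x)
    log_y = [math.log10(v) for v in y]
    sx = sum(x)
    sxx = sum(v * v for v in x)
    sl = sum(log_y)
    sxl = sum(xi * li for xi, li in zip(x, log_y))

    # Cramer's rule applied to
    #   sl  = n * log_a  + log_b * sx
    #   sxl = log_a * sx + log_b * sxx
    det = n * sxx - sx * sx
    log_a = (sl * sxx - sx * sxl) / det
    log_b = (n * sxl - sx * sl) / det

    # Taking antilogs recovers a and b.
    return 10 ** log_a, 10 ** log_b

# Illustrative data lying exactly on y = 2 * 3**x, with uncentred x.
a, b = fit_exponential_trend([0, 1, 2, 3], [2, 6, 18, 54])
print(round(a, 6), round(b, 6))
```

With $\sum x = 0$ the determinant reduces to $n\sum x^2$ and the two expressions collapse to the simpler formulas for $log\, a$ and $log\, b$ given above.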
Question: The population of a country (in millions) for the years 1911 to 1971, at ten-year intervals, is 5.38, 7.22, 9.64, 12.70, 17.80, 24.02, and 31.34. (i) Fit a curve of the type $\hat{y}=ab^x$ to this time series and find the trend values. (ii) Forecast the population for the year 1991.
Solution
(i) We have $\overline{t}=\frac{1911+1971}{2}=1941$. Let $x=\frac{t-\overline{t}}{10}=\frac{t-1941}{10}$, so that the coded year number $x$ is measured in units of 10 years.
Year $t$ | Population $y$ | Coded Year $x=\frac{t-1941}{10}$ | $log\, y$ | $x\, log\, y$ | $x^2$ | $\hat{y}=13.029(1.345)^x$ |
---|---|---|---|---|---|---|
1911 | 5.38 | -3 | 0.73078 | -2.19234 | 9 | 5.355 |
1921 | 7.22 | -2 | 0.85854 | -1.71708 | 4 | 7.202 |
1931 | 9.64 | -1 | 0.98408 | -0.98408 | 1 | 9.687 |
1941 | 12.70 | 0 | 1.10380 | 0 | 0 | 13.029 |
1951 | 17.80 | 1 | 1.25042 | 1.25042 | 1 | 17.524 |
1961 | 24.02 | 2 | 1.38057 | 2.76114 | 4 | 23.570 |
1971 | 31.34 | 3 | 1.49610 | 4.48830 | 9 | 31.701 |
The least-squares exponential curve is $\hat{y} = ab^x$.
Taking logarithms, $log\, \hat{y} = log\, a + (log\, b)x$.
Since $\sum x=0$, we have
\begin{align*}
log\, a &= \frac{\sum log\, y}{n} = \frac{7.80429}{7}=1.1149\\
log\, b &= \frac{\sum x log\, y}{\sum x^2} = \frac{3.60636}{28}=0.12880\\
a &= antilog(1.1149)=13.029\\
b &= antilog(0.1288)=1.345\\
\hat{y} &=13.029 (1.345)^x,\quad \text{with origin at 1941 and unit of $x$ as 10 years}
\end{align*}
(ii) For $t=1991$ we have $x=\frac{t-1941}{10}= \frac{1991-1941}{10}=5$. Putting $x=5$ in the least-squares exponential curve, we have
$\hat{y} = 13.029 (1.345)^5 \approx 57.35$ million
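The worked example can be checked with a short Python sketch. It assumes base-10 logarithms and the coding $x = (t-1941)/10$ used in the solution; all variable names are illustrative. Because the script keeps full floating-point precision instead of rounding $a$ and $b$ to three decimals, its trend values and forecast may differ slightly from the hand-computed ones.

```python
import math

years = [1911, 1921, 1931, 1941, 1951, 1961, 1971]
population = [5.38, 7.22, 9.64, 12.70, 17.80, 24.02, 31.34]  # millions

# Coded years in units of 10 years with origin at 1941, so sum(x) == 0.
x = [(t - 1941) // 10 for t in years]
log_y = [math.log10(v) for v in population]
n = len(x)

# Simplified normal equations for the case sum(x) == 0.
log_a = sum(log_y) / n
log_b = sum(xi * li for xi, li in zip(x, log_y)) / sum(xi * xi for xi in x)
a, b = 10 ** log_a, 10 ** log_b

trend = [a * b ** xi for xi in x]
forecast_1991 = a * b ** ((1991 - 1941) / 10)

print(f"a = {a:.3f}, b = {b:.3f}")
for t, value in zip(years, trend):
    print(t, round(value, 3))
print(f"forecast for 1991: {forecast_1991:.2f} million")
```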
Second Degree Parabola (Nonlinear Trend)
The second-degree parabola describes a nonlinear trend in a time series where the amount of change itself changes by a constant amount per unit of time (that is, the second differences are constant). The quadratic (parabolic) trend can be described by the equation
\begin{align*}
\hat{y} = a + bx + cx^2
\end{align*}
The method of least squares gives the normal equations as
\begin{align*}
\sum y &= na + b\sum x + c \sum x^2\\
\sum xy &= a\sum x + b\sum x^2 + c \sum x^3\\
\sum x^2y &= a \sum x^2 + b\sum x^3 + c\sum x^4
\end{align*}
However, if $\sum x = 0 = \sum x^3$, then the normal equations reduce to
\begin{align*}
\sum y &= na + c\sum x^2\\
\sum xy &= b\sum x^2\\
\sum x^2 y &= a \sum x^2 + c \sum x^4\\
& \text{the values of $a$, $b$, and $c$ can be found as}\\
c &= \frac{n \sum x^2 y - (\sum x^2)(\sum y)}{n \sum x^4 -(\sum x^2)^2}\\
a&=\frac{\sum y - c\sum x^2}{n}\\
b&= \frac{\sum xy}{\sum x^2}
\end{align*}
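The closed-form expressions for $a$, $b$, and $c$ translate directly into code. The sketch below is one possible Python rendering, assuming the coded $x$ values are integers and symmetric about zero so that $\sum x = \sum x^3 = 0$; the function name `fit_parabola_centered` is hypothetical.

```python
def fit_parabola_centered(x, y):
    """Fit y = a + b*x + c*x**2 using the simplified normal equations.

    Assumes integer x with sum(x) == 0 and sum(x**3) == 0.
    """
    if sum(x) != 0 or sum(v ** 3 for v in x) != 0:
        raise ValueError("x must be coded so that sum(x) = sum(x^3) = 0")

    n = len(x)
    sum_y = sum(y)
    sum_x2 = sum(v ** 2 for v in x)
    sum_x4 = sum(v ** 4 for v in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2y = sum(xi ** 2 * yi for xi, yi in zip(x, y))

    c = (n * sum_x2y - sum_x2 * sum_y) / (n * sum_x4 - sum_x2 ** 2)
    a = (sum_y - c * sum_x2) / n
    b = sum_xy / sum_x2
    return a, b, c

# Illustrative data lying exactly on y = 1 + 2x + 3x^2.
xs = [-2, -1, 0, 1, 2]
ys = [1 + 2 * v + 3 * v ** 2 for v in xs]
print(fit_parabola_centered(xs, ys))  # (1.0, 2.0, 3.0)
```

The explicit check on $\sum x$ and $\sum x^3$ is a design choice: the shortcut formulas silently give wrong coefficients if the coding is not symmetric, so failing loudly seems safer.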
Question: Given the following time series
Year | 1931 | 1933 | 1935 | 1937 | 1939 | 1941 | 1943 | 1945 |
---|---|---|---|---|---|---|---|---|
Price Index | 96 | 87 | 91 | 102 | 108 | 139 | 307 | 289 |
- Fit a second-degree parabola taking the origin in 1938.
- Find the trend values.
- What would have been the equation of the parabola if the origin were in 1933?
Solution
(i) The sums required by the normal equations are computed in the following table.
Year $t$ | Price index $y$ | Coded Year $x=t-1938$ | $x^2$ | $x^4$ | $xy$ | $x^2y$ | Trend values $\hat{y}=110.2+15.48x+2.01 x^2$ |
---|---|---|---|---|---|---|---|
1931 | 96 | -7 | 49 | 2401 | -672 | 4704 | 100.33 |
1933 | 87 | -5 | 25 | 625 | -435 | 2175 | 83.05 |
1935 | 91 | -3 | 9 | 81 | -273 | 819 | 81.85 |
1937 | 102 | -1 | 1 | 1 | -102 | 102 | 96.73 |
1939 | 108 | 1 | 1 | 1 | 108 | 108 | 127.69 |
1941 | 139 | 3 | 9 | 81 | 417 | 1251 | 174.73 |
1943 | 307 | 5 | 25 | 625 | 1535 | 7675 | 237.85 |
1945 | 289 | 7 | 49 | 2401 | 2023 | 14161 | 317.05 |
Total | 1219 | 0 | 168 | 6216 | 2601 | 30995 | |
Since $\sum x = 0 = \sum x^3$, the coefficients are obtained from the column totals:
\begin{align*}
\hat{y} &= a + b x + c x^2\\
c &= \frac{n\sum x^2 y-(\sum x^2)(\sum y)}{n \sum x^4 -(\sum x^2)^2} =\frac{8(30995)-(168)(1219)}{8(6216)-(168)^2}=2.01\\
a &= \frac{\sum y - c \sum x^2}{n}=\frac{1219-(2.01)(168)}{8}=110.2\\
b &= \frac{\sum xy}{\sum x^2}=\frac{2601}{168} = 15.48\\
\hat{y} &= 110.2 + 15.48x + 2.01x^2,\quad \text{with origin at the year 1938}
\end{align*}
(ii) The trend values for the different values of $x$ are given in the last column of the table above.
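As a check on the arithmetic, the column totals and coefficients of this example can be recomputed with a few lines of Python. This is only a sketch with illustrative variable names; it keeps $c$ at full precision instead of rounding it to 2.01 before computing $a$, so the printed trend values may differ slightly from the table.

```python
years = [1931, 1933, 1935, 1937, 1939, 1941, 1943, 1945]
price_index = [96, 87, 91, 102, 108, 139, 307, 289]

# Coded years with origin at 1938: -7, -5, ..., 5, 7.
x = [t - 1938 for t in years]
n = len(x)

# Column totals used by the simplified normal equations.
sum_y = sum(price_index)
sum_x2 = sum(v ** 2 for v in x)
sum_x4 = sum(v ** 4 for v in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, price_index))
sum_x2y = sum(xi ** 2 * yi for xi, yi in zip(x, price_index))

c = (n * sum_x2y - sum_x2 * sum_y) / (n * sum_x4 - sum_x2 ** 2)
a = (sum_y - c * sum_x2) / n
b = sum_xy / sum_x2

trend = [a + b * xi + c * xi ** 2 for xi in x]

print(f"a = {a:.2f}, b = {b:.2f}, c = {c:.2f}")
for t, value in zip(years, trend):
    print(t, round(value, 2))
```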
(iii) To shift the origin to 1933, replace $x$ by $(x-5)$, because 1933 lies 5 years before 1938:
\begin{align*}
\hat{y} &= 110.2 + 15.48(x-5)+2.01(x-5)^2\\
&= 110.2 + 15.48(x-5)+2.01(x^2 -10x + 25)\\
&= 110.2 + 15.48x -77.4 + 2.01x^2 - 20.1x + 50.25\\
&= 83.05 -4.62x + 2.01x^2, \quad \text{with origin at the year 1933}
\end{align*}
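The change of origin is just an expansion of $a + b(x-h) + c(x-h)^2$, which can be sketched in Python as follows. The helper name `shift_parabola_origin` is hypothetical, and $h$ is assumed to be the distance, in units of $x$, from the new origin to the old one (here $1938 - 1933 = 5$).

```python
def shift_parabola_origin(a, b, c, h):
    """Return the coefficients of a + b*(x - h) + c*(x - h)**2 expanded in x."""
    new_a = a - b * h + c * h * h   # constant term
    new_b = b - 2 * c * h           # coefficient of x
    return new_a, new_b, c          # coefficient of x**2 is unchanged

# Rounded coefficients from the worked example, origin moved from 1938 to 1933.
a2, b2, c2 = shift_parabola_origin(110.2, 15.48, 2.01, 5)
print(round(a2, 2), round(b2, 2), round(c2, 2))  # 83.05 -4.62 2.01
```

Both forms describe the same curve: for 1945 the original equation uses $x=7$ and the shifted one uses $x=12$, and each gives a trend value of 317.05.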
Merits of Least Squares
- The method of least squares gives the most satisfactory measurement of the secular trend in a time series when the distribution of the deviations is approximately normal.
- The least-squares estimates are unbiased estimates of the parameters.
- The method can be used when the trend is linear, exponential, or quadratic.
Demerits of Least Squares
- The method of least squares gives too much weight to extremely large deviations from the trend.
- The least-squares line is the best fit only for the period to which it refers.
- Eliminating or adding even a few periods may change the position of the fitted line.