### Secular Trend: Nonlinear Trends

When a straight line does not accurately describe the long-term movement of a time series, the data usually show some curvature, and a curve is fitted instead of a straight line.

The curves most commonly used to describe a nonlinear secular trend in a time series are:

- Exponential curve, and
- Second-degree parabola

### Exponential (Nonlinear) Curve

The exponential curve describes a nonlinear trend in a time series that changes at a constant percentage rate. The equation of the curve is $\hat{y} = ab^x$.

Taking logarithms of both sides, we get the linear form $\log \hat{y} = \log a + (\log b)x$.

The method of least squares gives the normal equations:

\begin{align*}
\sum \log y &= n \log a + \log b \sum x\\
\sum x \log y &= \log a \sum x + \log b \sum x^2
\end{align*}

However, if $\sum x=0$, the normal equations become

\begin{align*}
\sum \log y &= n \log a\\
\sum x \log y &= \log b \sum x^2
\end{align*}

The values of $\log a$ and $\log b$ are

\begin{align*}
\log a &= \frac{\sum \log y}{n}\\
\log b &= \frac{\sum x \log y}{\sum x^2}
\end{align*}

Taking antilogarithms of $\log a$ and $\log b$, we get the values of $a$ and $b$.
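The procedure above can be sketched in Python. This is only an illustration: the function name `fit_exponential_trend` is an assumption, not something from the text, and it solves the general pair of normal equations, so it also covers the case where $\sum x \neq 0$. Logarithms are taken to base 10, and every $y$ must be positive.

```python
import math


def fit_exponential_trend(x, y):
    """Fit y_hat = a * b**x by least squares on log10(y).

    Hypothetical helper: solves the two normal equations
        sum(log y)   = n*log(a)      + log(b)*sum(x)
        sum(x log y) = log(a)*sum(x) + log(b)*sum(x^2)
    """
    n = len(x)
    log_y = [math.log10(v) for v in y]  # requires every y > 0
    sx = sum(x)
    sxx = sum(v * v for v in x)
    sl = sum(log_y)
    sxl = sum(xi * li for xi, li in zip(x, log_y))

    # Cramer's rule on the 2x2 system; with sum(x) == 0 this reduces to
    # log a = sl / n and log b = sxl / sxx.
    det = n * sxx - sx * sx
    log_a = (sl * sxx - sx * sxl) / det
    log_b = (n * sxl - sx * sl) / det
    return 10 ** log_a, 10 ** log_b


# Synthetic check: data generated from y = 2 * 3**x.
a, b = fit_exponential_trend([0, 1, 2, 3], [2, 6, 18, 54])
print(a, b)  # both should be very close to 2 and 3
```

Because the synthetic data lie exactly on an exponential curve, the fitted values of `a` and `b` recover the generating constants up to floating-point error.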

**Question**: The population of a country (in millions) for the years 1911 to 1971, at ten-year intervals, is 5.38, 7.22, 9.64, 12.70, 17.80, 24.02, and 31.34. **(i)** Fit a curve of the type $\hat{y}=ab^x$ to this time series and find the trend values. **(ii)** Forecast the population for the year 1991.

**Solution**

**(i)** We have $\overline{t}=\frac{1911+1971}{2}=1941$. Let $x=\frac{t-\overline{t}}{10}=\frac{t-1941}{10}$, so that the coded year number $x$ is measured in units of 10 years.

Year $t$ | Population $y$ | Coded year $x=\frac{t-1941}{10}$ | $\log y$ | $x \log y$ | $x^2$ | $\hat{y}=13.029(1.345)^x$ |
---|---|---|---|---|---|---|
1911 | 5.38 | -3 | 0.73078 | -2.19234 | 9 | 5.355 |
1921 | 7.22 | -2 | 0.85854 | -1.71708 | 4 | 7.202 |
1931 | 9.64 | -1 | 0.98408 | -0.98408 | 1 | 9.687 |
1941 | 12.70 | 0 | 1.10380 | 0 | 0 | 13.029 |
1951 | 17.80 | 1 | 1.25042 | 1.25042 | 1 | 17.524 |
1961 | 24.02 | 2 | 1.38057 | 2.76114 | 4 | 23.570 |
1971 | 31.34 | 3 | 1.49610 | 4.48830 | 9 | 31.701 |
Total | 108.10 | 0 | 7.80429 | 3.60636 | 28 | |

The least squares exponential curve is $\hat{y} = ab^x$.

Taking logarithms, $\log \hat{y} = \log a + (\log b)x$.

Since $\sum x=0$, we have

\begin{align*}
\log a &= \frac{\sum \log y}{n} = \frac{7.80429}{7}=1.1149\\
\log b &= \frac{\sum x \log y}{\sum x^2} = \frac{3.60636}{28}=0.12880\\
a &= \operatorname{antilog}(1.1149)=13.029\\
b &= \operatorname{antilog}(0.1288)=1.345\\
\hat{y} &=13.029 (1.345)^x,\quad \text{with origin at 1941 and unit of $x$ as 10 years}
\end{align*}

**(ii)** For $t=1991$ we have $x=\frac{t-1941}{10}= \frac{1991-1941}{10}=5$. Putting $x=5$ in the least squares exponential curve, we have

$\hat{y} = 13.029 (1.345)^5 = 57.348$ million
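The worked solution can be checked with a short Python sketch. It uses the population figures from the question and the shortcut formulas for $\sum x = 0$; the variable names are illustrative. Because the code keeps the coefficients unrounded, its trend values and forecast may differ slightly in the later decimals from the hand calculation, which rounds $a$ and $b$ first.

```python
import math

years = [1911, 1921, 1931, 1941, 1951, 1961, 1971]
population = [5.38, 7.22, 9.64, 12.70, 17.80, 24.02, 31.34]  # millions

# Coded year: origin at 1941, one unit = 10 years, so sum(x) == 0.
x = [(t - 1941) // 10 for t in years]
log_y = [math.log10(p) for p in population]

# Shortcut formulas valid because sum(x) == 0.
log_a = sum(log_y) / len(x)
log_b = sum(xi * li for xi, li in zip(x, log_y)) / sum(xi * xi for xi in x)
a, b = 10 ** log_a, 10 ** log_b

# Trend values for the observed years and the forecast for 1991 (x = 5).
trend = [a * b ** xi for xi in x]
x_1991 = (1991 - 1941) / 10
forecast_1991 = a * b ** x_1991

print(f"a = {a:.3f}, b = {b:.3f}")
print("trend:", [round(v, 2) for v in trend])
print(f"forecast for 1991: {forecast_1991:.2f} million")
```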

### Second Degree Parabola (Nonlinear Trend)

It describes a nonlinear trend in a time series in which the change in the amount of change (the second difference) is constant per unit of time. The quadratic (parabolic) trend is described by the equation

\begin{align*}
\hat{y} = a + bx + cx^2
\end{align*}

The method of least squares gives the normal equations as

\begin{align*}
\sum y &= na + b\sum x + c \sum x^2\\
\sum xy &= a\sum x + b\sum x^2 + c \sum x^3\\
\sum x^2y &= a \sum x^2 + b\sum x^3 + c\sum x^4
\end{align*}

However, if $\sum x = 0 = \sum x^3$, the normal equations reduce to

\begin{align*}
\sum y &= na + c\sum x^2\\
\sum xy &= b\sum x^2\\
\sum x^2 y &= a \sum x^2 + c \sum x^4
\end{align*}

The values of $a$, $b$, and $c$ can then be found as

\begin{align*}
c &= \frac{n \sum x^2 y - (\sum x^2)(\sum y)}{n \sum x^4 -(\sum x^2)^2}\\
a &= \frac{\sum y - c\sum x^2}{n}\\
b &= \frac{\sum xy}{\sum x^2}
\end{align*}
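As a rough illustration of these formulas, the Python sketch below fits a parabola to coded values of $x$ that are symmetric about zero. The function name `fit_parabola_centered` is hypothetical, and the check on $\sum x$ and $\sum x^3$ is an added safeguard rather than part of the method as stated.

```python
def fit_parabola_centered(x, y):
    """Fit y_hat = a + b*x + c*x**2, assuming sum(x) == 0 and sum(x**3) == 0."""
    if sum(x) != 0 or sum(v ** 3 for v in x) != 0:
        raise ValueError("x must be coded so that sum(x) and sum(x^3) are zero")

    n = len(x)
    sx2 = sum(v ** 2 for v in x)
    sx4 = sum(v ** 4 for v in x)
    sy = sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sx2y = sum(xi ** 2 * yi for xi, yi in zip(x, y))

    # Reduced normal equations solved for c, then a, then b.
    c = (n * sx2y - sx2 * sy) / (n * sx4 - sx2 ** 2)
    a = (sy - c * sx2) / n
    b = sxy / sx2
    return a, b, c


# Synthetic check: data generated from y = 1 + 2x + 3x^2.
x = [-2, -1, 0, 1, 2]
y = [1 + 2 * xi + 3 * xi ** 2 for xi in x]
a, b, c = fit_parabola_centered(x, y)
print(a, b, c)  # should recover values close to 1, 2 and 3
```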

**Question**: Given the following time series:

Year | 1931 | 1933 | 1935 | 1937 | 1939 | 1941 | 1943 | 1945 |
---|---|---|---|---|---|---|---|---|
Price Index | 96 | 87 | 91 | 102 | 108 | 139 | 307 | 289 |

- Fit a second-degree parabola, taking the origin at 1938.
- Find the trend values.
- What would the equation of the parabola have been if the origin were at 1933?

**Solution**

**(i)**

Year $t$ | Price index $y$ | Coded year $x=t-1938$ | $x^2$ | $x^4$ | $xy$ | $x^2y$ | Trend values $\hat{y}=110.2+15.48x+2.01 x^2$ |
---|---|---|---|---|---|---|---|
1931 | 96 | -7 | 49 | 2401 | -672 | 4704 | 100.33 |
1933 | 87 | -5 | 25 | 625 | -435 | 2175 | 83.05 |
1935 | 91 | -3 | 9 | 81 | -273 | 819 | 81.85 |
1937 | 102 | -1 | 1 | 1 | -102 | 102 | 96.73 |
1939 | 108 | 1 | 1 | 1 | 108 | 108 | 127.69 |
1941 | 139 | 3 | 9 | 81 | 417 | 1251 | 174.73 |
1943 | 307 | 5 | 25 | 625 | 1535 | 7675 | 237.85 |
1945 | 289 | 7 | 49 | 2401 | 2023 | 14161 | 317.05 |
Total | 1219 | 0 | 168 | 6216 | 2601 | 30995 | |

**(ii)** The trend values are given in the last column of the table above. The coefficients of the fitted parabola are computed as follows:

\begin{align*}
\hat{y} &= a + b x + c x^2\\
c &= \frac{n\sum x^2 y-(\sum x^2)(\sum y)}{n \sum x^4 -(\sum x^2)^2} =\frac{8(30995)-(168)(1219)}{8(6216)-(168)^2}=2.01\\
a &= \frac{\sum y - c \sum x^2}{n}=\frac{1219-(2.01)(168)}{8}=110.2\\
b &= \frac{\sum xy}{\sum x^2}=\frac{2601}{168} = 15.48\\
\hat{y} &= 110.2 + 15.48x + 2.01x^2,\quad \text{with origin at the year 1938}
\end{align*}

Substituting each coded value of $x$ into this equation gives the trend values shown in the last column of the table.
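The arithmetic for the price index series can be reproduced with the following Python sketch, which applies the reduced formulas to the data from the question. Variable names are illustrative. The coefficients are kept unrounded, so the printed trend values may differ from the table in the first or second decimal place.

```python
years = [1931, 1933, 1935, 1937, 1939, 1941, 1943, 1945]
price_index = [96, 87, 91, 102, 108, 139, 307, 289]

# Coded year with origin at 1938; the values are symmetric about zero.
x = [t - 1938 for t in years]
n = len(x)

sx2 = sum(v ** 2 for v in x)
sx4 = sum(v ** 4 for v in x)
sy = sum(price_index)
sxy = sum(xi * yi for xi, yi in zip(x, price_index))
sx2y = sum(xi ** 2 * yi for xi, yi in zip(x, price_index))

# Reduced normal equations (valid because sum(x) and sum(x^3) are zero).
c = (n * sx2y - sx2 * sy) / (n * sx4 - sx2 ** 2)
a = (sy - c * sx2) / n
b = sxy / sx2

trend = [a + b * xi + c * xi ** 2 for xi in x]

print(f"a = {a:.2f}, b = {b:.2f}, c = {c:.2f}")
for t, y_hat in zip(years, trend):
    print(t, round(y_hat, 2))
```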

**(iii)** To shift the origin to 1933, which lies 5 years before 1938, replace $x$ by $(x-5)$:

\begin{align*}
\hat{y} &= 110.2 + 15.48(x-5)+2.01(x-5)^2\\
&= 110.2 + 15.48(x-5)+2.01(x^2 -10x + 25)\\
&= 110.2 + 15.48x -77.4 + 2.01x^2 - 20.1x + 50.25\\
&= 83.05 -4.62x + 2.01x^2, \quad \text{with origin at the year 1933}
\end{align*}
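The change of origin amounts to expanding $a + b(x-h) + c(x-h)^2$ in powers of $x$. A small Python sketch of that expansion follows; the helper name `shift_origin` and the parameter `h` (the number of time units by which the origin moves back) are assumptions made for illustration.

```python
def shift_origin(a, b, c, h):
    """Return the coefficients of a + b*(x - h) + c*(x - h)**2 in powers of x."""
    new_a = a - b * h + c * h ** 2
    new_b = b - 2 * c * h
    return new_a, new_b, c


# Move the origin from 1938 back to 1933 (h = 5 years).
a2, b2, c2 = shift_origin(110.2, 15.48, 2.01, 5)
print(round(a2, 2), round(b2, 2), round(c2, 2))

# Both forms should give the same trend value for 1945:
# x = 7 from the 1938 origin, x = 12 from the 1933 origin.
old_value = 110.2 + 15.48 * 7 + 2.01 * 7 ** 2
new_value = a2 + b2 * 12 + c2 * 12 ** 2
print(round(old_value, 2), round(new_value, 2))
```

The two printed trend values agree because shifting the origin only relabels the time axis and does not change the fitted curve.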

### Merits of Least Squares

- The method of least squares gives the most satisfactory measurement of the secular trend in a time series when the distribution of the deviations is approximately normal.
- The least-squares estimates are unbiased estimates of the parameters.
- The method can be used when the trend is linear, exponential, or quadratic.

### Demerits of Least Squares

- The method of least squares gives too much weight to extremely large deviations from the trend.
- The least-squares line is the best fit only for the period to which it refers.
- Removing or adding even a few periods may change its position.
