This post is about autocorrelation in time series data. The autocorrelation function (also called the serial correlation function) is a diagnostic tool that helps describe the evolution of a process through time. Inference based on the autocorrelation function is often called analysis in the time domain.
Autocorrelation of a random process measures the correlation between observations that are a given distance (lag) apart. These autocorrelation coefficients often provide insight into the probability model that generated the data. In this sense, autocorrelation is a mathematical tool for finding repeating patterns in a data series.
Purpose of Detecting Autocorrelation
Detecting autocorrelation in time series data usually serves the following two purposes:
- To detect non-randomness in the data (usually only the first, i.e., lag 1, autocorrelation is examined)
- To identify an appropriate time series model if the data are not random (the autocorrelation is usually plotted for many lags)
Autocorrelation Formula
For simple correlation, suppose there are $n$ pairs of observations on two variables $x$ and $y$; then the usual correlation coefficient (Pearson’s coefficient of correlation) is
\[r=\frac{\sum(x_i-\overline{x})(y_i-\overline{y})}{\sqrt{\sum (x_i-\overline{x})^2 \sum (y_i-\overline{y})^2 }}\]
A similar idea can be used in time series to see whether successive observations are correlated. Given $n$ observations $x_1, x_2, \cdots, x_n$ on a discrete time series, we can form $n-1$ pairs of observations, namely $(x_1, x_2), (x_2, x_3), \cdots, (x_{n-1}, x_n)$. In each pair, the first observation is treated as one variable ($x_t$) and the second observation as the other variable ($x_{t+1}$). The correlation coefficient between $x_t$ and $x_{t+1}$ is then
\[r_1=\frac{ \sum_{t=1}^{n-1} (x_t- \overline{x}_{(1)} ) (x_{t+1}-\overline{x}_{(2)}) } {\sqrt{ [\sum_{t=1}^{n-1} (x_t-\overline{x}_{(1)})^2] [ \sum_{t=1}^{n-1} (x_{t+1}-\overline{x}_{(2)})^2 ] } }\]
where
$\overline{x}_{(1)}=\sum_{t=1}^{n-1} \frac{x_t}{n-1}$ is the mean of the first $n-1$ observations
$\overline{x}_{(2)}=\sum_{t=2}^{n} \frac{x_t}{n-1}$ is the mean of the last $n-1$ observations
The coefficient $r_1$ is called the autocorrelation or serial correlation coefficient at lag 1. Note that the observations are assumed to be equally spaced (equi-spaced) in time. For large $n$, $r_1$ is approximately
\[r_1=\frac{\frac{\sum_{t=1}^{n-1} (x_t-\overline{x})(x_{t+1}-\overline{x}) }{n-1}}{ \frac{\sum_{t=1}^n (x_t-\overline{x})^2}{n}}\]
or, dropping the factor $\frac{n}{n-1}$, which is close to 1 for large $n$,
\[r_1=\frac{\sum_{t=1}^{n-1} (x_t-\overline{x})(x_{t+1}-\overline{x}) } { \sum_{t=1}^n (x_t-\overline{x})^2}\]
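As a rough check on the large-$n$ approximation, the following Python sketch computes $r_1$ both ways: once as the Pearson correlation of the $n-1$ successive pairs, and once with the simplified formula that uses the overall mean. The function names `lag1_exact` and `lag1_approx` are illustrative choices, not part of any library, and the sine series is made-up demonstration data.

```python
import math


def lag1_exact(x):
    """Pearson correlation between (x_1..x_{n-1}) and (x_2..x_n)."""
    n = len(x)
    first, second = x[:-1], x[1:]
    mean1 = sum(first) / (n - 1)   # mean of the first n-1 observations
    mean2 = sum(second) / (n - 1)  # mean of the last n-1 observations
    num = sum((a - mean1) * (b - mean2) for a, b in zip(first, second))
    den = math.sqrt(
        sum((a - mean1) ** 2 for a in first)
        * sum((b - mean2) ** 2 for b in second)
    )
    return num / den


def lag1_approx(x):
    """Simplified lag-1 coefficient using the overall mean of the series."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


# Made-up data: a short series and a long, smooth one.
short_series = [1, 2, 3, 4, 5]
long_series = [math.sin(0.3 * t) for t in range(2000)]

for name, series in [("short", short_series), ("long", long_series)]:
    print(name, lag1_exact(series), lag1_approx(series))
```

For the five-point series 1, 2, 3, 4, 5 the successive pairs lie exactly on a line, so the exact coefficient is 1, while the simplified formula gives $4/10 = 0.4$. The two versions agree closely only when $n$ is large, which is the condition stated above.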
For observations $k$ time steps apart, i.e., at lag $k$, the autocorrelation coefficient is
\[r_k=\frac{\sum_{t=1}^{n-k} (x_t-\overline{x})(x_{t+k}-\overline{x}) } { \sum_{t=1}^n (x_t-\overline{x})^2}\]
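The lag-$k$ formula translates almost directly into code. Below is a minimal Python sketch using only the standard library; the function name `autocorr` and the sample values are assumptions made for illustration.

```python
def autocorr(x, k):
    """Sample autocorrelation r_k of the sequence x at lag k."""
    n = len(x)
    if not 0 <= k < n:
        raise ValueError("lag k must satisfy 0 <= k < len(x)")
    mean = sum(x) / n
    den = sum((v - mean) ** 2 for v in x)
    if den == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Python indices start at 0, so t runs over 0..n-k-1 instead of 1..n-k.
    num = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    return num / den


# Made-up sample values, used only to exercise the function.
series = [3.1, 2.9, 3.4, 3.8, 3.6, 3.0, 2.7, 3.2, 3.9, 3.7]
for lag in range(4):
    print(lag, autocorr(series, lag))
```

At lag 0 the numerator and denominator coincide, so $r_0 = 1$ for any non-constant series.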
An $r_k$ value lying outside $\pm \frac{2}{\sqrt{n} }$ is significantly different from zero (at approximately the 5% level) and indicates autocorrelation at lag $k$.
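The $\pm 2/\sqrt{n}$ rule can be applied lag by lag. The Python sketch below is one way to do this under the assumptions already stated (equally spaced observations, overall-mean formula for $r_k$); the names `autocorr` and `significant_lags` are hypothetical helpers, and the repeating test pattern is made-up data.

```python
import math


def autocorr(x, k):
    """Sample autocorrelation r_k using the overall mean of x."""
    n = len(x)
    mean = sum(x) / n
    den = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    return num / den


def significant_lags(x, max_lag):
    """Return the lags 1..max_lag whose |r_k| exceeds 2 / sqrt(n)."""
    bound = 2 / math.sqrt(len(x))
    return [k for k in range(1, max_lag + 1) if abs(autocorr(x, k)) > bound]


# Made-up series that repeats every 4 steps: 1, 1, -1, -1, 1, 1, ...
pattern = [1, 1, -1, -1] * 25
print(significant_lags(pattern, 4))
```

For this pattern $n = 100$, so the bound is $0.2$. The coefficients at lags 1 and 3 are about $\pm 0.01$ and fall inside the bound, while lags 2 and 4 (about $-0.98$ and $0.96$) fall well outside it, so only lags 2 and 4 are flagged.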
Applications of Autocorrelation in Time Series
Autocorrelation has several applications in time series data and related fields. Some of them are described below.
- Autocorrelation analysis is widely used in fluorescence correlation spectroscopy.
- Autocorrelation is used to measure optical spectra and the very short-duration light pulses produced by lasers.
- Autocorrelation is used to analyze dynamic light scattering data to determine the particle size distributions of nanometer-sized particles in a fluid. A laser shining into the mixture produces a speckle pattern. The autocorrelation of the signal can be analyzed in terms of the diffusion of the particles. From this, knowing the fluid viscosity, the sizes of the particles can be calculated.
- The small-angle X-ray scattering intensity of a nano-structured system is the Fourier transform of the spatial autocorrelation function of the electron density.
- In optics, normalized autocorrelations and cross-correlations give the degree of coherence of an electromagnetic field.
- In signal processing, autocorrelation can provide information about repeating events such as musical beats or pulsar frequencies, but it cannot tell the position in time of the beat. It can also be used to estimate the pitch of a musical tone.
- In music recording, autocorrelation is used as a pitch detection algorithm before vocal processing, as a distortion effect or to eliminate undesired mistakes and inaccuracies.
- In statistics, spatial autocorrelation between sample locations also helps one estimate mean value uncertainties when sampling a heterogeneous population.
- In astrophysics, autocorrelation is used to study and characterize the spatial distribution of galaxies in the Universe and in multi-wavelength observations of low-mass X-ray binaries.
- In an analysis of Markov chain Monte Carlo data, autocorrelation must be taken into account for correct error determination.
Real-Life Examples of Autocorrelation in Time Series
The following are a few real-life examples of autocorrelation in time series.
- Stock Market Prices: Due to market trends, investor sentiment, and momentum trading, today’s stock price is often influenced by yesterday’s price. Autocorrelation arises because prices rarely move completely at random; instead, they tend to follow short-term trends (momentum) or mean-reverting patterns.
- Sales & Demand Forecasting: Retail sales in December are influenced by previous months due to holiday trends. Seasonal effects (e.g., Eid sales) and consumer habits create dependencies over time, and autocorrelation helps reveal these relationships.
- Heart Rate Monitoring (Biomedical Data): A person’s heart rate at any moment is influenced by their heart rate a few seconds earlier. Physiological signals are rarely random; they follow rhythmic patterns.
- Economic Indicators (GDP, Unemployment, Inflation): A country’s GDP growth in one quarter often depends on the previous quarter’s performance. Economic systems have inertia; booms and recessions do not change abruptly.
- Website Traffic: The number of visitors to an e-commerce site at 8 PM may resemble traffic at 8 PM the previous day. User behavior follows daily/weekly cycles (e.g., more traffic at night or on weekends).
- Temperature Data: In weather data, today’s temperature is highly correlated with the temperature yesterday or over the past few days. Here, autocorrelation helps determine whether weather patterns persist due to seasonal cycles and atmospheric conditions.
- Electricity Consumption: Power usage at 3 PM today is likely similar to 3 PM yesterday due to daily routines (e.g., office hours, AC usage). Human activity follows cyclical patterns (daily, weekly, seasonal).
- Traffic Flow & Congestion: Highway traffic at 5 PM today is likely similar to 5 PM yesterday due to rush-hour patterns. Commuter behavior is repetitive and time-dependent.
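Several of the examples above (website traffic, electricity consumption, highway traffic) describe a daily cycle. The Python sketch below illustrates how such a cycle shows up in the autocorrelation coefficients. It uses an idealized, noise-free hourly series generated by the hypothetical helper `daily_series`; real data would be noisier and the coefficients correspondingly smaller in magnitude.

```python
import math


def autocorr(x, k):
    """Sample autocorrelation r_k using the overall mean of x."""
    n = len(x)
    mean = sum(x) / n
    den = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    return num / den


def daily_series(days):
    """Synthetic hourly values with a 24-hour sinusoidal cycle (made-up data)."""
    return [100 + 30 * math.sin(2 * math.pi * t / 24) for t in range(24 * days)]


hourly = daily_series(14)  # two weeks of hourly observations
for lag in (12, 24, 36, 48):
    print(lag, round(autocorr(hourly, lag), 3))
```

In this idealized series the coefficients at lags 24 and 48 (same hour on a previous day) are strongly positive, while those at lags 12 and 36 (opposite point of the daily cycle) are strongly negative. The magnitudes shrink slowly as the lag grows because fewer pairs enter the numerator of $r_k$.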
Why Does Autocorrelation Matter?
- Forecasting: Autocorrelation helps models like ARIMA predict future values based on past trends.
- Model Accuracy: Ignoring autocorrelation can lead to biased estimates in regression models.
- Signal Processing: Used in noise reduction and pattern detection (e.g., ECG analysis).
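To make the forecasting point concrete, here is a minimal Python sketch of a one-step-ahead forecast from a first-order autoregressive, AR(1), model, which is a simple special case of ARIMA. It assumes the series is stationary and uses the lag-1 autocorrelation $r_1$ as the estimate of the AR(1) coefficient, so the forecast is $\overline{x} + r_1 (x_n - \overline{x})$. The function names and sample values are illustrative only; a full ARIMA fit would normally be done with a dedicated statistics package.

```python
def lag1_autocorr(x):
    """Lag-1 sample autocorrelation using the overall mean of x."""
    n = len(x)
    mean = sum(x) / n
    den = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    return num / den


def ar1_forecast(x):
    """One-step-ahead AR(1) forecast: mean + r_1 * (last value - mean)."""
    mean = sum(x) / len(x)
    r1 = lag1_autocorr(x)
    return mean + r1 * (x[-1] - mean)


# Made-up observations, used only to exercise the function.
observations = [10.2, 10.8, 11.1, 10.9, 10.4, 10.0, 10.3, 10.9]
print(ar1_forecast(observations))
```

When $r_1$ is close to zero the forecast falls back to the series mean, and when $r_1$ is close to 1 it stays near the most recent observation, which is how past values carry over into the prediction.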
Further Reading: Autocorrelation in time series
FAQs about Autocorrelation in Time Series
- What is the use and application of autocorrelation?
- Give some real-life examples and applications of autocorrelation in Time Series Data.
- What is the purpose of the Autocorrelation analysis?
- Why does autocorrelation in time series matter? Discuss in detail.