# Autocorrelation in Time Series Data (2015)

This post is about autocorrelation in time series data. The autocorrelation function (also called the serial correlation function) is a diagnostic tool that helps describe the evolution of a process through time. Inference based on the autocorrelation function is often called analysis in the time domain.

The autocorrelation of a random process measures the correlation between observations separated by a given distance (lag) in time. These autocorrelation coefficients often provide insight into the probability model that generated the data. One can say that autocorrelation is a mathematical tool for finding repeating patterns in a data series.

The detection of autocorrelation in time series data is usually used for the following two purposes:

1. To detect non-randomness in the data (usually only the first, i.e. lag 1, autocorrelation is computed)
2. To help identify an appropriate time series model if the data are not random (the autocorrelation is usually plotted for many lags)

For simple correlation, suppose there are $n$ pairs of observations on two variables $x$ and $y$. The usual correlation coefficient (Pearson’s coefficient of correlation) is

$r=\frac{\sum(x_i-\overline{x})(y_i-\overline{y})}{\sqrt{\sum (x_i-\overline{x})^2 \sum (y_i-\overline{y})^2 }}$

A similar idea can be used in time series to see whether successive observations are correlated or not. Given $n$ observations $x_1, x_2, \cdots, x_n$ on a discrete time series, we can form $n-1$ pairs of observations: $(x_1, x_2), (x_2, x_3), \cdots, (x_{n-1}, x_n)$. In each pair, the first observation is treated as one variable ($x_t$) and the second observation as the second variable ($x_{t+1}$). The correlation coefficient between $x_t$ and $x_{t+1}$ is then

$r_1=\frac{ \sum_{t=1}^{n-1} (x_t- \overline{x}_{(1)} ) (x_{t+1}-\overline{x}_{(2)}) } {\sqrt{ [\sum_{t=1}^{n-1} (x_t-\overline{x}_{(1)})^2] [ \sum_{t=1}^{n-1} (x_{t+1}-\overline{x}_{(2)})^2 ] } }$

where

$\overline{x}_{(1)}=\sum_{t=1}^{n-1} \frac{x_t}{n-1}$ is the mean of first $n-1$ observations

$\overline{x}_{(2)}=\sum_{t=2}^{n} \frac{x_t}{n-1}$ is the mean of last $n-1$ observations
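As a minimal sketch, the lag-1 coefficient can be computed as the Pearson correlation between the first $n-1$ observations and the last $n-1$ observations, each centered at its own mean. The function name `lag1_pairwise` and the toy series are illustrative assumptions, not part of the original post.

```python
import numpy as np

def lag1_pairwise(x):
    """Lag-1 autocorrelation as the Pearson correlation of (x_t, x_{t+1}) pairs."""
    x = np.asarray(x, dtype=float)
    first, second = x[:-1], x[1:]   # x_1..x_{n-1} and x_2..x_n
    d1 = first - first.mean()       # deviations from the mean of the first n-1 values
    d2 = second - second.mean()     # deviations from the mean of the last n-1 values
    return np.sum(d1 * d2) / np.sqrt(np.sum(d1 ** 2) * np.sum(d2 ** 2))

# Toy series (hypothetical data): a straight line, so each value
# predicts the next one perfectly.
series = [1.0, 2.0, 3.0, 4.0, 5.0]
r1 = lag1_pairwise(series)

# Cross-check against NumPy's built-in Pearson correlation.
r1_check = np.corrcoef(series[:-1], series[1:])[0, 1]
```

For this perfectly linear toy series the pairs $(x_t, x_{t+1})$ lie on a straight line, so the coefficient equals 1.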

The coefficient $r_1$ is called the autocorrelation coefficient or serial correlation coefficient at lag 1. Note that the observations are assumed to be equally spaced in time.

For large $n$, $r_1$ is approximately

$r_1=\frac{\frac{\sum_{t=1}^{n-1} (x_t-\overline{x})(x_{t+1}-\overline{x}) }{n-1}}{ \frac{\sum_{t=1}^n (x_t-\overline{x})^2}{n}}$

or, dropping the factor $\frac{n}{n-1}$, which is close to one for large $n$,

$r_1=\frac{\sum_{t=1}^{n-1} (x_t-\overline{x})(x_{t+1}-\overline{x}) } { \sum_{t=1}^n (x_t-\overline{x})^2}$

For observations $k$ steps apart, i.e., at lag $k$, the autocorrelation coefficient is

$r_k=\frac{\sum_{t=1}^{n-k} (x_t-\overline{x})(x_{t+k}-\overline{x}) } { \sum_{t=1}^n (x_t-\overline{x})^2}$
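The lag-$k$ formula translates almost directly into code. The following is a sketch using NumPy; the function name `autocorr` and the toy series are assumptions chosen for illustration.

```python
import numpy as np

def autocorr(x, k):
    """Sample autocorrelation r_k using the overall mean of the series."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 0 <= k < n:
        raise ValueError("lag k must satisfy 0 <= k < n")
    d = x - x.mean()                    # deviations from the overall mean
    numerator = np.sum(d[:n - k] * d[k:])   # sum over t = 1..n-k
    denominator = np.sum(d ** 2)            # sum over t = 1..n
    return numerator / denominator

# Toy series (hypothetical data)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
r0 = autocorr(x, 0)   # always 1 by construction
r1 = autocorr(x, 1)
r2 = autocorr(x, 2)
```

With this toy series the deviations are $-2, -1, 0, 1, 2$, so $r_1 = 4/10 = 0.4$ and $r_2 = -1/10 = -0.1$. For a series this short the large-$n$ approximation is rough; it is used here only to make the arithmetic easy to follow by hand.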

An $r_k$ value outside the range $\pm \frac{2}{\sqrt{n} }$ is significantly different from zero (at roughly the 5% level) and indicates autocorrelation.
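The $\pm 2/\sqrt{n}$ rule of thumb can be applied lag by lag. The sketch below flags every lag whose coefficient falls outside the bounds; the helper names `autocorr` and `significant_lags` and the sine-wave test series are assumptions made for illustration.

```python
import numpy as np

def autocorr(x, k):
    """Sample autocorrelation r_k using the overall mean of the series."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return np.sum(d[:x.size - k] * d[k:]) / np.sum(d ** 2)

def significant_lags(x, max_lag):
    """Return the bound 2/sqrt(n) and a dict {lag: r_k} of lags outside it."""
    n = len(x)
    bound = 2.0 / np.sqrt(n)
    flagged = {}
    for k in range(1, max_lag + 1):
        r = autocorr(x, k)
        if abs(r) > bound:
            flagged[k] = r
    return bound, flagged

# Hypothetical example: a pure sine wave with a period of 20 samples.
# Neighbouring values are strongly related, so lag 1 should be flagged,
# while lag 5 (a quarter period) is close to zero.
t = np.arange(200)
wave = np.sin(2 * np.pi * t / 20)
bound, flagged = significant_lags(wave, max_lag=10)
```

For a purely random series, roughly one lag in twenty would be expected to fall outside the bounds by chance, so a single marginal exceedance among many lags is weak evidence on its own.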

### Applications of Autocorrelation in Time Series

Autocorrelation has applications in many fields. Some of them are described below.

• Autocorrelation analysis is widely used in fluorescence correlation spectroscopy.
• Autocorrelation is used to measure optical spectra and the very short-duration light pulses produced by lasers.
• Autocorrelation is used to analyze dynamic light scattering data for the determination of the particle size distributions of nanometer-sized particles in a fluid. A laser shining into the mixture produces a speckle pattern. The autocorrelation of the signal can be analyzed in terms of the diffusion of the particles. From this, knowing the fluid viscosity, the sizes of the particles can be calculated.
• The small-angle X-ray scattering intensity of a nano-structured system is the Fourier transform of the spatial autocorrelation function of the electron density.
• In optics, normalized autocorrelations and cross-correlations give the degree of coherence of an electromagnetic field.
• In signal processing, autocorrelation can provide information about repeating events such as musical beats or pulsar frequencies, but it cannot tell the position in time of the beat. It can also be used to estimate the pitch of a musical tone.
• In music recording, autocorrelation is used as a pitch detection algorithm before vocal processing, as a distortion effect or to eliminate undesired mistakes and inaccuracies.
• In statistics, spatial autocorrelation between sample locations also helps one estimate mean value uncertainties when sampling a heterogeneous population.
• In astrophysics, autocorrelation is used to study and characterize the spatial distribution of galaxies in the Universe and in multi-wavelength observations of Low Mass X-ray Binaries.
• In an analysis of Markov chain Monte Carlo data, autocorrelation must be taken into account for correct error determination.
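To illustrate the signal-processing use mentioned in the list above (finding a repeating event without locating it in time), here is a rough sketch that estimates the period of a signal from its autocorrelation. The helper names `acf` and `estimate_period`, the noise-free sine signal, and the simple "first peak after the first negative value" heuristic are all assumptions for illustration; practical pitch detectors are considerably more careful.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelations r_0..r_max_lag using the overall mean."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.sum(d ** 2)
    return np.array([np.sum(d[:x.size - k] * d[k:]) / denom
                     for k in range(max_lag + 1)])

def estimate_period(x, max_lag):
    """Estimate the period as the lag with the largest r_k after r_k first turns negative."""
    r = acf(x, max_lag)
    negative = np.flatnonzero(r < 0)
    if negative.size == 0:
        return None                 # no sign change found, so no period estimate
    start = negative[0]             # skip the trivial peak around lag 0
    return int(start + np.argmax(r[start:]))

# Hypothetical signal: a clean sine wave repeating every 25 samples.
t = np.arange(500)
signal = np.sin(2 * np.pi * t / 25)
period = estimate_period(signal, max_lag=40)
```

The estimate recovers the repetition length in samples, but it says nothing about where in the series each cycle starts, which matches the limitation noted in the list.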

Further Reading: Autocorrelation in time series