Autocorrelation in Time Series Data

The autocorrelation (serial correlation) function is a diagnostic tool that helps describe the evolution of a process through time. Inference based on the autocorrelation function is often called analysis in the time domain.

The autocorrelation of a random process measures the correlation (relationship) between observations at different distances apart. These coefficients (correlation or autocorrelation) often provide insight into the probability model that generated the data. One can say that autocorrelation is a mathematical tool for finding repeating patterns in a data series.

Autocorrelation is usually used for the following two purposes:

1. To help detect non-randomness in the data (for this, usually only the first, i.e. lag 1, autocorrelation is computed)
2. To help identify an appropriate time series model if the data are not random (for this, autocorrelations are usually plotted for many lags)

For simple correlation, suppose there are $n$ pairs of observations on two variables $x$ and $y$; the usual correlation coefficient (Pearson’s coefficient of correlation) is then

$r=\frac{\sum(x_i-\overline{x})(y_i-\overline{y})}{\sqrt{\sum (x_i-\overline{x})^2 \sum (y_i-\overline{y})^2 }}$
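As a minimal sketch of the formula above, the following computes Pearson's coefficient in plain Python. The function name `pearson_r` and the sample data are illustrative assumptions, not part of the original text.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for n paired observations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    # Numerator: sum of products of deviations from the two means.
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the two sums of squares.
    den = math.sqrt(
        sum((xi - x_bar) ** 2 for xi in x) * sum((yi - y_bar) ** 2 for yi in y)
    )
    return num / den

# Hypothetical data: y is an exact linear function of x, so r should be 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
r = pearson_r(x, y)
```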

A similar idea can be applied to a time series to see whether successive observations are correlated. Given $n$ observations $x_1, x_2, \cdots, x_n$ on a discrete time series, we can form $n-1$ pairs of observations, namely $(x_1, x_2), (x_2, x_3), \cdots, (x_{n-1}, x_n)$. In each pair, the first observation is treated as one variable ($x_t$) and the second observation as a second variable ($x_{t+1}$). The correlation coefficient between $x_t$ and $x_{t+1}$ is then

$r_1=\frac{ \sum_{t=1}^{n-1} (x_t- \overline{x}_{(1)} ) (x_{t+1}-\overline{x}_{(2)}) } {\sqrt{ [\sum_{t=1}^{n-1} (x_t-\overline{x}_{(1)})^2] [ \sum_{t=1}^{n-1} (x_{t+1}-\overline{x}_{(2)})^2 ] } }$

where

$\overline{x}_{(1)}=\sum_{t=1}^{n-1} \frac{x_t}{n-1}$ is the mean of the first $n-1$ observations

$\overline{x}_{(2)}=\sum_{t=2}^{n} \frac{x_t}{n-1}$ is the mean of the last $n-1$ observations
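The lag-1 coefficient built from the $n-1$ successive pairs, with each half of the pair centred on its own mean ($\overline{x}_{(1)}$ and $\overline{x}_{(2)}$), can be sketched as follows. The function name `lag1_autocorr_exact` and the example series are assumptions made for illustration.

```python
import math

def lag1_autocorr_exact(x):
    """Lag-1 autocorrelation as the Pearson correlation of (x_t, x_{t+1}) pairs."""
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 observations")
    first = x[:-1]   # x_1, ..., x_{n-1}
    second = x[1:]   # x_2, ..., x_n
    mean1 = sum(first) / (n - 1)   # mean of the first n-1 observations
    mean2 = sum(second) / (n - 1)  # mean of the last n-1 observations
    num = sum((a - mean1) * (b - mean2) for a, b in zip(first, second))
    den = math.sqrt(
        sum((a - mean1) ** 2 for a in first) * sum((b - mean2) ** 2 for b in second)
    )
    return num / den

# Hypothetical series: a steady linear trend, so successive values move together.
trend = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
r1_trend = lag1_autocorr_exact(trend)

# Hypothetical series: strict alternation, so successive values move oppositely.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
r1_alt = lag1_autocorr_exact(alternating)
```

In these two constructed cases the pairs are perfectly linearly related, so the coefficient reaches its extremes of $+1$ and $-1$ respectively.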

The coefficient $r_1$ is called the autocorrelation coefficient or serial correlation coefficient at lag 1. Note that the observations are assumed to be equally spaced (equi-spaced) in time.

For large $n$, $r_1$ is approximately

$r_1=\frac{\frac{\sum_{t=1}^{n-1} (x_t-\overline{x})(x_{t+1}-\overline{x}) }{n-1}}{ \frac{\sum_{t=1}^n (x_t-\overline{x})^2}{n}}$

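where $\overline{x}=\frac{1}{n}\sum_{t=1}^{n} x_t$ is the overall mean. Since the factor $\frac{n}{n-1}$ is close to 1 for large $n$, this simplifies further to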

$r_1=\frac{\sum_{t=1}^{n-1} (x_t-\overline{x})(x_{t+1}-\overline{x}) } { \sum_{t=1}^n (x_t-\overline{x})^2}$

For observations $k$ time steps apart, i.e. at lag $k$, the autocorrelation coefficient is

$r_k=\frac{\sum_{t=1}^{n-k} (x_t-\overline{x})(x_{t+k}-\overline{x}) } { \sum_{t=1}^n (x_t-\overline{x})^2}$
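A minimal sketch of the lag-$k$ formula above, using the overall mean in both numerator and denominator. The function name `autocorr` and the short example series are assumptions for illustration only.

```python
def autocorr(x, k):
    """Sample autocorrelation r_k of the series x at lag k (0 <= k < len(x))."""
    n = len(x)
    if not 0 <= k < n:
        raise ValueError("lag k must satisfy 0 <= k < len(x)")
    x_bar = sum(x) / n
    # Denominator: total sum of squared deviations over all n observations.
    den = sum((v - x_bar) ** 2 for v in x)
    if den == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Numerator: products of deviations for the n-k pairs that are k steps apart.
    num = sum((x[t] - x_bar) * (x[t + k] - x_bar) for t in range(n - k))
    return num / den

# Hypothetical series that repeats every 4 observations.
series = [2.0, 4.0, 6.0, 4.0, 2.0, 4.0, 6.0, 4.0]
acf = [autocorr(series, k) for k in range(5)]
# acf is [1.0, 0.0, -0.75, 0.0, 0.5]
```

The value at lag 4 is 0.5 rather than 1 even though the series repeats exactly, because only $n-k$ products enter the numerator while all $n$ squared deviations enter the denominator.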

An $r_k$ value lying outside the limits $\pm \frac{2}{\sqrt{n} }$ is significantly different from zero (at roughly the 5% level, assuming the data are otherwise random) and indicates autocorrelation at lag $k$.
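The $\pm \frac{2}{\sqrt{n}}$ rule can be applied lag by lag, as in the sketch below. The helper names (`autocorr`, `significance_bound`, `significant_lags`) and the sine-wave test series are assumptions for illustration; the bound itself is only an approximate guide.

```python
import math

def autocorr(x, k):
    """Sample autocorrelation r_k of the series x at lag k."""
    n = len(x)
    x_bar = sum(x) / n
    den = sum((v - x_bar) ** 2 for v in x)
    num = sum((x[t] - x_bar) * (x[t + k] - x_bar) for t in range(n - k))
    return num / den

def significance_bound(n):
    """Approximate limit 2/sqrt(n) for judging r_k against zero."""
    return 2.0 / math.sqrt(n)

def significant_lags(x, max_lag):
    """Lags 1..max_lag whose |r_k| exceeds the 2/sqrt(n) limit."""
    bound = significance_bound(len(x))
    return [k for k in range(1, max_lag + 1) if abs(autocorr(x, k)) > bound]

# Hypothetical series: a sine wave with period 20, observed 400 times.
periodic = [math.sin(2 * math.pi * t / 20) for t in range(400)]
bound = significance_bound(len(periodic))   # 2 / sqrt(400) = 0.1
flagged = significant_lags(periodic, 12)
```

For this series, lags such as 1 and 10 (half a period, strongly negative) are flagged, while lag 5 (a quarter period, where $r_k$ is close to zero) is not.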

Applications of Autocorrelation

• Autocorrelation analysis is widely used in fluorescence correlation spectroscopy.
• Autocorrelation is used to measure optical spectra and the very-short-duration light pulses produced by lasers.
• Autocorrelation is used to analyze dynamic light scattering data for the determination of the particle size distributions of nanometer-sized particles in a fluid. A laser shining into the mixture produces a speckle pattern. The autocorrelation of the signal can be analyzed in terms of the diffusion of the particles. From this, knowing the fluid viscosity, the sizes of the particles can be calculated.
• The small-angle X-ray scattering intensity of a nano-structured system is the Fourier transform of the spatial autocorrelation function of the electron density.
• In optics, normalized autocorrelations and cross-correlations give the degree of coherence of an electromagnetic field.
• In signal processing, autocorrelation can provide information about repeating events such as musical beats or pulsar frequencies, but it cannot tell the position in time of the beat. It can also be used to estimate the pitch of a musical tone.
• In music recording, autocorrelation is used as a pitch detection algorithm prior to vocal processing, as a distortion effect or to eliminate undesired mistakes and inaccuracies.
• In statistics, spatial autocorrelation between sample locations also helps one estimate mean value uncertainties when sampling a heterogeneous population.
• In astrophysics, autocorrelation is used to study and characterize the spatial distribution of galaxies in the Universe and in multi-wavelength observations of low-mass X-ray binaries.
• In analysis of Markov chain Monte Carlo data, autocorrelation must be taken into account for correct error determination.
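
The signal-processing use listed above, finding the repeat interval of a recurring event, can be sketched with the lag-$k$ coefficient $r_k$. The approach below (take the highest $r_k$ after the coefficients first turn negative) is one simple heuristic chosen for illustration, not a standard algorithm from the text; the names `autocorr` and `estimate_period` and the test signal are assumptions.

```python
import math

def autocorr(x, k):
    """Sample autocorrelation r_k of the series x at lag k."""
    n = len(x)
    x_bar = sum(x) / n
    den = sum((v - x_bar) ** 2 for v in x)
    num = sum((x[t] - x_bar) * (x[t + k] - x_bar) for t in range(n - k))
    return num / den

def estimate_period(x, max_lag):
    """Lag of the largest r_k after r_k first becomes negative, or None."""
    r = [autocorr(x, k) for k in range(max_lag + 1)]
    # Skip the initial run of positive values around lag 0, which reflects
    # smoothness of the signal rather than repetition.
    first_negative = next((k for k in range(1, max_lag + 1) if r[k] < 0), None)
    if first_negative is None:
        return None
    return max(range(first_negative, max_lag + 1), key=lambda k: r[k])

# Hypothetical signal: a pure tone repeating every 20 samples.
signal = [math.sin(2 * math.pi * t / 20) for t in range(400)]
period = estimate_period(signal, 30)   # 20 for this signal
```

As the list notes, this recovers how often the pattern repeats but not where in time each repetition starts.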