The Correlogram

A correlogram is a graph used to interpret a set of autocorrelation coefficients, in which $r_k$ is plotted against the lag $k$. It is often very helpful for visual inspection.
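As a rough illustration of what gets plotted, the sketch below computes $r_k$ and prints a crude text correlogram. It assumes the commonly used estimator $r_k = \sum_{t=1}^{N-k}(x_t-\bar{x})(x_{t+k}-\bar{x}) \big/ \sum_{t=1}^{N}(x_t-\bar{x})^2$, which is not stated in this article. The function names `sample_acf` and `text_correlogram` and the example series are hypothetical.

```python
def sample_acf(x, max_lag):
    """Return [r_0, r_1, ..., r_max_lag] for the sequence x.

    Assumed estimator: lag-k sum of cross-products of deviations from
    the mean, divided by the lag-0 sum of squared deviations.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    acf = []
    for k in range(max_lag + 1):
        ck = sum(dev[t] * dev[t + k] for t in range(n - k))
        acf.append(ck / c0)
    return acf


def text_correlogram(acf, width=20):
    """Render r_k against lag k as rows of a simple text bar chart."""
    rows = []
    for k, r in enumerate(acf):
        bar = "#" * round(abs(r) * width)  # bar length shows |r_k|
        rows.append(f"{k:>3} {r:+.3f} {bar}")
    return rows


# Hypothetical series that repeats every 4 observations.
series = [2.0, 4.0, 6.0, 4.0, 2.0, 4.0, 6.0, 4.0, 2.0, 4.0, 6.0, 4.0]
for row in text_correlogram(sample_acf(series, 6)):
    print(row)
```

In practice one would normally use a plotting library for the graph itself. The text rendering here only keeps the sketch free of dependencies.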

Some general guidelines for interpreting the correlogram are:

  • A Random Series: If a time series is completely random, then for large $N$, $r_k \cong 0$ for all non-zero values of $k$. For a random time series, $r_k$ is approximately $N\left(0, \frac{1}{N}\right)$, so 19 out of 20 of the values of $r_k$ can be expected to lie between $\pm \frac{2}{\sqrt{N}}$. However, when plotting the first 20 values of $r_k$, one can expect to find one significant value on average even when the time series is random.
  • Short-term Correlation: Stationary series often exhibit short-term correlation, characterized by a fairly large value of $r_1$ followed by 2 or 3 more coefficients that, while significantly greater than zero, tend to get successively smaller. Values of $r_k$ for larger lags tend to be approximately zero. A time series that gives rise to such a correlogram is one for which an observation above the mean tends to be followed by one or more further observations above the mean, and similarly for observations below the mean. A model called an autoregressive model may be appropriate for a series of this type.
  • Alternating Series: If a time series tends to alternate, with successive observations on different sides of the overall mean, then the correlogram also tends to alternate. The value of $r_1$ will be negative; however, the value of $r_2$ will be positive, as observations at lag 2 will tend to be on the same side of the mean.
  • Non-Stationary Series: If a time series contains a trend, then the values of $r_k$ will not come down to zero except for very large values of the lag. This is because an observation on one side of the overall mean tends to be followed by a large number of further observations on the same side of the mean because of the trend. The sample autocorrelation function $\{ r_k \}$ should only be calculated for stationary time series, so any trend should be removed before calculating $\{ r_k \}$.
  • Seasonal Fluctuations: If a time series contains a seasonal fluctuation then the correlogram will also exhibit an oscillation at the same frequency. If $x_t$ follows a sinusoidal pattern then so does $r_k$.
    $x_t = a \cos(t w)$, where $a$ is a constant and $w$ is the frequency such that $0 < w < \pi$. Therefore $r_k \cong \cos(k w)$ for large $N$.
    If the seasonal variation is removed from seasonal data then the correlogram may provide useful information.
  • Outliers: If a time series contains one or more outliers, the correlogram may be seriously affected. If there is one outlier in the time series and it is not adjusted, then the plot of $x_t$ against $x_{t+k}$ will contain two extreme points, which will tend to depress the sample correlation coefficients towards zero. If there are two outliers, this effect is more noticeable.
  • General Remarks: Experience is required to interpret autocorrelation coefficients. We need to study the probability theory of stationary series and the classes of models that may be appropriate. We also need to know the sampling properties of $r_k$.
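The patterns described above can be explored on simulated data. The following is a minimal sketch, not a definitive recipe. It assumes the standard sample autocorrelation estimator, simulates the short-term and alternating cases with first-order autoregressive series (coefficients $0.7$ and $-0.7$), and uses a 12-period cosine for the seasonal case. The series length, coefficients, trend slope, outlier size, and all names are hypothetical choices for illustration.

```python
import math
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations r_0..r_max_lag (standard estimator, assumed)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]


def ar1(phi, n, rng):
    """Simulate x_t = phi * x_{t-1} + e_t with standard normal e_t."""
    x = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0, 1))
    return x


rng = random.Random(0)     # fixed seed so the run is repeatable
n = 400
limit = 2 / math.sqrt(n)   # approximate 95% band for a random series
w = 2 * math.pi / 12       # hypothetical 12-period seasonal frequency

short_term = ar1(0.7, n, rng)
with_outlier = list(short_term)
with_outlier[n // 2] += 50.0   # one gross outlier, left unadjusted

series = {
    "random": [rng.gauss(0, 1) for _ in range(n)],
    "short-term": short_term,
    "alternating": ar1(-0.7, n, rng),
    "trend": [0.05 * t + rng.gauss(0, 1) for t in range(n)],
    "seasonal": [math.cos(t * w) for t in range(n)],
    "outlier": with_outlier,
}

results = {name: sample_acf(x, 20) for name, x in series.items()}
for name, acf in results.items():
    outside = sum(1 for r in acf[1:] if abs(r) > limit)
    head = ", ".join(f"{r:+.2f}" for r in acf[1:5])
    print(f"{name:<12} r_1..r_4 = {head}   outside band: {outside}/20")
```

Comparing the "short-term" and "outlier" rows shows how a single unadjusted extreme value pulls the coefficients towards zero. The "trend" row stays well outside the $\pm 2/\sqrt{N}$ band at every lag shown.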

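The term correlogram is also used for a plot of the correlation matrix between several variables. In that sense, there are two main types of correlograms, depending on the type of correlation being analyzed: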

  • Pearson Correlation: This is the most common type and measures linear correlations between continuous variables.
  • Spearman Rank Correlation: This is a non-parametric measure suitable for ordinal or continuous data and assesses monotonic relationships (not necessarily linear).

In summary, a correlogram is a valuable tool for exploratory data analysis. It helps us:

  • Understand the relationships between multiple variables in the data.
  • Identify potential issues with multicollinearity before building statistical models.
  • Gain insights into the underlying structure of the data.
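To make the difference between the two correlation types concrete, here is a small sketch that builds both kinds of correlation matrix, the numbers such a correlogram would display. It assumes Spearman's coefficient is computed as the Pearson correlation of average ranks. The helper names and the toy data are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for m in range(i, j + 1):
            result[order[m]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def correlation_matrix(columns, corr):
    """Pairwise correlations for a dict of equal-length columns."""
    names = list(columns)
    return {a: {b: corr(columns[a], columns[b]) for b in names} for a in names}


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
data = {
    "x": x,
    "x_cubed": [v ** 3 for v in x],        # monotonic in x but not linear
    "z": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],   # unrelated toy values with a tie
}

for label, corr in (("Pearson", pearson), ("Spearman", spearman)):
    print(label)
    for name, row in correlation_matrix(data, corr).items():
        print(f"  {name:<8}", "  ".join(f"{v:+.3f}" for v in row.values()))
```

The pair `x` and `x_cubed` has a Spearman coefficient of 1 because the relationship is perfectly monotonic, while its Pearson coefficient is below 1 because the relationship is not linear. Off-diagonal entries close to $\pm 1$ are the kind of pattern that points to possible multicollinearity.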