# Heteroscedasticity

One of the assumptions of the classical linear regression model is that there is no heteroscedasticity, that is, the error terms have constant variance. Under this assumption the ordinary least squares (OLS) estimators are BLUE (best linear unbiased estimators): their variances are the lowest among all linear unbiased estimators (Gauss-Markov theorem). If the assumption of constant variance does not hold, the Gauss-Markov theorem no longer applies. For heteroscedastic data, regression analysis still provides unbiased estimates of the relationship between the predictors and the outcome variable, but those estimates are no longer the most efficient.

As discussed, heteroscedasticity occurs when the error term has non-constant variance. In this case, we can think of the disturbance for each observation as being drawn from a different distribution with a different variance. Stated equivalently, the variance of the observed value of the dependent variable around the regression line is non-constant. We can think of each observed value of the dependent variable as being drawn from a different conditional probability distribution with a different conditional variance. A general linear regression model with heteroscedastic errors can be expressed as follows:

\begin{align*}
y_i & = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip} + \varepsilon_i\\
Var(\varepsilon_i)&=E(\varepsilon_i^2)\\
&=\sigma_i^2; \quad i=1,2,\cdots, n
\end{align*}

Note that we have an $i$ subscript attached to sigma squared. This indicates that the disturbance for each of the $n$ units is drawn from a probability distribution that has a different variance.

If the error term has non-constant variance, but all other assumptions of the classical linear regression model are satisfied, then the consequences of using the OLS estimator to obtain estimates of the population parameters are:

• The OLS estimator is still unbiased
• The OLS estimator is inefficient; that is, it is not BLUE
• The estimated variances and covariances of the OLS estimates are biased and inconsistent
• Hypothesis tests are not valid

## Detection of Heteroscedasticity: Regression Residual Plot

The residual for the $i$th observation, $\hat{\varepsilon}_i$, is an estimate of the unknown and unobservable error for that observation, $\varepsilon_i$. Thus the squared residuals, $\hat{\varepsilon}_i^2$, can be used as an estimate of the unknown and unobservable error variance, $\sigma_i^2=E(\varepsilon_i^2)$. You can calculate the squared residuals and then plot them against an explanatory variable that you believe might be related to the error variance. If you believe that the error variance may be related to more than one of the explanatory variables, you can plot the squared residuals against each one of these variables. Alternatively, you could plot the squared residuals against the fitted value of the dependent variable obtained from the OLS estimates. Most statistical software packages have a command to produce these residual plots. It must be emphasized that this is not a formal test for heteroscedasticity; it only suggests whether heteroscedasticity may exist.
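As a rough numerical stand-in for such a plot, the following sketch computes squared OLS residuals and averages them within quintiles of the fitted values. The data are simulated under an assumed variance pattern (error standard deviation proportional to $X$); all names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
x = rng.uniform(1.0, 10.0, n)
# Assumed data-generating process: error s.d. proportional to x.
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5 * x)

# OLS fit, fitted values and squared residuals.
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef
sq_resid = (y - fitted) ** 2

# Average squared residual within each quintile of the fitted values.
order = np.argsort(fitted)
bins = np.array_split(order, 5)
bin_means = [float(sq_resid[idx].mean()) for idx in bins]

for k, m in enumerate(bin_means, start=1):
    print(f"fitted-value quintile {k}: mean squared residual = {m:.2f}")
```

A clear upward or downward trend across the quintiles is the same informal signal that a scatter plot of `sq_resid` against `fitted` would give; it is not a formal test.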

The residual plots below show three typical patterns. The first plot shows a random pattern, which indicates a good fit for a linear model. The other two plots show non-random patterns (U-shaped and inverted U), suggesting that a non-linear model would fit better than a linear regression model.

[Figures: Heteroscedasticity Regression Residual Plots 1, 2 and 3]

## Consequences of Heteroscedasticity for OLS

When heteroscedasticity is present in the data, estimates based on ordinary least squares (OLS) are subject to the following consequences:

1. We cannot apply the usual formulas for the variances of the coefficients to conduct tests of significance and construct confidence intervals.
2. If the error term ($\mu_i$) is heteroscedastic, then the OLS estimates do not have the minimum variance property in the class of unbiased estimators, i.e. they are inefficient in small samples. Furthermore, they are asymptotically inefficient.
3. The estimated coefficients remain statistically unbiased. That is, the unbiasedness property of OLS estimation is not violated by the presence of heteroscedasticity.
4. Forecasts based on a model with heteroscedasticity will be less efficient, as OLS estimation yields higher variances of the estimated coefficients.

All this means that the standard errors will be misestimated (typically underestimated) and the t-statistics and F-statistics will be inaccurate. Heteroscedasticity is caused by a number of factors, but the main cause is variables whose values differ substantially across observations. For instance, GDP will suffer from heteroscedasticity if we include large countries such as the USA and small countries such as Cuba. In this case it may be better to use GDP per person. Also note that heteroscedasticity tends to affect cross-sectional data more than time series.

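Consider the simple linear regression model (SLRM) $Y_i=\alpha+\beta X_i+\epsilon_i$.

Writing $x_i=X_i-\overline{X}$ and $y_i=Y_i-\overline{Y}$ for deviations from the sample means, the OLS estimators of $\beta$ and $\alpha$ are obtained as follows: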

\begin{align*}
\hat{\beta}&=\frac{\sum x_i y_i}{\sum x_i^2}=\frac{\sum x_i (\beta x_i +\epsilon_i)}{\sum x_i^2}\\
&=\beta\frac{\sum x_i^2}{\sum x_i^2}+\frac{\sum x_i \epsilon_i}{\sum x_i^2}\\
&=\beta + \frac{\sum x_i \epsilon_i}{\sum x_i^2}
\end{align*}

Applying expectation on both sides we get:

$E(\hat{\beta})=\beta+\frac{\sum E(x_i \epsilon_i)}{\sum x_i^2}=\beta, \qquad \mbox{since } E(\epsilon_i x_i)=0$

Similarly

\begin{align*}\hat{\alpha}&=\overline{Y}-\hat{\beta}\overline{X}\\
&=\alpha+\beta\overline{X}+\overline{\epsilon}-\hat{\beta}\overline{X}\\
E(\hat{\alpha})&=\alpha+\beta\overline{X}+0-\overline{X}\beta=\alpha
\end{align*}

Hence, the unbiasedness property of OLS estimation is not affected by heteroscedasticity.
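The algebra in this derivation can be checked numerically on a single simulated sample. The sketch below assumes hypothetical parameter values and a hypothetical heteroscedastic error pattern; it verifies that the slope estimate equals $\beta$ plus the weighted error term exactly, and that the intercept estimate satisfies the corresponding identity.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
alpha, beta = 1.0, 2.0                      # assumed true parameters
X = rng.uniform(1.0, 10.0, n)
eps = rng.normal(0.0, 0.3 * X)              # assumed heteroscedastic errors
Y = alpha + beta * X + eps

# Deviations from sample means.
x = X - X.mean()
y = Y - Y.mean()

# OLS estimators in deviation form.
beta_hat = np.sum(x * y) / np.sum(x**2)
alpha_hat = Y.mean() - beta_hat * X.mean()

# Decompositions used in the derivation.
beta_decomp = beta + np.sum(x * eps) / np.sum(x**2)
alpha_decomp = alpha + beta * X.mean() + eps.mean() - beta_hat * X.mean()

print(np.isclose(beta_hat, beta_decomp), np.isclose(alpha_hat, alpha_decomp))
```

Because the error term enters both decompositions linearly and has zero mean, taking expectations leaves only the true parameters, whatever the individual variances are.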


## White test for Heteroskedasticity detection

One important assumption of regression is that the variance of the error term is constant across observations. If the errors have constant variance, they are called homoscedastic; otherwise they are heteroscedastic. With heteroscedastic errors (non-constant variance), the standard estimation methods become inefficient. Typically, residuals are plotted to assess the assumption of homoscedasticity.

## White’s test for Heteroskedasticity

Halbert White (1980) proposed a test that is very similar to the Breusch-Pagan test. White's test for heteroskedasticity is general because it does not rely on the normality assumption, and it is also easy to implement. Because of its generality, White's test may also pick up specification bias. Both White's test and the Breusch-Pagan test are based on the residuals of the fitted model.

To test the assumption of homoscedasticity, one can use an auxiliary regression: regress the squared residuals from the original model on the set of original regressors, their cross-products, and their squares.

The step-by-step procedure to perform White's test for heteroskedasticity is as follows:

Consider the following linear regression model (assume there are two independent variables)
$Y_i=\beta_0+\beta_1X_{1i}+\beta_2X_{2i}+e_i \tag{1}$

For the given data, estimate the regression model and obtain the residuals $e_i$.

1. Run the auxiliary regression of the squared residuals from the original regression on the original independent variables, their squares, and their cross-product(s):
$e_i^2=\gamma_0+\gamma_1X_{1i}+\gamma_2X_{2i}+\gamma_3X_{1i}^2+\gamma_4X_{2i}^2+\gamma_5X_{1i}X_{2i}+v_i \tag{2}$
2. Find the $R^2$ statistic from the auxiliary regression in step 1.
You can also use higher powers of the regressors, such as cubes. Note that equation (2) includes a constant term whether or not the original regression model (1) has one.
3. Test the statistical significance of $n \times R^2\sim\chi^2_{df} \tag{3}$ under the null hypothesis of homoscedasticity (no heteroscedasticity), where $df$ is the number of regressors in equation (2), excluding the constant.
4. If the calculated chi-square value obtained in (3) is greater than the critical chi-square value at the chosen level of significance, reject the hypothesis of homoscedasticity in favour of heteroscedasticity.
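The steps above can be sketched in a few lines of NumPy. This is a minimal illustration for the two-regressor case, not a reference implementation: the function name `white_test`, the simulated data, and the variance pattern are hypothetical, and the 5% critical value of the chi-square distribution with 5 degrees of freedom (about 11.07) is hard-coded rather than computed.

```python
import numpy as np

def white_test(y, x1, x2):
    """Return (n * R^2, df) for White's test with two regressors."""
    n = len(y)
    # Original regression and its squared residuals.
    X = np.column_stack([np.ones(n), x1, x2])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e2 = (y - X @ b) ** 2
    # Auxiliary regression: levels, squares and the cross-product.
    Z = np.column_stack([np.ones(n), x1, x2, x1**2, x2**2, x1 * x2])
    g, *_ = np.linalg.lstsq(Z, e2, rcond=None)
    ss_res = np.sum((e2 - Z @ g) ** 2)
    ss_tot = np.sum((e2 - e2.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    df = Z.shape[1] - 1          # regressors in the auxiliary model, excluding the constant
    return n * r2, df

# Simulated example (assumed): error s.d. proportional to x1.
rng = np.random.default_rng(3)
n = 500
x1 = rng.uniform(1.0, 10.0, n)
x2 = rng.uniform(1.0, 10.0, n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(0.0, 0.5 * x1)

stat, df = white_test(y, x1, x2)
CRIT_5PCT_DF5 = 11.07            # chi-square critical value, 5 df, 5% level
reject = stat > CRIT_5PCT_DF5
print(f"n*R^2 = {stat:.2f}, df = {df}, reject homoscedasticity: {reject}")
```

With more regressors the auxiliary design matrix grows quickly, which is the degrees-of-freedom cost mentioned in the discussion of this test.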

Note that the regression of residuals can take linear or non-linear functional form.

For a model with several independent variables (regressors), introducing all the regressors, their squares or higher-order terms, and their cross-products consumes degrees of freedom.

When the White test statistic is statistically significant, heteroscedasticity is not necessarily the cause; specification errors may be responsible. In other words, the White test can be a test of heteroscedasticity, of specification error, or of both. If no cross-product terms are introduced in the White test procedure, it is a test of pure heteroscedasticity. If cross-product terms are introduced in the model, it is a test of both heteroscedasticity and specification bias.

### References

• H. White (1980), “A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity”, Econometrica, Vol. 48, pp. 817-838.
• https://en.wikipedia.org/wiki/White_test

# Remedial Measures for Heteroscedasticity

The OLS estimators remain unbiased and consistent in the presence of heteroscedasticity, but they are no longer efficient, not even asymptotically. This lack of efficiency makes the usual hypothesis testing procedures of dubious value, so remedial measures may be called for. There are two approaches to remedial measures for heteroscedasticity, depending on whether the error variances $\sigma_i^2$ are known.

## (i) When $\sigma_i^2$ is known

Consider the simple linear regression model $Y_i=\alpha+\beta X_i+\mu_i$.

If $V(\mu_i)=\sigma_i^2$, then heteroscedasticity is present. Given the values of $\sigma_i^2$, heteroscedasticity can be corrected by using weighted least squares (WLS), a special case of generalized least squares (GLS). Weighted least squares is the OLS method of estimation applied to the transformed model.

When heteroscedasticity is detected by an appropriate statistical test, the appropriate solution is to transform the original model in such a way that the transformed disturbance term has constant variance. Here the transformation amounts to an adjustment of the original data: dividing the model through by $\sigma_i$ gives $\frac{Y_i}{\sigma_i}=\alpha\frac{1}{\sigma_i}+\beta\frac{X_i}{\sigma_i}+\frac{\mu_i}{\sigma_i}$. The transformed error term $\mu_i^*=\frac{\mu_i}{\sigma_i}$ has a constant variance, i.e. it is homoscedastic. Mathematically

\begin{eqnarray*}
V(\mu_i^*)&=&V\left(\frac{\mu_i}{\sigma_i}\right)\\
&=&\frac{1}{\sigma_i^2}Var(\mu_i)\\
&=&\frac{1}{\sigma_i^2}\sigma_i^2=1
\end{eqnarray*}

This approach is of limited use because the individual error variances are not always known a priori. When there is substantial sample information, reasonable guesses of the true error variances can be made and used for $\sigma_i^2$.
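A minimal sketch of weighted least squares with known error standard deviations follows. It assumes, purely for illustration, that $\sigma_i = 0.5 X_i$ is known exactly; the data and parameter values are simulated and hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
alpha, beta = 1.0, 2.0                 # assumed true parameters
X = rng.uniform(1.0, 10.0, n)
sigma = 0.5 * X                        # assumed known error s.d. for each observation
Y = alpha + beta * X + rng.normal(0.0, sigma)

# Divide every term of the model by sigma_i; the transformed model has
# regressors 1/sigma_i and X_i/sigma_i and no separate intercept.
Z = np.column_stack([1.0 / sigma, X / sigma])
coef_wls, *_ = np.linalg.lstsq(Z, Y / sigma, rcond=None)
alpha_wls, beta_wls = coef_wls

# Residuals of the transformed model should have variance close to 1.
resid_star = Y / sigma - Z @ coef_wls
var_star = resid_star.var(ddof=2)

print(f"alpha_wls = {alpha_wls:.3f}, beta_wls = {beta_wls:.3f}")
print(f"variance of transformed residuals = {var_star:.3f}")
```

Running OLS on the transformed variables is equivalent to minimising the sum of squared residuals weighted by $1/\sigma_i^2$, which is why the method is called weighted least squares.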

## (ii) When $\sigma_i^2$ is unknown

If $\sigma_i^2$ is not known a priori, then heteroscedasticity is corrected by hypothesizing a relationship between the error variance and one of the explanatory variables. There can be several versions of the hypothesized relationship. Suppose the hypothesized relationship is $Var(\mu_i)=\sigma^2 X_i^2$ (error variance is proportional to $X_i^2$). For this hypothesized relationship we use the following transformation to correct for heteroscedasticity in the simple linear regression model $Y_i=\alpha+\beta X_i+\mu_i$:
\begin{eqnarray*}
\frac{Y_i}{X_i}&=&\frac{\alpha}{X_i}+\beta+\frac{\mu_i}{X_i}\\
\Rightarrow \quad Y_i^*&=&\beta +\alpha X_i^*+\mu_i^*\\
\mbox{where } Y_i^*&=&\frac{Y_i}{X_i}, \quad X_i^*=\frac{1}{X_i} \mbox{ and } \mu_i^*=\frac{\mu_i}{X_i}
\end{eqnarray*}

OLS estimation of the transformed model above yields efficient parameter estimates because the $\mu_i^*$ have constant variance, i.e.

\begin{eqnarray*}
V(\mu_i^*)&=&V(\frac{\mu_i}{X_i})\\
&=&\frac{1}{X_i^2} V(\mu_i)\\
&=&\frac{1}{X_i^2}\sigma^2X_i^2\\
&=&\sigma^2=\mbox{ Constant}
\end{eqnarray*}
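The transformation can be tried on simulated data. The sketch below assumes the hypothesized relationship actually holds (error standard deviation $0.5 X_i$) and uses hypothetical parameter values. Note that in the transformed regression the intercept estimates $\beta$ and the coefficient on $1/X_i$ estimates $\alpha$.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300
alpha, beta = 1.0, 2.0                         # assumed true parameters
X = rng.uniform(1.0, 10.0, n)
Y = alpha + beta * X + rng.normal(0.0, 0.5 * X)  # Var(mu_i) proportional to X_i^2

def ols(design, response):
    """Least-squares coefficients and residuals."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef, response - design @ coef

# Original model: Y on (1, X).
_, e = ols(np.column_stack([np.ones(n), X]), Y)

# Transformed model: Y/X on (1, 1/X).
coef_star, e_star = ols(np.column_stack([np.ones(n), 1.0 / X]), Y / X)
beta_hat, alpha_hat = coef_star    # intercept -> beta, slope on 1/X -> alpha

# Informal check: squared residuals vs X before and after the transformation.
corr_orig = np.corrcoef(e**2, X)[0, 1]
corr_star = np.corrcoef(e_star**2, X)[0, 1]

print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
print(f"corr(e^2, X) original = {corr_orig:.3f}, transformed = {corr_star:.3f}")
```

If the hypothesized variance relationship is wrong, the transformed errors will generally still be heteroscedastic, so the residuals of the transformed model are worth inspecting as well.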

For correction of heteroscedasticity some other hypothesized relations are

• Error variance is proportional to $X_i$ (square root transformation), i.e. $E(\mu_i^2)=\sigma^2X_i$
The transformed model is
$\frac{Y_i}{\sqrt{X_i}}=\frac{\alpha}{\sqrt{X_i}}+\beta\sqrt{X_i}+\frac{\mu_i}{\sqrt{X_i}}$
The transformed model has no intercept term, so we have to use regression through the origin to estimate $\alpha$ and $\beta$. To recover the original model, multiply the transformed model by $\sqrt{X_i}$.
• Error variance is proportional to the square of the mean value of Y, i.e. $E(\mu_i^2)=\sigma^2[E(Y_i)]^2$
Here the variance of $\mu_i$ is proportional to the square of the expected value of Y, and $E(Y_i)=\alpha+\beta X_i$.
The transformed model will be
$\frac{Y_i}{E(Y_i)}=\frac{\alpha}{E(Y_i)}+\beta\frac{X_i}{E(Y_i)}+\frac{\mu_i}{E(Y_i)}$
This transformation is not operational because $E(Y_i)$ depends upon $\alpha$ and $\beta$, which are unknown parameters. However, $\hat{Y}_i=\hat{\alpha}+\hat{\beta}X_i$ is an estimator of $E(Y_i)$, so we proceed in two steps:

1. Run the usual OLS regression, disregarding the heteroscedasticity problem, and obtain $\hat{Y}_i$.
2. Transform the model using the estimated $\hat{Y}_i$, i.e. $\frac{Y_i}{\hat{Y}_i}=\alpha\frac{1}{\hat{Y}_i}+\beta\frac{X_i}{\hat{Y}_i}+\frac{\mu_i}{\hat{Y}_i}$, and run the regression on the transformed model.

This transformation gives satisfactory results only if the sample size is reasonably large.

• Log transformation such as $\ln Y_i=\alpha+\beta \ln X_i+\mu_i$
Log transformation compresses the scales in which the variables are measured. However, this transformation is not applicable if some of the Y and X values are zero or negative.
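The two-step procedure described above for the case where the error variance is proportional to $[E(Y_i)]^2$ can be sketched as follows. The simulated data, the parameter values, and the proportionality constant are hypothetical, and the example is set up so that all fitted values are positive.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500
alpha, beta = 5.0, 2.0                 # assumed true parameters
X = rng.uniform(1.0, 10.0, n)
mean_Y = alpha + beta * X
# Assumed pattern: error s.d. proportional to E(Y_i).
Y = mean_Y + rng.normal(0.0, 0.2 * mean_Y)

# Step 1: ordinary OLS, ignoring heteroscedasticity, to obtain fitted values.
D = np.column_stack([np.ones(n), X])
b_ols, *_ = np.linalg.lstsq(D, Y, rcond=None)
Y_hat = D @ b_ols

# Step 2: divide the model through by the fitted values and re-estimate.
# The transformed regressors are 1/Y_hat and X/Y_hat (no separate intercept).
D_star = D / Y_hat[:, None]
b_two_step, *_ = np.linalg.lstsq(D_star, Y / Y_hat, rcond=None)
alpha_hat, beta_hat = b_two_step

print(f"OLS      : alpha = {b_ols[0]:.3f}, beta = {b_ols[1]:.3f}")
print(f"two-step : alpha = {alpha_hat:.3f}, beta = {beta_hat:.3f}")
```

Because the weights are themselves estimated, the procedure relies on the first-step fitted values being reasonably accurate, which is why a reasonably large sample is needed.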

## Homoscedasticity: Assumption of constant variance of random variable μ (error term)

The assumption about the random variable $\mu$ (error term) is that its probability distribution remains the same for all observations of X and, in particular, that the variance of each $\mu_i$ is the same for all values of the explanatory variables, i.e. the variance of the errors is the same across all levels of the independent variables. Symbolically it can be represented as

$Var(\mu_i) = E\{\mu_i - E(\mu_i)\}^2 = E(\mu_i^2) = \sigma_\mu^2 = \mbox{constant}$

This assumption is known as the assumption of homoscedasticity, or the assumption of constant variance of the error terms $\mu_i$. It means that the variation of each $\mu_i$ around its zero mean does not depend on the values of the independent variable X. The assumption can fail in practice because the error term expresses the influence on the dependent variable of

• Errors in measurement
Errors of measurement tend to be cumulative over time. It is also difficult to collect the data and check its consistency and reliability, so the variance of $\mu_i$ increases with increasing values of X.
• Omitted variables
Variables omitted from the function (regression model) tend to change in the same direction as X, causing an increase in the variance of the observations around the regression line.

Under homoscedasticity, the variance of each $\mu_i$ remains the same irrespective of small or large values of the explanatory variable, i.e. $\sigma_\mu^2$ is not a function of $X_i$: $\sigma_{\mu_i}^2 \ne f(X_i)$.

## Consequences if Homoscedasticity is not Met

If the assumption of homoscedastic disturbances (constant variance) is not fulfilled, we have the following consequences:

1. We cannot apply the usual formulas for the variances of the coefficients to conduct tests of significance and construct confidence intervals. The formulas $Var(\hat{\beta}_0)=\sigma_\mu^2 \left\{\frac{\sum X^2}{n \sum x^2}\right\}$ and $Var(\hat{\beta}_1) = \sigma_\mu^2 \left\{\frac{1}{\sum x^2}\right\}$, where $x=X-\overline{X}$, assume a constant $\sigma_\mu^2$, so tests based on them are inapplicable.
2. If $\mu$ (the error term) is heteroscedastic, the OLS (ordinary least squares) estimates do not have the minimum variance property in the class of unbiased estimators, i.e. they are inefficient in small samples. Furthermore, they are inefficient in large samples (asymptotically inefficient).
3. The coefficient estimates would still be statistically unbiased even if the $\mu_i$ are heteroscedastic. The $\hat{\beta}$'s will have no statistical bias, i.e. $E(\hat{\beta}_i)=\beta_i$ (the coefficients' expected values will be equal to the true parameter values).
4. The prediction would be inefficient, because the variance of the prediction includes the variance of $\mu$ and of the parameter estimates, which are not minimal due to the incidence of heteroscedasticity. That is, the prediction of Y for a given value of X, based on the estimates $\hat{\beta}$ from the original data, would have a high variance.
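To see why the usual variance formula for the slope breaks down, it can be compared with the variance the OLS slope actually has when the error variances differ. The sketch below relies on a standard result that is not stated in the text above: with $x_i = X_i - \overline{X}$, the variance of the OLS slope is $\sum x_i^2\sigma_i^2 / (\sum x_i^2)^2$. The variance pattern and the helper name `slope_variances` are hypothetical, and the "conventional" value uses the average error variance as a stand-in for $\sigma_\mu^2$.

```python
import numpy as np

def slope_variances(X, sigma2_i):
    """Actual variance of the OLS slope vs. the constant-variance formula."""
    x = X - X.mean()
    sxx = np.sum(x**2)
    actual = np.sum(x**2 * sigma2_i) / sxx**2     # allows a different variance per observation
    conventional = np.mean(sigma2_i) / sxx        # sigma^2 / sum(x^2) with an average sigma^2
    return actual, conventional

X = np.linspace(1.0, 10.0, 100)

# Heteroscedastic case (assumed pattern: error s.d. = 0.1 * X^2).
actual_het, conv_het = slope_variances(X, (0.1 * X**2) ** 2)

# Homoscedastic case: every observation has the same variance.
actual_hom, conv_hom = slope_variances(X, np.full_like(X, 4.0))

print(f"heteroscedastic: actual = {actual_het:.5f}, conventional = {conv_het:.5f}")
print(f"homoscedastic  : actual = {actual_hom:.5f}, conventional = {conv_hom:.5f}")
```

The two expressions agree only when all $\sigma_i^2$ are equal, which is the situation the usual significance tests take for granted.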

## Tests for Homoscedasticity

Some tests commonly used for testing the assumption of homoscedasticity are:

• Spearman Rank-Correlation test
• Goldfeld and Quandt test
• Glejser test
• Breusch–Pagan test
• Bartlett’s test of Homoscedasticity

Reference:
A. Koutsoyiannis (1972). “Theory of Econometrics”. 2nd Ed.