# Heteroscedasticity in Regression

The term heteroscedasticity refers to the violation of the homoscedasticity assumption of linear regression models (LRM). Under heteroscedasticity, the errors have unequal variances at different levels of the regressors. The OLS estimators of the regression coefficients remain unbiased but are no longer efficient, and the usual estimators of their variances are biased. In the Classical Linear Regression Model (CLRM), the disturbances appearing in the population regression function are assumed to be homoscedastic; that is, they all have the same variance.
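To make the definition concrete, the following minimal sketch simulates a two-variable regression whose error standard deviation grows with the regressor and compares the residual variance for low and high values of $X$. The coefficient values, the sample size, and the variance pattern `0.3 * X` are illustrative assumptions, not part of the original discussion.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

n = 2000
X = rng.uniform(1.0, 10.0, size=n)
beta1, beta2 = 2.0, 0.5           # hypothetical true coefficients

# Heteroscedastic errors: the standard deviation is assumed proportional to X
sigma_i = 0.3 * X
u = rng.normal(0.0, sigma_i)
Y = beta1 + beta2 * X + u

# OLS fit of Y on a constant and X (polyfit returns slope first, then intercept)
b2_hat, b1_hat = np.polyfit(X, Y, deg=1)
resid = Y - (b1_hat + b2_hat * X)

# Compare the residual variance in the lower and upper halves of X
low = X < np.median(X)
var_low = resid[low].var()
var_high = resid[~low].var()
print(f"residual variance, low X:  {var_low:.3f}")
print(f"residual variance, high X: {var_high:.3f}")
```

Under this assumed variance pattern, the residuals are visibly more spread out for large $X$, which is the unequal-variance behaviour described above.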

Mathematical proof that $E(\hat{\sigma}^2)\ne \sigma^2$ in the presence of heteroscedasticity.

For the proof of $E(\hat{\sigma}^2)\ne \sigma^2$, consider the two-variable linear regression model in the presence of heteroscedasticity,

\begin{align}
Y_i &= \beta_1 + \beta_2 X_i + u_i \quad\quad (eq1)
\end{align}

where $Var(u_i)=\sigma_i^2$ (the case of heteroscedasticity). The usual estimator of the error variance can be written as

\begin{align}
\hat{\sigma}^2 &= \frac{\sum \hat{u}_i^2 }{n-2}\\
&= \frac{\sum (Y_i - \hat{Y}_i)^2 }{n-2}\\
&=\frac{\sum (\beta_1 + \beta_2 X_i + u_i - \hat{\beta}_1 -\hat{\beta}_2 X_i )^2}{n-2}\\
&=\frac{\sum \left( -(\hat{\beta}_1-\beta_1) - (\hat{\beta}_2 - \beta_2)X_i + u_i \right)^2 }{n-2}\quad\quad (eq2)
\end{align}

Noting that the OLS residuals sum to zero,

\begin{align*}
\sum (Y_i-\hat{Y}_i)&=0\\
\sum \left(\beta_1 + \beta_2 X_i + u_i - \hat{\beta}_1 - \hat{\beta}_2 X_i\right) &=0\\
-n(\hat{\beta}_1 -\beta_1) - (\hat{\beta}_2-\beta_2)\sum X_i + \sum u_i & =0\\
n(\hat{\beta}_1 -\beta_1) &= -(\hat{\beta}_2-\beta_2)\sum X_i + \sum u_i\\
\text{Dividing both sides by } n&\\
(\hat{\beta}_1 - \beta_1) &= -(\hat{\beta}_2-\beta_2)\overline{X}+\overline{u}
\end{align*}
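The identity $(\hat{\beta}_1 - \beta_1) = -(\hat{\beta}_2-\beta_2)\overline{X}+\overline{u}$ holds exactly in any sample, which can be checked numerically. The sketch below uses simulated data; the true coefficients and the error variance pattern are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 50
X = rng.uniform(0.0, 10.0, size=n)
beta1, beta2 = 1.0, 2.0                 # hypothetical true coefficients
u = rng.normal(0.0, 0.2 + 0.1 * X)      # assumed heteroscedastic disturbances
Y = beta1 + beta2 * X + u

# OLS estimates in deviation form: x_i = X_i - mean(X)
x = X - X.mean()
b2_hat = (x * Y).sum() / (x ** 2).sum()
b1_hat = Y.mean() - b2_hat * X.mean()

# Residuals sum to zero, which is the starting point of the derivation
resid = Y - b1_hat - b2_hat * X
print("sum of residuals:", resid.sum())

# Both sides of the identity for the intercept error
lhs = b1_hat - beta1
rhs = -(b2_hat - beta2) * X.mean() + u.mean()
print("lhs:", lhs, "rhs:", rhs)
```

The two sides agree up to floating-point rounding, whatever the variance pattern of the disturbances, because the identity is purely algebraic.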

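Substituting this into (eq2) and taking expectations on both sides:

\begin{align}
\hat{\sigma}^2 &= \frac{1}{n-2} \sum \left[ -\left(-(\hat{\beta}_2 - \beta_2) \overline{X} + \overline{u} \right) - (\hat{\beta}_2-\beta_2)X_i + u_i  \right]^2\\
E(\hat{\sigma}^2) &=\frac{1}{n-2}E\sum\left[(\hat{\beta}_2-\beta_2)\overline{X} -\overline{u} - (\hat{\beta}_2-\beta_2)X_i+u_i \right]^2\\
&=\frac{1}{n-2} E\sum\left[ -(\hat{\beta}_2 - \beta_2)(X_i-\overline{X}) + (u_i-\overline{u})\right]^2\\
&= \frac{1}{n-2}\left[-\sum x_i^2 \, Var(\hat{\beta}_2) + E\sum(u_i-\overline{u})^2 \right]\\
&=\frac{1}{n-2} \left[ -\frac{\sum x_i^2 \sigma_i^2}{\sum x_i^2} + \frac{(n-1)\sum \sigma_i^2}{n} \right]
\end{align}

where $x_i = X_i-\overline{X}$. The fourth line uses $\hat{\beta}_2-\beta_2=\sum x_i u_i/\sum x_i^2$, so the cross term in the expanded square equals $-2(\hat{\beta}_2-\beta_2)^2\sum x_i^2$. The last line uses $Var(\hat{\beta}_2)=\sum x_i^2\sigma_i^2/(\sum x_i^2)^2$ and assumes that the disturbances are uncorrelated.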
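The final expression can be compared with a Monte Carlo estimate of $E(\hat{\sigma}^2)$. The sketch below is only an illustration under stated assumptions: the regressor values, the variance pattern $\sigma_i^2 = X_i^2$, normally distributed independent disturbances, and the number of replications are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(42)

n = 10
X = np.arange(1.0, 11.0)          # regressor held fixed across replications
x = X - X.mean()                  # deviations from the mean
sigma2_i = X ** 2                 # assumed heteroscedastic variances

# Closed-form expectation from the last line of the derivation
expected = (
    -(x ** 2 * sigma2_i).sum() / (x ** 2).sum()
    + (n - 1) / n * sigma2_i.sum()
) / (n - 2)

# Monte Carlo: draw many samples, fit OLS, and average sigma-hat squared
reps = 200_000
u = rng.normal(0.0, np.sqrt(sigma2_i), size=(reps, n))
Y = 1.0 + 2.0 * X + u             # hypothetical true coefficients
b2 = (Y * x).sum(axis=1) / (x ** 2).sum()
b1 = Y.mean(axis=1) - b2 * X.mean()
resid = Y - b1[:, None] - b2[:, None] * X
sigma2_hat = (resid ** 2).sum(axis=1) / (n - 2)
simulated = sigma2_hat.mean()

print("formula for E(sigma2_hat):", expected)
print("simulated mean:           ", simulated)
print("average of sigma_i^2:     ", sigma2_i.mean())
```

The average of the $\sigma_i^2$ is printed only as a reference point; the formula shows that $E(\hat{\sigma}^2)$ depends on how the variances line up with the $x_i^2$, not just on their average.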

Under homoscedasticity, $\sigma_i^2=\sigma^2$ for each $i$, and the expression above reduces to $E(\hat{\sigma}^2)=\sigma^2$.
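As a worked check of the homoscedastic case, the substitution $\sigma_i^2=\sigma^2$ can be written out step by step. This sketch only uses the final expression of the derivation, with $x_i = X_i-\overline{X}$:

\begin{align*}
E(\hat{\sigma}^2) &= \frac{1}{n-2}\left[-\frac{\sigma^2\sum x_i^2}{\sum x_i^2} + \frac{(n-1)\,n\,\sigma^2}{n}\right]\\
&= \frac{1}{n-2}\left[-\sigma^2 + (n-1)\sigma^2\right]\\
&= \frac{(n-2)\sigma^2}{n-2} = \sigma^2
\end{align*}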

Hence, in the presence of heteroscedasticity, the expected value of $\hat{\sigma}^2=\frac{\sum\hat{u}_i^2}{n-2}$ is, in general, not equal to $\sigma^2$.
