Heteroscedasticity

Key Points of Heteroscedasticity (2021)

Mar 30, 2024 | Feb 10, 2021 by Muhammad Imdad Ullah

The following are some key points about heteroscedasticity. These key points are about the definition, example, properties, assumptions, and tests for the detection of heteroscedasticity (detection of hetero in short).

One important assumption of regression is that the variance of the error term is constant across observations. If the errors have a constant variance, they are called homoscedastic; otherwise, they are heteroscedastic. In the case of heteroscedastic errors (non-constant variance), the standard estimation methods become inefficient. Typically, the assumption of homoscedasticity is assessed by plotting the residuals.

The disturbance term of OLS regression $u_i$ should be homoscedastic. By Homo, we mean equal, and scedastic means spread or scatter.
By hetero, we mean unequal.
Heteroscedasticity means that the conditional variance of $Y_i$ (i.e., $var(u_i)$), conditional upon the given $X_i$, changes with the values taken by the variable $X$.
In case of heteroscedasticity $E(u_i^2)=\sigma_i^2=var(u_i)$, where $i=1,2,\cdots, n$.
In case of homoscedasticity $E(u_i^2)=\sigma^2=var(u_i)$, where $i=1,2,\cdots, n$.
Homoscedasticity means that the conditional variance of $Y_i$ (i.e., $var(u_i)$), conditional upon the given $X_i$, remains the same regardless of the values taken by the variable $X$.
The error terms are heteroscedastic, when the scatter of the errors is different, varying depending on the value of one or more of the explanatory variables.
Heteroscedasticity is a systematic change in the scatter of the residuals over the range of measured values.
The presence of heteroscedasticity may be due to (i) the presence of outliers in the data, (ii) incorrect functional form of the regression model, (iii) incorrect transformation of the data, and (iv) mixing observations with different measures of scale.
The presence of hetero does not destroy the unbiasedness and consistency of OLS estimators.
Hetero is more common in cross-section data than time-series data.
Hetero may affect the variance and standard errors of the OLS estimates.
The standard errors of OLS estimates are biased in the case of hetero.
Statistical inferences (confidence intervals and hypothesis testing) of estimated regression coefficients are no longer valid.
The OLS estimators are no longer BLUE as they are no longer efficient in the presence of hetero.
The regression predictions are inefficient in the case of hetero.
The usual OLS method assigns equal weights to each observation.
In GLS, each observation is weighted in inverse proportion to its $\sigma_i$, that is, the variables of the $i$th observation are divided by $\sigma_i$ before OLS is applied.
In GLS a weighted sum of squares is minimized with weight $w_i=\frac{1}{\sigma_i^2}$.
In GLS each squared residual is weighted by the inverse of $Var(u_i|X_i)$
GLS estimates are BLUE.
Heteroscedasticity can be detected by plotting the squared residuals $\hat{u}_i^2$ against the fitted values $\hat{Y}_i$.
If the plot of $\hat{u}_i^2$ against $\hat{Y}_i$ shows no systematic pattern, there is no evidence of hetero.
In the case of prior information about $\sigma_i^2$, one may use WLS.
If $\sigma_i^2$ is unknown, one may proceed with heteroscedasticity-corrected standard errors (also called robust standard errors).
Drawing inferences in the presence of hetero (or if hetero is suspected) may be very misleading.
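The GLS/WLS points above can be illustrated with a minimal sketch. It assumes a simple regression $Y_i=\beta_1+\beta_2X_i+u_i$ with $\sigma_i$ known; the function name `wls_line` and the data are hypothetical and are not taken from the original post. Setting all weights equal reproduces OLS, while $w_i=1/\sigma_i^2$ gives the weighted fit.

```python
def wls_line(x, y, w):
    """Fit y = b1 + b2*x by minimizing sum(w_i * (y_i - b1 - b2*x_i)**2)."""
    sw = sum(w)
    # Weighted means of x and y
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted cross-product and weighted sum of squares about the means
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b2 = sxy / sxx
    b1 = ybar - b2 * xbar
    return b1, b2


# Hypothetical data: the error spread grows with x, sigma_i = 0.5 * x_i (assumed known)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
z = [0.3, -1.1, 0.8, -0.4, 1.2, -0.9, 0.5, -0.6]  # fixed illustrative shocks
sigma = [0.5 * xi for xi in x]
y = [2.0 + 3.0 * xi + si * zi for xi, si, zi in zip(x, sigma, z)]

ols_fit = wls_line(x, y, [1.0] * len(x))                 # equal weights = OLS
wls_fit = wls_line(x, y, [1.0 / s ** 2 for s in sigma])  # w_i = 1 / sigma_i^2

print("OLS (b1, b2):", ols_fit)
print("WLS (b1, b2):", wls_fit)
```

Only the relative sizes of the weights matter: multiplying every $w_i$ by the same constant leaves the estimates unchanged.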

The Breusch-Pagan Test (Numerical Example)

Apr 7, 2024 | Feb 5, 2021 by Muhammad Imdad Ullah

To perform the Breusch-Pagan test for the detection of heteroscedasticity, use the data from the following file Table_11.3.

Step 1:

The estimated regression is $\hat{Y}_i = 9.2903 + 0.6378X_i$

Step 2:

The residuals obtained from this regression are:

$\hat{u}_i$	$\hat{u}_i^2$	$p_i$
-5.31307	28.22873	0.358665
-8.06876	65.10494	0.827201
6.49801	42.22407	0.536485
0.55339	0.30624	0.003891
-6.82445	46.57318	0.591743
1.36447	1.86177	0.023655
5.79770	33.61333	0.427079
-3.58015	12.81744	0.162854
0.98662	0.97342	0.012368
8.30908	69.04085	0.877209
-2.25769	5.09715	0.064763
-1.33584	1.78446	0.022673
8.04201	64.67391	0.821724
10.47524	109.73066	1.3942
6.23093	38.82451	0.493291
-9.09153	82.65588	1.050197
-12.79183	163.63099	2.079039
-16.84722	283.82879	3.606231
-17.35860	301.32104	3.828481
2.71955	7.39595	0.09397
2.39709	5.74604	0.073007
0.77494	0.60052	0.00763
9.45248	89.34930	1.135241
4.88571	23.87014	0.303286
4.53063	20.52658	0.260804
-0.03614	0.00131	1.66E-05
-0.30322	0.09194	0.001168
9.50786	90.39944	1.148584
-18.98076	360.26909	4.577455
20.26355	410.61159	5.217089

The estimated $\tilde{\sigma}^2$ is $\frac{\sum \hat{u}_i^2}{n} = \frac{2361.15325}{30} = 78.7051$.

Compute a new variable $p_i = \frac{\hat{u}_i^2}{\tilde{\sigma}^2}$.
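The arithmetic of Step 2 can be checked with a short sketch. The residuals are copied from the table above; the variable names (`u_hat`, `sigma2_tilde`, `p`) are illustrative choices, not part of the original post.

```python
# Residuals u_hat_i copied from the first column of the table above
u_hat = [
    -5.31307, -8.06876, 6.49801, 0.55339, -6.82445, 1.36447,
    5.79770, -3.58015, 0.98662, 8.30908, -2.25769, -1.33584,
    8.04201, 10.47524, 6.23093, -9.09153, -12.79183, -16.84722,
    -17.35860, 2.71955, 2.39709, 0.77494, 9.45248, 4.88571,
    4.53063, -0.03614, -0.30322, 9.50786, -18.98076, 20.26355,
]

n = len(u_hat)
u_sq = [u * u for u in u_hat]          # squared residuals
sigma2_tilde = sum(u_sq) / n           # divides by n, not by n - 2
p = [v / sigma2_tilde for v in u_sq]   # p_i = u_hat_i^2 / sigma2_tilde

print("sigma2_tilde =", round(sigma2_tilde, 4))
print("first three p_i:", [round(v, 6) for v in p[:3]])
```

Because the squared residuals are divided by their own average, the $p_i$ always average to 1.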

Step 3:

Assume that $p_i$ is linearly related to $X_i$ (that is, take $Z_{2i}=X_i$) and run the regression $p_i=\alpha_1+\alpha_2Z_{2i}+v_i$.

The regression results are: $\hat{p}_i=-0.74261 + 0.010063X_i$

Step 4:

Obtain the explained sum of squares (ESS) from the regression in Step 3: ESS = 10.42802.
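As a reminder of what this quantity is (the notation $\bar{p}$ is introduced here and is not in the original), the ESS is the sum of squared deviations of the fitted values from the mean of the dependent variable. Since $\tilde{\sigma}^2$ is the average of the $\hat{u}_i^2$, the mean of $p_i$ equals 1:

$$ESS = \sum_{i=1}^{n} (\hat{p}_i - \bar{p})^2, \qquad \bar{p} = \frac{1}{n}\sum_{i=1}^{n} \frac{\hat{u}_i^2}{\tilde{\sigma}^2} = 1$$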

Step 5:

Compute: $\Theta = \frac{1}{2} ESS = \frac{10.42802}{2}= 5.2140$.

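Under the null hypothesis of homoscedasticity, the Breusch-Pagan statistic $\Theta$ asymptotically follows the chi-square distribution with ($k-1$) degrees of freedom, where $k$ is the number of parameters in the auxiliary regression of Step 3 (here $k=2$, so one degree of freedom). The $\chi^2_{tab}$ value at a 5% level of significance with one degree of freedom is 3.8414. Since $\chi_{cal}^2 = 5.2140$ is greater than $\chi_{tab}^2$, the result is statistically significant: there is evidence of heteroscedasticity at the 5% level of significance.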
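The decision in Step 5 can be sketched as follows, using only the ESS and critical value reported above. The p-value is an addition not given in the original; it relies on the standard identity that, for one degree of freedom, $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$.

```python
import math

ess = 10.42802          # explained sum of squares reported in Step 4
critical_5pct = 3.8414  # chi-square critical value, 1 df, 5% level (from the text)

theta = ess / 2.0       # Breusch-Pagan statistic

# Upper-tail probability of a chi-square variable with 1 degree of freedom
p_value = math.erfc(math.sqrt(theta / 2.0))

reject = theta > critical_5pct
print(f"theta = {theta:.4f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

The identity holds only for one degree of freedom; with more regressors in the auxiliary regression, a chi-square table or a statistics library would be needed instead.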

Figure: Breusch-Pagan test of heteroscedasticity (residual plot)

White General Heteroscedasticity Test (Numerical) 2021

May 2, 2024 | Jan 20, 2021 by Muhammad Imdad Ullah

We will use the following data to test for the presence of heteroscedasticity using White's general heteroscedasticity test.

Income	Education	Job Experience
5	2	9
9.7	4	18
28.4	8	21
8.8	8	12
21	8	14
26.6	10	16
25.4	12	16
23.1	12	9
22.5	12	18
19.5	12	5
21.7	12	7
24.8	13	9
30.1	14	12
24.8	14	17
28.5	15	19
26	15	6
38.9	16	17
22.1	16	1
33.1	17	10
48.3	21	17

White General Heteroscedasticity Test

To perform the White General Heteroscedasticity test, the general procedure is:

Step 1: Run the regression and obtain the residuals $\hat{u}_i$ from it.

The regression model is: $income = \beta_1+\beta_2\, educ + \beta_3\, jobexp + u_i$

The regression results are: $\widehat{Income}_i=-7.09686 + 1.93339\, educ_{i} + 0.649365\, jobexp_{i}$

Step 2: Run the following auxiliary regression

$$\hat{u}_i^2=\alpha_1+\alpha_2X_{2i}+\alpha_3 X_{3i}+\alpha_4 X_{2i}^2+\alpha_5X_{3i}^2+\alpha_6X_{2i}X_{3i}+v_i$$

that is, regress the squared residuals on a constant, all the explanatory variables, the squared explanatory variables, and their respective cross-product.

Here, in the auxiliary regression, the dependent variable is the squared residual $\hat{u}_i^2$ from Step 1 (where $Y$ is income), $X_2$ is educ, and $X_3$ is jobexp.

The results from the auxiliary regression are:

$$\hat{u}_i^2=42.6145 -0.10872\,X_{2i} - 5.8402\, X_{3i} -0.15273\, X_{2i}^2 + 0.200715\, X_{3i}^2 + 0.226517\,X_{2i}X_{3i}$$

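Step 3: Formulate the null and alternative hypotheses

$H_0: \alpha_2=\alpha_3=\cdots=\alpha_6=0$ (all slope coefficients of the auxiliary regression are zero, i.e., homoscedasticity)

$H_1$: at least one of $\alpha_2, \ldots, \alpha_6$ is different from zero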

Step 4: Reject the null and conclude that there is significant evidence of heteroscedasticity when the statistic is bigger than the critical value.

The computed value of the test statistic is:

$$n \cdot R^2 = 20\times 0.4488 \approx 8.98$$

The statistic asymptotically follows $\chi^2_{df}$, where $df=k-1$ and $k$ is the number of parameters in the auxiliary regression (here $k=6$, so $df=5$). The critical value of $\chi^2_5$ at a 5% level of significance is 11.07.

Since the calculated value is smaller than the tabulated value, the null hypothesis is not rejected. Therefore, based on the White general heteroscedasticity test, there is no significant evidence of heteroscedasticity.
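The whole procedure can be sketched in plain Python on the data table above. This is only an illustration under stated assumptions: the helper names `solve` and `ols` are hypothetical, the normal equations are solved by Gaussian elimination rather than by a statistics library, and the critical value 11.07 is taken from the text.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols(rows, y):
    """OLS of y on rows (each row includes the constant 1.0).

    Returns (coefficients, residuals, R^2)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    ybar = sum(y) / len(y)
    tss = sum((yi - ybar) ** 2 for yi in y)
    rss = sum(e * e for e in resid)
    return beta, resid, 1.0 - rss / tss


# Data copied from the table above
income = [5, 9.7, 28.4, 8.8, 21, 26.6, 25.4, 23.1, 22.5, 19.5,
          21.7, 24.8, 30.1, 24.8, 28.5, 26, 38.9, 22.1, 33.1, 48.3]
educ = [2, 4, 8, 8, 8, 10, 12, 12, 12, 12, 12, 13, 14, 14, 15, 15, 16, 16, 17, 21]
jobexp = [9, 18, 21, 12, 14, 16, 16, 9, 18, 5, 7, 9, 12, 17, 19, 6, 17, 1, 10, 17]
n = len(income)

# Step 1: original regression and its residuals
beta, resid, _ = ols([[1.0, e, j] for e, j in zip(educ, jobexp)], income)

# Step 2: auxiliary regression of squared residuals on levels, squares, cross-product
aux_rows = [[1.0, e, j, e * e, j * j, e * j] for e, j in zip(educ, jobexp)]
alpha, _, r2_aux = ols(aux_rows, [u * u for u in resid])

# Steps 3-4: n * R^2 compared with the chi-square critical value
white_stat = n * r2_aux
df = len(aux_rows[0]) - 1   # auxiliary regressors excluding the constant
critical_5pct = 11.07       # chi-square(5) at the 5% level, from the text

print("Step 1 coefficients:", [round(b, 5) for b in beta])
print("n * R^2 =", round(white_stat, 3), "with df =", df)
print("reject H0:", white_stat > critical_5pct)
```

Solving the normal equations directly is adequate for a data set of this size; for larger or badly scaled problems, a QR-based least-squares routine from a numerical library would be the safer choice.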
