
Key Points of Heteroscedasticity

The following are some key points about heteroscedasticity, covering its definition, examples, properties, consequences for the OLS assumptions, and tests for its detection.

  • The disturbance term of OLS regression $u_i$ should be homoscedastic.
  • By homo, we mean equal; scedastic means spread or scatter.
  • By hetero, we mean unequal.
  • Heteroscedasticity means that the conditional variance of $Y_i$ (i.e., $var(u_i)$), given $X_i$, does not remain the same as the values taken by the variable $X$ change.
  • In the case of heteroscedasticity, $E(u_i^2)=\sigma_i^2=var(u_i)$, where $i=1,2,\cdots, n$.
  • In the case of homoscedasticity, $E(u_i^2)=\sigma^2=var(u_i)$, where $i=1,2,\cdots, n$.
    Homoscedasticity means that the conditional variance of $Y_i$ (i.e., $var(u_i)$), given $X_i$, remains the same regardless of the values taken by the variable $X$.
  • The error terms are heteroscedastic when the scatter of the errors differs, varying with the value of one or more of the explanatory variables.
  • Heteroscedasticity is a systematic change in the spread of the residuals over the range of measured values.
  • The presence of heteroscedasticity may be due to: (i) the presence of outliers in the data, (ii) an incorrect functional form of the regression model, (iii) an incorrect transformation of the data, and (iv) mixing observations with different measures of scale.
  • The presence of heteroscedasticity does not destroy the unbiasedness and consistency of OLS estimators.
  • Heteroscedasticity is more common in cross-section data than time-series data.
  • Heteroscedasticity may affect the variance and standard errors of the OLS estimates.
  • The standard errors of OLS estimates are biased in the case of heteroscedasticity.
  • Statistical inferences (confidence intervals and hypothesis testing) of estimated regression coefficients are no longer valid.
  • The OLS estimators are no longer BLUE as they are no longer efficient in the presence of heteroscedasticity.
  • The regression predictions are inefficient in the case of heteroscedasticity.
  • The usual OLS method assigns equal weights to each observation.
  • In GLS, the weight assigned to each observation is inversely proportional to its $\sigma_i$.
  • In GLS, a weighted sum of squares is minimized with weights $w_i=\frac{1}{\sigma_i^2}$.
  • In GLS, each squared residual is weighted by the inverse of $Var(u_i|X_i)$.
  • GLS estimates are BLUE.
  • Heteroscedasticity can be detected by plotting the squared estimated residuals $\hat{u}_i^2$ against $\hat{Y}_i$.
  • If no systematic pattern appears when plotting $\hat{u}_i^2$ against $\hat{Y}_i$, there is no evidence of heteroscedasticity.
  • In the case of prior information about $\sigma_i^2$, one may use WLS.
  • If $\sigma_i^2$ is unknown, one may proceed with heteroscedasticity-corrected standard errors (also called robust standard errors), as in the sketch after this list.
  • Drawing inferences in the presence of heteroscedasticity (or when heteroscedasticity is suspected) without correction may be very misleading.
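
As a quick illustration of the last few points, the following sketch (in Python with statsmodels; the arrays `x` and `y` are simulated, hypothetical data, not taken from any example below) contrasts the usual OLS standard errors with heteroscedasticity-robust (HC1) standard errors and a WLS fit in which the error variance is assumed proportional to $X_i^2$.

```python
# A minimal sketch (Python + statsmodels); x and y are simulated, hypothetical data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.uniform(1, 10, size=50)
y = 2 + 3 * x + rng.normal(scale=x)           # error spread grows with x: heteroscedastic

X = sm.add_constant(x)

ols = sm.OLS(y, X).fit()                      # usual OLS: unbiased, but usual SEs are biased
robust = sm.OLS(y, X).fit(cov_type="HC1")     # heteroscedasticity-robust (White/HC1) SEs
wls = sm.WLS(y, X, weights=1.0 / x**2).fit()  # WLS with w_i = 1/sigma_i^2, assuming sigma_i proportional to x_i

print("OLS SEs:   ", ols.bse)
print("Robust SEs:", robust.bse)
print("WLS SEs:   ", wls.bse)

# Graphical check: plot squared residuals against fitted values and look for a pattern.
# import matplotlib.pyplot as plt
# plt.scatter(ols.fittedvalues, ols.resid**2); plt.show()
```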

See more on different topics related to Heteroscedasticity.

The Breusch-Pagan Test (Numerical Example)

To perform the Breusch-Pagan test for the detection of heteroscedasticity, use the data from the following file Table_11.3.

Step 1:

The estimated regression is $\hat{Y}_i = 9.2903 + 0.6378X_i$

Step 2:

The residuals obtained from this regression are:

| $\hat{u}_i$ | $\hat{u}_i^2$ | $p_i$ |
|---|---|---|
| -5.31307 | 28.22873 | 0.358665 |
| -8.06876 | 65.10494 | 0.827201 |
| 6.49801 | 42.22407 | 0.536485 |
| 0.55339 | 0.30624 | 0.003891 |
| -6.82445 | 46.57318 | 0.591743 |
| 1.36447 | 1.86177 | 0.023655 |
| 5.79770 | 33.61333 | 0.427079 |
| -3.58015 | 12.81744 | 0.162854 |
| 0.98662 | 0.97342 | 0.012368 |
| 8.30908 | 69.04085 | 0.877209 |
| -2.25769 | 5.09715 | 0.064763 |
| -1.33584 | 1.78446 | 0.022673 |
| 8.04201 | 64.67391 | 0.821724 |
| 10.47524 | 109.73066 | 1.3942 |
| 6.23093 | 38.82451 | 0.493291 |
| -9.09153 | 82.65588 | 1.050197 |
| -12.79183 | 163.63099 | 2.079039 |
| -16.84722 | 283.82879 | 3.606231 |
| -17.35860 | 301.32104 | 3.828481 |
| 2.71955 | 7.39595 | 0.09397 |
| 2.39709 | 5.74604 | 0.073007 |
| 0.77494 | 0.60052 | 0.00763 |
| 9.45248 | 89.34930 | 1.135241 |
| 4.88571 | 23.87014 | 0.303286 |
| 4.53063 | 20.52658 | 0.260804 |
| -0.03614 | 0.00131 | 1.66E-05 |
| -0.30322 | 0.09194 | 0.001168 |
| 9.50786 | 90.39944 | 1.148584 |
| -18.98076 | 360.26909 | 4.577455 |
| 20.26355 | 410.61159 | 5.217089 |

The estimated $\tilde{\sigma}^2$ is $\frac{\sum \hat{u}_i^2}{n} = \frac{2361.15325}{30} = 78.7051$.

Compute a new variable $p_i = \frac{\hat{u}_i^2}{\tilde{\sigma}^2}$ (the last column of the table above).

Step 3:

Assume $p_i$ is linearly related to $X_i\,(=Z_i)$ and run the regression $p_i=\alpha_1+\alpha_2 Z_{2i}+v_i$.

The regression results are: $\hat{p}_i=-0.74261 + 0.010063X_i$

Step 4:

Obtain the Explained Sum of Squares (ESS) = 10.42802.

Step 5:

Compute: $\Theta = \frac{1}{2} ESS = \frac{10.42802}{2}= 5.2140$.

The Breusch-Pagan test statistic follows the Chi-Square distribution. The $\chi^2_{tab}$ value at the 5% level of significance with $k-1=1$ degree of freedom is 3.8414. Since $\chi_{cal}^2 = 5.2140$ is greater than $\chi_{tab}^2 = 3.8414$, the result is statistically significant: there is evidence of heteroscedasticity at the 5% level of significance.
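
The same steps can be scripted. The sketch below (Python with statsmodels) follows the five steps above; since the Table_11.3 file is not reproduced here, the arrays `x` and `y` are simulated placeholders, so only the structure of the calculation, not the printed numbers, corresponds to the worked example.

```python
# Breusch-Pagan test, step by step (Python + statsmodels).
# x and y are simulated placeholders; replace them with the two columns of Table_11.3.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(0)
x = rng.uniform(80, 300, size=30)
y = 9.29 + 0.64 * x + rng.normal(scale=0.05 * x)

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()                 # Step 1: estimate Y_i = b1 + b2*X_i
u = fit.resid                            # Step 2: residuals u_i
sigma2_tilde = (u**2).sum() / len(u)     #         sigma~^2 = sum(u_i^2) / n
p = u**2 / sigma2_tilde                  #         p_i = u_i^2 / sigma~^2

aux = sm.OLS(p, X).fit()                 # Step 3: regress p_i on Z_i (= X_i)
ESS = aux.ess                            # Step 4: explained sum of squares
theta = ESS / 2                          # Step 5: Theta ~ chi-square(k - 1) under H0
print("Theta =", theta)

# Packaged version: returns the studentized (Koenker) LM statistic, n*R^2 of the
# regression of u^2 on X, so it can differ somewhat from Theta above.
lm, lm_pvalue, fvalue, f_pvalue = het_breuschpagan(u, X)
print("LM =", lm, ", p-value =", lm_pvalue)
```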

See more about the Breusch-Pagan Test.


White General Heteroscedasticity Test (Numerical Example)

To test for the presence of heteroscedasticity using the White General Heteroscedasticity test, we will consider the following data.

| income | educ | jobexp |
|---|---|---|
| 5 | 2 | 9 |
| 9.7 | 4 | 18 |
| 28.4 | 8 | 21 |
| 8.8 | 8 | 12 |
| 21 | 8 | 14 |
| 26.6 | 10 | 16 |
| 25.4 | 12 | 16 |
| 23.1 | 12 | 9 |
| 22.5 | 12 | 18 |
| 19.5 | 12 | 5 |
| 21.7 | 12 | 7 |
| 24.8 | 13 | 9 |
| 30.1 | 14 | 12 |
| 24.8 | 14 | 17 |
| 28.5 | 15 | 19 |
| 26 | 15 | 6 |
| 38.9 | 16 | 17 |
| 22.1 | 16 | 1 |
| 33.1 | 17 | 10 |
| 48.3 | 21 | 17 |

To perform the White test, the general procedure is as follows.

Step 1: Run the regression and obtain the residuals $\hat{u}_i$ of this regression equation.

The regression model is: $income = \beta_1+\beta_2\, educ + \beta_3\, jobexp + u_i$

The regression results are: $\widehat{Income}_i=-7.09686 + 1.93339\, educ_{i} + 0.649365\, jobexp_{i}$

Step 2: Run the following auxiliary regression

$$\hat{u}_i^2=\alpha_1+\alpha_2X_{2i}+\alpha_3 X_{3i}+\alpha_4 X_{2i}^2+\alpha_5X_{3i}^2+\alpha_6X_{2i}X_{3i}+v_i$$

that is, regress the squared residuals on a constant, all the explanatory variables, the squared explanatory variables, and their respective cross-product.

Here, in the auxiliary regression equation, $Y$ is income, $X_2$ is educ, and $X_3$ is jobexp.

The results from auxiliary regression are:

$$\hat{u}_i^2=42.6145 - 0.10872\,X_{2i} - 5.8402\, X_{3i} - 0.15273\, X_{2i}^2 + 0.200715\, X_{3i}^2 + 0.226517\,X_{2i}X_{3i}$$

Step 3: Formulate the null and alternative hypotheses

$H_0: \alpha_2=\alpha_3=\cdots=\alpha_6=0$ (there is no heteroscedasticity)

$H_1$: at least one of the $\alpha$s is different from zero

Step 4: Reject the null and conclude that there is significant evidence of heteroscedasticity when the statistic is bigger than the critical value.

The computed value of the test statistic is:

$$n \cdot R^2 = 20\times 0.4488 = 8.977$$

The statistic asymptotically follows $\chi^2_{df}$, where $df=k-1=5$ and $k$ is the number of parameters in the auxiliary regression. The critical value $\chi^2_5$ at the 5% level of significance is 11.07.

Since the calculated value (8.977) is smaller than the tabulated value (11.07), the null hypothesis is not rejected. Therefore, on the basis of the White general heteroscedasticity test, there is no evidence of heteroscedasticity.

Download the data file: White’s test Related Data
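
For reference, here is a minimal sketch of the same calculation in Python with statsmodels, using the 20 observations from the table above. The `het_white` function builds the same auxiliary regression internally and returns the $n \cdot R^2$ statistic with its chi-square p-value; assuming the data have been read off the table correctly, the numbers should agree with the hand calculation up to rounding.

```python
# White's general heteroscedasticity test (Python + statsmodels) on the income data above.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

income = np.array([5, 9.7, 28.4, 8.8, 21, 26.6, 25.4, 23.1, 22.5, 19.5,
                   21.7, 24.8, 30.1, 24.8, 28.5, 26, 38.9, 22.1, 33.1, 48.3])
educ = np.array([2, 4, 8, 8, 8, 10, 12, 12, 12, 12,
                 12, 13, 14, 14, 15, 15, 16, 16, 17, 21], dtype=float)
jobexp = np.array([9, 18, 21, 12, 14, 16, 16, 9, 18, 5,
                   7, 9, 12, 17, 19, 6, 17, 1, 10, 17], dtype=float)

# Step 1: original regression and its residuals
X = sm.add_constant(np.column_stack([educ, jobexp]))
u = sm.OLS(income, X).fit().resid

# Step 2: auxiliary regression of u^2 on the regressors, their squares, and the cross-product
Z = sm.add_constant(np.column_stack([educ, jobexp, educ**2, jobexp**2, educ * jobexp]))
aux = sm.OLS(u**2, Z).fit()
print("n * R^2 =", len(u) * aux.rsquared)     # compare with the hand-computed 8.977

# Packaged version: LM statistic and its chi-square p-value
lm, lm_pvalue, fvalue, f_pvalue = het_white(u, X)
print("LM =", lm, ", p-value =", lm_pvalue)
```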

Park Glejser Test: Numerical Example

To detect the presence of heteroscedasticity using the Park Glejser test, consider the following data.

| Year | 1992 | 1993 | 1994 | 1995 | 1996 | 1997 | 1998 |
|---|---|---|---|---|---|---|---|
| $Y_t$ | 37 | 48 | 45 | 36 | 25 | 55 | 63 |
| $X_t$ | 4.5 | 6.5 | 3.5 | 3 | 2.5 | 8.5 | 7.5 |

The step-by-step procedure for conducting the Park Glejser test is:

Step 1: Estimate the regression equation

$$\hat{Y}_i = 19.8822 + 4.7173X_i$$

Obtain the residuals from this estimated regression equation:

Residuals: $-4.1103$, $-2.5450$, $8.6071$, $1.9657$, $-6.6756$, $-4.9797$, $7.7377$

Take the absolute values of these residuals and use them as the dependent variable in the different functional forms suggested by Glejser.

Step 2: Regress the absolute values of $\hat{u}_i$ on the $X$ variable that is thought to be closely associated with $\sigma_i^2$. We will use the following functional forms.

| | Functional Form | Results |
|---|---|---|
| 1) | $\vert\hat{u}_i\vert = \beta_1 + \beta_2 X_i + v_i$ | $\vert\hat{u}_i\vert = 5.2666 - 0.00681 X_i,\quad R^2=0.00004,\quad t_{cal} = -0.014$ |
| 2) | $\vert\hat{u}_i\vert = \beta_1 + \beta_2 \sqrt{X_i} + v_i$ | $\vert\hat{u}_i\vert = 5.445 - 0.0962 \sqrt{X_i},\quad R^2=0.000389,\quad t_{cal} = -0.04414$ |
| 3) | $\vert\hat{u}_i\vert = \beta_1 + \beta_2 \frac{1}{X_i} + v_i$ | $\vert\hat{u}_i\vert = 4.9124 + 1.3571 \frac{1}{X_i},\quad R^2=0.00332,\quad t_{cal} = -0.12914$ |
| 4) | $\vert\hat{u}_i\vert = \beta_1 + \beta_2 \frac{1}{\sqrt{X_i}} + v_i$ | $\vert\hat{u}_i\vert = 4.7375 + 1.0428 \frac{1}{\sqrt{X_i}},\quad R^2=0.00209,\quad t_{cal} = 0.10252$ |

Since none of these residual regressions is statistically significant, there is no evidence of heteroscedasticity; that is, there appears to be no relationship between the absolute value of the residuals $\vert\hat{u}_i\vert$ and the explanatory variable $X$.
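
A minimal sketch of these Glejser regressions in Python with statsmodels, using the seven observations above, is given below; each functional form regresses $\vert\hat{u}_i\vert$ on a different transform of $X$ and reports the slope estimate, its $t$ statistic, and $R^2$ (small differences from the table are possible due to rounding).

```python
# Glejser test: regress |u_i| on several transforms of X (Python + statsmodels).
import numpy as np
import statsmodels.api as sm

y = np.array([37, 48, 45, 36, 25, 55, 63], dtype=float)
x = np.array([4.5, 6.5, 3.5, 3, 2.5, 8.5, 7.5])

# Step 1: original regression and the absolute residuals
fit = sm.OLS(y, sm.add_constant(x)).fit()
abs_u = np.abs(fit.resid)

# Step 2: the four functional forms suggested by Glejser
forms = {
    "X":         x,
    "sqrt(X)":   np.sqrt(x),
    "1/X":       1 / x,
    "1/sqrt(X)": 1 / np.sqrt(x),
}
for name, z in forms.items():
    res = sm.OLS(abs_u, sm.add_constant(z)).fit()
    print(f"{name:10s} slope={res.params[1]:+.4f}  t={res.tvalues[1]:+.4f}  R^2={res.rsquared:.5f}")
```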

