An important assumption of OLS is that the disturbances $\mu_{i}$ appearing in the population regression function are homoscedastic, that is, all error terms have the same variance.

That is, the variance of each disturbance term $\mu_{i}$, conditional on the chosen values of the explanatory variables, is a constant equal to $\sigma^2$: $E(\mu_{i}^{2})=\sigma^2$, where $i=1,2,\cdots, n$.

Homo means equal and scedasticity means spread.

Consider the general linear regression model

\[y_i=\beta_1+\beta_2 x_{2i}+ \beta_3 x_{3i} +\cdots + \beta_k x_{ki} + \varepsilon_i\]

If $E(\varepsilon_{i}^{2})=\sigma^2$ for all $i=1,2,\cdots, n$, then the assumption of constant variance of the error term, or homoscedasticity, is satisfied.

If instead $E(\varepsilon_{i}^{2})=\sigma_{i}^{2}$, so that the variance differs across observations, then the assumption of homoscedasticity is violated and heteroscedasticity is said to be present. Under heteroscedasticity the OLS estimators remain unbiased but are inefficient.
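
As a rough illustration of the two cases, the following Python sketch simulates a simple regression twice: once with errors of constant standard deviation, and once with errors whose standard deviation is proportional to $x$. The sample size, the coefficient values, and the form of the heteroscedasticity are assumptions chosen only for this example. It compares the spread of the OLS residuals in the lower and upper halves of $x$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(1.0, 10.0, size=n)

# Homoscedastic errors: the same standard deviation for every observation.
eps_homo = rng.normal(0.0, 2.0, size=n)
# Heteroscedastic errors: standard deviation proportional to x (assumed form).
eps_hetero = rng.normal(0.0, 1.0, size=n) * 0.5 * x


def residual_spread_by_half(x, eps, beta1=1.0, beta2=0.5):
    """Fit y = beta1 + beta2 * x + eps by OLS and return the residual
    standard deviation for the lower and upper halves of x."""
    y = beta1 + beta2 * x + eps
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    low = x <= np.median(x)
    return resid[low].std(ddof=1), resid[~low].std(ddof=1)


homo_low, homo_high = residual_spread_by_half(x, eps_homo)
het_low, het_high = residual_spread_by_half(x, eps_hetero)

print(f"homoscedastic:   sd(low x) = {homo_low:.2f}, sd(high x) = {homo_high:.2f}")
print(f"heteroscedastic: sd(low x) = {het_low:.2f}, sd(high x) = {het_high:.2f}")
```

In the homoscedastic case the two residual spreads should be close to each other, while in the heteroscedastic case the spread for large $x$ should be clearly larger.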

**Examples:**

- The range in income between the poorest and richest families in a town is the classical example of heteroscedasticity.
- The range in annual sales between a corner drug store and a general store is another.

*Reasons for Heteroscedasticity*

There are several reasons why the variance of the error term $\mu_{i}$ may vary across observations, some of which are:

- Following error-learning models, as people learn, their errors of behavior become smaller over time. In this case $\sigma_{i}^{2}$ is expected to decrease. For example, the number of typing errors made in a given time period on a test, and its variance, can be expected to fall as the hours put into typing practice increase.
- As income grows, people have more discretionary income and hence more scope for choice in how they spend it, so $\sigma_{i}^{2}$ is likely to increase with income.
- As data collection techniques improve, $\sigma_{i}^{2}$ is likely to decrease.
- Heteroscedasticity can also arise as a result of the presence of outliers. The inclusion or exclusion of such observations, especially when the sample size is small, can substantially alter the results of regression analysis.
- Heteroscedasticity can arise when the CLRM (classical linear regression model) assumption that the regression model is correctly specified is violated, for example when an important variable is omitted.
- Skewness in the distribution of one or more regressors included in the model is another source of heteroscedasticity.
- An incorrect data transformation or an incorrect functional form (for example, a linear model where a log-linear model is appropriate) is also a source of heteroscedasticity.
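
The last reason in the list above can be sketched with a small simulation. The data below are generated from a log-linear model with a homoscedastic error on the log scale. All parameter values are assumptions made for illustration. Fitting the model in levels then produces residuals whose spread grows with $x$, while fitting it in logs does not.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.uniform(1.0, 10.0, size=n)
u = rng.normal(0.0, 0.3, size=n)          # constant-variance error on the log scale

# Assumed data-generating process: log(y) = 0.5 + 1.0 * log(x) + u
y = np.exp(0.5 + 1.0 * np.log(x) + u)


def spread_ratio(regressor, response, sort_key):
    """Regress response on regressor by OLS and return the ratio of the
    residual standard deviation in the upper half of sort_key to that
    in the lower half."""
    X = np.column_stack([np.ones_like(regressor), regressor])
    coef, *_ = np.linalg.lstsq(X, response, rcond=None)
    resid = response - X @ coef
    upper = sort_key > np.median(sort_key)
    return resid[upper].std(ddof=1) / resid[~upper].std(ddof=1)


ratio_levels = spread_ratio(x, y, x)                  # y on x, in levels
ratio_logs = spread_ratio(np.log(x), np.log(y), x)    # log(y) on log(x)

print(f"levels model: sd(high x) / sd(low x) = {ratio_levels:.2f}")
print(f"log model:    sd(high x) / sd(low x) = {ratio_logs:.2f}")
```

A ratio well above one for the levels model, against a ratio near one for the log model, indicates that the heteroscedasticity here comes from the choice of functional form rather than from the data themselves.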

*Consequences of Heteroscedasticity*

- The OLS estimators and the regression predictions based on them remain unbiased and consistent.
- The OLS estimators are no longer BLUE (Best Linear Unbiased Estimators) because they are no longer efficient, so the regression predictions are inefficient too.
- Because the usual estimator of the covariance matrix of the estimated regression coefficients is biased and inconsistent, the usual tests of hypotheses (t-test, F-test) are no longer valid.
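
These three consequences can be checked with a small Monte Carlo experiment. The sketch below is only an illustration under assumed values: the true coefficients, the error standard deviation $0.1x^2$, the sample size, and the number of replications are all chosen for the example. Weighted least squares with the true weights $1/\sigma_{i}^{2}$, which would not be known in practice, is used purely as a benchmark for efficiency.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 200, 2000
beta = np.array([1.0, 0.5])               # assumed true intercept and slope
x = np.linspace(1.0, 10.0, n)             # fixed regressor values
X = np.column_stack([np.ones(n), x])
sigma_i = 0.1 * x**2                      # assumed error sd, rising with x

# Each row of Y is one simulated sample of size n.
Y = X @ beta + rng.normal(size=(reps, n)) * sigma_i

# OLS estimates for every replication: b' = y' X (X'X)^{-1}
XtX_inv = np.linalg.inv(X.T @ X)
B_ols = Y @ X @ XtX_inv

# Conventional OLS standard error of the slope: sqrt(s^2 * [(X'X)^{-1}]_22)
resid = Y - B_ols @ X.T
s2 = (resid**2).sum(axis=1) / (n - 2)
naive_se = np.sqrt(s2 * XtX_inv[1, 1])

# Benchmark: weighted least squares using the true (normally unknown) sigma_i,
# computed as OLS on observations divided by sigma_i.
Xw = X / sigma_i[:, None]
Yw = Y / sigma_i
B_wls = Yw @ Xw @ np.linalg.inv(Xw.T @ Xw)

print(f"mean OLS slope:               {B_ols[:, 1].mean():.3f} (true value 0.5)")
print(f"sampling sd of OLS slope:     {B_ols[:, 1].std(ddof=1):.3f}")
print(f"mean conventional OLS s.e.:   {naive_se.mean():.3f}")
print(f"sampling sd of WLS slope:     {B_wls[:, 1].std(ddof=1):.3f}")
```

Under these assumptions the average OLS slope should sit close to the true value (unbiasedness), the conventional standard error should understate the actual sampling variability of the slope (invalid tests), and the weighted estimator should vary noticeably less than OLS (loss of efficiency).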

**Note:** Heteroscedasticity is likely to be more common in cross-sectional than in time series data.
