White Test of Heteroscedasticity Detection (2022)

The post is about the White test of heteroscedasticity.

One important assumption of Regression is that the variance of the Error Term is constant across observations. If the error has a constant variance, then the errors are called homoscedastic, otherwise heteroscedastic. In the case of heteroscedastic errors (non-constant variance), the ordinary least squares (OLS) estimates remain unbiased but are no longer efficient, and the usual standard errors become unreliable. Typically, to assess the assumption of homoscedasticity, the residuals are plotted.
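
Stated in symbols, the two cases can be sketched as follows. The notation $\sigma^2$ and $\sigma_i^2$ is introduced here only for illustration and does not come from the original post:
\[Var(e_i)=\sigma^2 \quad \text{for all } i \quad \text{(homoscedastic errors)}\]
\[Var(e_i)=\sigma_i^2 \quad \text{varying with } i \quad \text{(heteroscedastic errors)}\]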

White test of Heteroscedasticity

Halbert White (1980) proposed a test that is very similar to the Breusch-Pagan test. The White test of heteroscedasticity is general because it does not rely on the normality assumption, and it is also easy to implement. Because of the generality of White’s test, it may also pick up specification bias. Both the White test of heteroscedasticity and the Breusch-Pagan test are based on the residuals of the fitted model.

To test the assumption of homoscedasticity, one can use auxiliary regression analysis by regressing the squared residuals from the original model on the set of original regressors, the cross-products of the regressors, and the squared regressors.

The step-by-step procedure for performing the White test of Heteroscedasticity is as follows:

Consider the following Linear Regression Model (assume there are two independent variables)
\[Y_i=\beta_0+\beta_1X_{1i}+\beta_2X_{2i}+e_i \tag{1} \]

For the given data, estimate the regression model, and obtain the residuals $e_i$.

Note that the regression of residuals can take linear or non-linear functional forms.

  1. Now run the following auxiliary regression of the squared residuals from the original regression on the original set of independent variables, the squared values of the independent variables, and the cross-product(s) of the independent variables:
    \[e_i^2=\alpha_0+\alpha_1X_{1i}+\alpha_2X_{2i}+\alpha_3X_{1i}^2+\alpha_4X_{2i}^2+\alpha_5X_{1i}X_{2i}+v_i \tag{2}\]
    where $v_i$ is the error term of the auxiliary regression. You can also use higher powers of the regressors, such as the cube. Also, note that there will be a constant term in equation (2) even though the original regression model (1) may or may not have the constant term.
  2. Find the $R^2$ statistic from the auxiliary regression in step 1.
  3. Test the statistical significance of \[n \times R^2\sim\chi^2_{df}, \tag{3}\] under the null hypothesis of homoscedasticity (no heteroscedasticity), where $n$ is the number of observations and df is the number of regressors in equation (2), excluding the constant term (here df = 5).
  4. If the calculated chi-square value obtained in (3) is greater than the critical chi-square value at the chosen level of significance, reject the hypothesis of homoscedasticity in favor of heteroscedasticity.
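
The procedure above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration under stated assumptions, not a reference implementation: the simulated data, the helper names `ols` and `white_test`, and the choice of error standard deviation proportional to $X_1$ are all hypothetical. The critical value 11.07 is the 5% chi-square value for 5 degrees of freedom.

```python
import random

def ols(X, y):
    """Solve the normal equations (X'X)b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def white_test(x1, x2, y):
    """Return (n * R^2, R^2) from the auxiliary regression of squared residuals."""
    n = len(y)
    # Fit the original model (1) and compute the squared residuals
    X = [[1.0, a, b] for a, b in zip(x1, x2)]
    beta = ols(X, y)
    e2 = [(yi - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
          for row, yi in zip(X, y)]
    # Auxiliary regression (2): constant, levels, squares, cross-product
    Z = [[1.0, a, b, a * a, b * b, a * b] for a, b in zip(x1, x2)]
    alpha = ols(Z, e2)
    fit = [sum(aj * zj for aj, zj in zip(alpha, row)) for row in Z]
    mean_e2 = sum(e2) / n
    ss_res = sum((u - f) ** 2 for u, f in zip(e2, fit))
    ss_tot = sum((u - mean_e2) ** 2 for u in e2)
    r2 = 1 - ss_res / ss_tot
    return n * r2, r2

# Hypothetical simulated data: the error standard deviation grows with x1
random.seed(42)
n = 500
x1 = [random.uniform(1, 10) for _ in range(n)]
x2 = [random.uniform(1, 10) for _ in range(n)]
y = [2 + 3 * a - b + a * random.gauss(0, 1) for a, b in zip(x1, x2)]

stat, r2 = white_test(x1, x2, y)
CRITICAL = 11.07  # chi-square critical value for df = 5 at the 5% level
print(f"n*R^2 = {stat:.2f}, auxiliary R^2 = {r2:.4f}")
print("Reject homoscedasticity" if stat > CRITICAL else "Do not reject homoscedasticity")
```

For real work, a tested library routine is preferable to hand-rolled linear algebra; for example, statsmodels provides `het_white` in `statsmodels.stats.diagnostic`.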

For a model with several independent variables (regressors), introducing all the regressors, their squares or higher powers, and their cross-products into the auxiliary regression consumes degrees of freedom quickly.
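
To make the growth concrete, the short sketch below counts the terms in an auxiliary regression that uses levels, squares, and pairwise cross-products (no higher powers). The function name `white_aux_terms` and the variable labels are hypothetical. With $k$ regressors the count is $k + k + k(k-1)/2$.

```python
from itertools import combinations

def white_aux_terms(names):
    """List the non-constant terms: levels, squares, pairwise cross-products."""
    levels = list(names)
    squares = [f"{v}^2" for v in names]
    cross = [f"{a}*{b}" for a, b in combinations(names, 2)]
    return levels + squares + cross

print(white_aux_terms(["X1", "X2"]))
# ['X1', 'X2', 'X1^2', 'X2^2', 'X1*X2']

counts = {}
for k in (2, 5, 10):
    names = [f"X{i}" for i in range(1, k + 1)]
    counts[k] = len(white_aux_terms(names))
    print(k, "regressors ->", counts[k], "auxiliary terms")
# 2 regressors -> 5, 5 regressors -> 20, 10 regressors -> 65
```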

In cases where the White test statistic is statistically significant, heteroscedasticity may not necessarily be the cause; specification errors may be responsible instead. In other words, “the White test can be a test of heteroscedasticity or specification error or both”. If no cross-product terms are introduced in the White test procedure, then it is a test of pure heteroscedasticity. If the cross-product terms are introduced in the model, then it is a test of both heteroscedasticity and specification bias.


By employing the White test of heteroscedasticity, one can gain valuable insight into the presence of heteroscedasticity and, if necessary, decide on appropriate corrective measures (such as Weighted Least Squares (WLS)) to ensure reliable standard errors and hypothesis tests in the regression analysis.

Summary

The White test of heteroscedasticity is a flexible approach that can be used to detect various patterns of heteroscedasticity. This test indicates the presence of heteroscedasticity but it does not pinpoint the specific cause (like model misspecification). The White test is relatively easy to implement in statistical software.

References

  • H. White (1980), “A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity”, Econometrica, Vol. 48, No. 4, pp. 817-838.
  • https://en.wikipedia.org/wiki/White_test


Important Online Estimation Quiz 7

The Online Estimation Quiz from Statistical Inference covers the topics of Estimation and Hypothesis Testing for the preparation of exams and statistical job tests in government, semi-government, and private sector organizations. These tests are also helpful for admission to colleges and universities. The online MCQs estimation quiz will help learners understand the related concepts and enhance their knowledge.

MCQs about statistical inference covering the topics estimation, estimator, point estimate, interval estimate, properties of a good estimator, unbiasedness, efficiency, sufficiency, Large sample, and sample estimation.

Statistical inference is a branch of statistics in which we draw conclusions (make some wise decisions) about the population parameter using sample information. Statistical inference can be further divided into Estimation of the Population Parameters and Hypothesis Testing.

Estimation is a way of finding the unknown value of the population parameter from the sample information by using an estimator (a statistical formula) to estimate the parameter. One can estimate the population parameter by using two approaches: (i) Point Estimation and (ii) Interval Estimation.

Online Estimation Quiz

  • A large sample contains more than
  • A sample is considered a small sample if the size is
  • In applying t-test
  • t-distribution is used when
  • If the population Standard Deviation is unknown and the sample size is less than 30, then the Confidence Interval for the population mean ($\mu$) is
  • If $\mu=130, \overline{X}=150, \sigma=5$, and $n=10$. What Statistic is appropriate?
  • If $1-\alpha=0.90$ then value of $Z_{\frac{\alpha}{2}}$ is
  • For $\alpha=0.05$, the critical value of $Z_{0.05}$ is equal to
  • In a $Z$-test the number of degrees of freedom is
  • The width of the confidence interval decreases if the confidence coefficient is
  • By decreasing $\overline{X}$ the length of the confidence interval for $\mu$
  • A statistician calculates a 95% confidence interval for $\mu$ when $\sigma$ is known. The confidence interval is Rs 18000 to 22000, and then the value of the sample mean $\overline{X}$ is:
  • Criteria to check a point estimator to be good are
  • The consistency of an estimator can be checked by comparing
  • If $Var(T_2)<Var(T_1)$ then $T_2$ is
  • For a biased estimator $\hat{\theta}$ of $\theta$, which one of the following is correct?
  • Which is NOT the property of a point estimator?
  • The best estimator of population proportion ($\pi$) is:
  • Interval estimation and confidence interval are:
  • A confidence interval will be widened if:

In point estimation, a single numerical value is computed for each parameter, while in interval estimation, a set of values (an interval) for the parameter is constructed. The width of the confidence interval depends on the sample size and the confidence coefficient: it decreases as the sample size increases and increases as the confidence coefficient rises. The estimator is a formula used to estimate the population parameter by making use of sample information.
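
As a rough illustration of how the width reacts to the sample size and the confidence coefficient, the sketch below computes the width of a $Z$-based interval for $\mu$ with known $\sigma$, that is $2 \times Z_{\alpha/2} \times \sigma/\sqrt{n}$. The function name `ci_width` and the numbers used ($\sigma=10$, $n=25$ and $100$) are hypothetical choices for the example.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, confidence):
    """Width of the z-interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    return 2 * z * sigma / sqrt(n)

# Same confidence coefficient, larger sample: the interval gets narrower
w_small_n = ci_width(sigma=10, n=25, confidence=0.95)
w_large_n = ci_width(sigma=10, n=100, confidence=0.95)

# Same sample size, higher confidence coefficient: the interval gets wider
w_90 = ci_width(sigma=10, n=25, confidence=0.90)
w_99 = ci_width(sigma=10, n=25, confidence=0.99)

print(f"n=25:  {w_small_n:.3f}   n=100: {w_large_n:.3f}")
print(f"90%:   {w_90:.3f}   99%:   {w_99:.3f}")
```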


Job Interview: Best Recently Asked Questions

The following are job interview questions asked in interviews for posts such as Statistical Officer, Data Analyst, Lecturer in Statistics, Enumerator, etc. These recently asked questions are also useful for job interviews in other disciplines.

The recently asked questions are:

  1. Job Description of an SO (Statistical Officer)?
  2. What types of data are collected by PBS (Pakistan Bureau of Statistics)?
  3. What is the role of the Chief Statistician?
  4. How is the reliability of data checked by the SO in the field?
  5. Problems faced during the survey?
  6. What is Electoral Sampling?
  7. What is Price Data Collection?
  8. What are the methods for Price Data collection?
  9. What is GDP?
  10. What is GNP?
  11. How many censuses have been conducted in Pakistan?
  12. Difference between Classification and Tabulation.
  13. What is the difference between an Error and a Mistake?
  14. What is Reliability?
  15. What are the Steps of Data Analysis?
  16. Types of Sampling?
  17. Difference between a Diagram and a Graph?
  18. Difference between a Graph and a Chart?
  19. Popular Surveys of Pakistan?
  20. Problems of Surveys?
  21. Difference between a Census and a Sample Survey?
  22. What is the Population Census Organization (PCO)?
  23. Which surveys are being done by the PBS?
  24. Difference between Enumeration and Census?
  25. Why do we use FPC in Sampling without replacement?
  26. Elaborate on the difference between SPSS, Minitab, and E-views.
  27. What is the Path Analysis?
  28. What is the Variable Constraint?
  29. Difference between the Main Effect and the Interaction Effect?
  30. What is the Cooked Data?
  31. What is the difference between Correlation and Covariance?
  32. Which is a type of sampling used for Career counseling?
  33. The unemployment rate in Pakistan?
  34. What is cloud sourcing…related to data collection?
  35. What was the previous name of PBS?
  36. When did the PBS come into being? Under which law?
  37. Who was the last Mughal Emperor?
  38. What is the Shimla Agreement?
  39. SIM stands for what?
  40. ATM stands for what?
  41. Poverty and Employment rates.
  42. What is Inflation?
  43. What is a Stock Market?
  44. Constitutions of Pakistan with dates of approval?
  45. What is a Corporate Sector?
  46. What are KIBOR and LIBOR?
  47. Last agriculture census?
  48. Who is the Secretary of Planning & Development?
  49. History of the Mughal Empire?
  50. Difference between Fiscal and monetary policy?
  51. No. of countries enlisted in the United Nations?
  52. What does the Constitution of Pakistan say about the census?
  53. What was Tehreek-e-Pakistan? Who were its members?
  54. Why did Quaid-i-Azam leave the Congress?
  55. The 14 points of the Quaid?
  56. Who is the Minister of Planning?
  57. What was the Sindh-Taas Agreement?
  58. What is the American-Iranian issue?
  59. What is the Invariance Property?
  60. What is COVID-19?
  61. What is the 18th Amendment of the Constitution of Pakistan?
  62. How can you define extreme poverty?

You can answer and ask more questions in the comment section related to the job interview: Recently Asked Questions.


MCQs Basic Statistics Questions 12

This quiz contains MCQs Basic Statistics Questions with answers covering variable and type of variable, Measures of central tendency such as mean, median, mode, Weighted mean, data and type of data, sources of data, Measure of Dispersion/ Variation, Standard Deviation, Variance, Range, etc. Let us start the MCQs Basic Statistics Questions Quiz.


The field of statistics deals with the measures of central tendency (such as mean, median, mode, weighted mean, geometric mean, and Harmonic mean) and measures of dispersions (such as range, standard deviation, and variances).
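
The measures named above can be tried out with Python's standard `statistics` module. The data set and the weights below are made up purely for illustration, and the weighted mean is computed by hand rather than through a library call.

```python
import statistics as st

data = [2, 4, 4, 5, 7, 8]        # hypothetical sample
weights = [1, 1, 2, 2, 3, 3]     # hypothetical weights

# Measures of central tendency
mean = st.mean(data)
median = st.median(data)
mode = st.mode(data)
geometric = st.geometric_mean(data)
harmonic = st.harmonic_mean(data)
weighted = sum(w * x for w, x in zip(weights, data)) / sum(weights)

# Measures of dispersion (sample variance and standard deviation)
data_range = max(data) - min(data)
variance = st.variance(data)
std_dev = st.stdev(data)

print("mean:", mean, "median:", median, "mode:", mode)
print("geometric mean:", round(geometric, 4), "harmonic mean:", round(harmonic, 4))
print("weighted mean:", weighted)
print("range:", data_range, "variance:", variance, "std dev:", round(std_dev, 4))
```

For positive data, the harmonic mean never exceeds the geometric mean, which in turn never exceeds the arithmetic mean; the printed values reflect that ordering.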

The basic statistical methods include planning and designing the study, collecting and arranging the data, and summarizing the collected data numerically and graphically.

Basic statistics are also used to perform statistical analysis to draw meaningful inferences.


A basic visual inspection of data using some graphical and numerical statistics may give some useful hidden information already available in the data. The graphical representation includes a bar chart, pie chart, dot chart, box plot, etc.

Companies related to finance, communication, and manufacturing, charity organizations, government institutes, and businesses from small to large all have a strong interest in collecting data and measuring different sorts of statistical findings. This helps them to learn from the past, notice trends, and plan for the future.

MCQs Basic Statistics Questions

  • Which of the following divides a group of data into four subgroups?
  • Which of the following is not a measure of central tendency?
  • Consider the following grouped data. What is the sample variance of these data?
    Frequency Distribution
  • Suppose data are normally distributed with a mean of 120 and a standard deviation of 30. Between what two values will approximately 68% of the data fall?
  • The interquartile range is which of the following?
  • According to Chebyshev’s theorem, if a population has a mean of 80 and a standard deviation of 35, at least what proportion of the values lie between 30 and 130?
  • If a distribution is abnormally tall and peaked, then it can be said that the distribution is:
  • According to the empirical rule, approximately what percent of the data should lie within $\mu \pm 2\sigma$?
  • Which of the following describes the middle part of a group of numbers?
  • The mean of a distribution is 23, the median is 24, and the mode is 25.5. It is most likely that this distribution is:
  • Which dispersion is used to compare the variation of two series?
  • Which of the following measures does not divide a set of observations into equal parts?
  • Which of the following is written at the top of the table?
  • If a Curve has a longer tail to the right, it is called
  • Which one of the following is the class frequency?
  • If the mean of the two observations is 10.5, then the median of these two observations will be:
  • Which one is the formula of mid-range?
  • Which one of the following is not included in measures of central tendency?
  • For the given data $2, 3, 7, 0$, and $-8$, the G.M. will be:
  • The height of a student is 60 inches. This is an example of ———-?
  • Which branch of statistics deals with the techniques that are used to organize, summarize, and present the data:
  • The sum of squares of deviations from the mean is:
  • Which of the following is not based on all the observations?
  • Data in the Population Census Report is:
  • Which one is not a measure of dispersion?
  • In a positively skewed curve, which relation holds?
