Important MCQs on Chi-Square Test Quiz – 3

This post contains online MCQs on the Chi-Square Test with answers. The quiz covers attributes, the Chi-Square distribution, the coefficient of association, contingency tables, and hypothesis testing for association between attributes. Let us start with the MCQs on the Chi-Square Test Quiz.

The quiz is about the Chi-Square test of association between attributes.

1. If $(AB) < \frac{(A)(B)}{n}$ then association between two attributes $A$ and $B$ is

 
 
 
 

2. The value of $\chi^2$ cannot be ———.

 
 
 
 

3. The process of dividing the objects into two mutually exclusive classes is called

 
 
 
 

4. A contingency table with $r$ rows and $c$ columns is called

 
 
 
 

5. Two attributes $A$ and $B$ are said to be independent if

 
 
 
 

6. If $(AB) > \frac{(A)(B)}{n}$ then association is

 
 
 
 

7. There are ———– parameters of Chi-Square distribution.

 
 
 
 

8. The range of $\chi^2$ is

 
 
 
 

9. The coefficient of association $Q$ lies between

 
 
 
 

10. Two attributes $A$ and $B$ are said to be positively associated if

 
 
 
 

11. The presence of an attribute is denoted by

 
 
 
 

12. A characteristic which varies in quality from one individual to another is called

 
 
 
 

13. The eye colour of 100 men is

 
 
 
 

14. If $\chi^2_c=5.8$ and $df=1$, we make the following decision ———-.

 
 
 
 

15. Association measures the strength of the relationship between

 
 
 
 

16. A $4 \times 5$ contingency table consists of ———.

 
 
 
 

17. For $r\times c$ contingency table, the Chi-Square test has $df=$ ———-.

 
 
 
 

18. For the $3\times 3$ contingency table, the degrees of freedom is

 
 
 
 

19. If for a contingency table, $df=12$ and the number of rows is 4 then the number of columns will be

 
 
 
 

20. The parameter of the Chi-Square distribution is ———–.

 
 
 
 

The relationship (dependency) between attributes is called association, and the measure of the degree of relationship between attributes is called the coefficient of association. The Chi-Square statistic is used to test the association between attributes and is defined as

$$\chi^2 = \sum \frac{(of_i - ef_i)^2}{ef_i}\sim \chi^2_{v},$$

where $of_i$ and $ef_i$ denote the observed and expected frequencies, and $v$ denotes the degrees of freedom.
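As a quick illustration, the sketch below (in R, using made-up frequencies purely for demonstration) computes the Chi-Square statistic, its degrees of freedom, and the expected frequencies for a hypothetical $2\times 2$ contingency table of two attributes.

```r
# Hypothetical 2x2 contingency table for two attributes A and B
obs <- matrix(c(30, 20,
                10, 40),
              nrow = 2, byrow = TRUE,
              dimnames = list(A = c("present", "absent"),
                              B = c("present", "absent")))

# Chi-Square test of association (correct = FALSE gives the plain statistic)
test <- chisq.test(obs, correct = FALSE)
test$statistic   # sum of (observed - expected)^2 / expected
test$parameter   # degrees of freedom v = (r - 1)(c - 1)
test$expected    # expected frequencies under independence
```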

MCQs on Chi-Square Test quiz

A population can be divided into two or more mutually exclusive and exhaustive classes according to its characteristics. The division is called a dichotomy (or twofold division) if the population is divided into exactly two mutually exclusive classes. A contingency table is a two-way table in which the data are classified according to two attributes, each having two or more levels. A measure of the degree of association between the attributes of a contingency table is known as the coefficient of contingency. Pearson's mean square coefficient of contingency is

\[C=\sqrt{\frac{\chi^2}{n+\chi^2}}\]
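A minimal sketch of this formula in R, reusing the same kind of hypothetical $2\times 2$ table (the frequencies are assumed only for illustration):

```r
# Pearson's coefficient of contingency C = sqrt(chi^2 / (n + chi^2))
obs  <- matrix(c(30, 20, 10, 40), nrow = 2, byrow = TRUE)  # hypothetical frequencies
n    <- sum(obs)
chi2 <- unname(chisq.test(obs, correct = FALSE)$statistic)
C    <- sqrt(chi2 / (n + chi2))
C
```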

MCQs on Chi-Square Test Quiz with Answers

  • A characteristic which varies in quality from one individual to another is called
  • The eye colour of 100 men is
  • Association measures the strength of the relationship between
  • The presence of an attribute is denoted by
  • The process of dividing the objects into two mutually exclusive classes is called
  • There are ———– parameters of Chi-Square distribution.
  • The parameter of the Chi-Square distribution is ———–.
  • The value of $\chi^2$ cannot be ———.
  • The range of $\chi^2$ is
  • Two attributes $A$ and $B$ are said to be independent if
  • Two attributes $A$ and $B$ are said to be positively associated if
  • If $(AB) > \frac{(A)(B)}{n}$ then association is
  • If $(AB) < \frac{(A)(B)}{n}$ then association between two attributes $A$ and $B$ is
  • The coefficient of association $Q$ lies between
  • If $\chi^2_c=5.8$ and $df=1$, we make the following decision ———-.
  • A contingency table with $r$ rows and $c$ columns is called
  • A $4 \times 5$ contingency table consists of ———.
  • If for a contingency table, $df=12$ and the number of rows is 4 then the number of columns will be
  • For $r\times c$ contingency table, the Chi-Square test has $df=$ ———-.
  • For the $3\times 3$ contingency table, the degrees of freedom is

Attributes are said to be independent if there is no association between them; independence means the presence or absence of one attribute does not affect the other. The association is positive if the observed frequency of the attributes is greater than the expected frequency, and negative (disassociation) if the observed frequency is less than the expected frequency.
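For instance, with made-up class frequencies, the comparison of the observed frequency $(AB)$ with its expected value $\frac{(A)(B)}{n}$ can be checked as follows (a sketch in R; the numbers are assumptions for illustration only):

```r
# Hypothetical frequencies: (A) = 40, (B) = 50, (AB) = 25 among n = 100 objects
A  <- 40; B <- 50; AB <- 25; n <- 100
expected_AB <- A * B / n   # = 20, the frequency expected under independence
AB > expected_AB           # TRUE  -> A and B are positively associated
AB < expected_AB           # FALSE (TRUE would indicate negative association)
```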


Generalized Least Squares (GLS vs OLS) (2022)

The usual Ordinary Least Squares (OLS) method assigns equal weight (or importance) to each observation. Generalized least squares (GLS), in contrast, takes information about unequal error variances explicitly into account and is therefore capable of producing BLUE estimators. Both GLS and OLS are regression techniques used to fit a line to data points and to estimate the relationship between a dependent variable ($y$) and one or more independent variables ($X$).

Consider the following two-variable model,

\begin{align}
Y_i &= \beta_1 + \beta_2 X_i + u_i \nonumber\\
\text{or} \nonumber\\
Y_i &= \beta_1 X_{0i} + \beta_2 X_i + u_i, \tag*{(eq1)}
\end{align}

where $X_{0i}=1$ for each $i$.

Generalized Least Squares (GLS)

Assume that the heteroscedastic variance $\sigma_i^2$ is known:

\begin{align}
\frac{Y_i}{\sigma_i} &= \beta_1 \left(\frac{X_{0i}}{\sigma_i}\right)+\beta_2 \left(\frac{X_i}{\sigma_i}\right) +\left(\frac{u_i}{\sigma_i}\right) \nonumber\\
Y_i^* &= \beta_1^* X_{0i}^* + \beta_2^* X_i^* + u_i^*, \tag*{(eq2)}
\end{align}

where the starred variables (variables with stars on them) are the original variables divided by the known $\sigma_i$. The starred coefficients are the transformed model's parameters, distinguishing them from the OLS parameters $\beta_1$ and $\beta_2$.

\begin{align*}
Var(u_i^*) &=E(u_i^{*2})=E\left(\frac{u_i}{\sigma_i}\right)^2\\
&=\frac{1}{\sigma_i^2}E(u_i^2) \tag*{$\because E(u_i)=0$}\\
&=\frac{1}{\sigma_i^2}\sigma_i^2 = 1, \text{ which is a constant.} \tag*{$\because E(u_i^2)=\sigma_i^2$}
\end{align*}

The variance of the transformed error term $u_i^*$ is now homoscedastic. Applying OLS to the transformed model (eq2) will produce estimators that are BLUE; that is, $\hat{\beta}_1^*$ and $\hat{\beta}_2^*$ are now BLUE, while the OLS estimators $\hat{\beta}_1$ and $\hat{\beta}_2$ are not.

Generalized Least Squares (GLS) Method

The procedure of transforming the original variables in such a way that the transformed variables satisfy the assumptions of the classical model, and then applying OLS to them, is known as the Generalized Least Squares (GLS) method.

Generalized Least Squares (GLS) is, in effect, Ordinary Least Squares (OLS) applied to transformed variables that satisfy the standard least-squares assumptions. The estimators obtained are known as GLS estimators and are BLUE.

To obtain the Generalized Least Squares estimator, we minimize

\begin{align}
\sum \hat{u}_i^{*2} &= \sum \left(Y_i^* -\hat{\beta}_1^* X_{0i}^* - \hat{\beta}_2^* X_i^* \right)^2 \nonumber\\
\text{That is,} \nonumber\\
\sum \left(\frac{\hat{u}_i}{\sigma_i}\right)^2 &=\sum \left[\frac{Y_i}{\sigma_i} - \hat{\beta}_1^* \left(\frac{X_{0i}}{\sigma_i}\right) -\hat{\beta}_2^*\left(\frac{X_i}{\sigma_i}\right) \right]^2 \tag*{(eq3)}\\
\sum w_i \hat{u}_i^2 &=\sum w_i\left(Y_i-\hat{\beta}_1^* X_{0i} -\hat{\beta}_2^*X_i\right)^2 \tag*{(eq4)}
\end{align}

The GLS estimator of $\beta_2^*$ and its variance are

\begin{align*}
\hat{\beta}_2^* &= \frac{(\sum w_i)(\sum w_i X_iY_i)-(\sum w_i X_i)(\sum w_iY_i)}{(\sum w_i)(\sum w_iX_i^2)-(\sum w_iX_i)^2} \\
Var(\hat{\beta}_2^*) &=\frac{\sum w_i}{(\sum w_i)(\sum w_iX_i^2)-(\sum w_iX_i)^2},
\end{align*}

where $w_i=\frac{1}{\sigma_i^2}$.

Difference between GLS and OLS

In GLS, a weighted sum of squared residuals is minimized with $w_i=\frac{1}{\sigma_i^2}$ acting as the weights, whereas in OLS an unweighted (or equally weighted) residual sum of squares is minimized. From equation (eq3), in GLS the weight assigned to each observation is inversely proportional to its $\sigma_i$; that is, observations coming from a population with a larger $\sigma_i$ get a relatively smaller weight, and those from a population with a smaller $\sigma_i$ get a proportionately larger weight in minimizing the RSS (eq4).

Since equation (eq4) minimizes a weighted RSS, the method is known as weighted least squares (WLS), and the estimators obtained are known as WLS estimators.
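A minimal sketch of the idea in R, assuming the error standard deviations $\sigma_i$ are known (here they are simulated artificially): dividing the model through by $\sigma_i$ is equivalent to passing the weights $w_i = 1/\sigma_i^2$ to lm().

```r
set.seed(1)
# Simulate a two-variable model with heteroscedastic errors: sd(u_i) grows with X_i
n     <- 100
X     <- runif(n, 1, 10)
sigma <- 0.5 * X                         # assumed-known error standard deviations
Y     <- 2 + 3 * X + rnorm(n, sd = sigma)

ols <- lm(Y ~ X)                         # OLS: every observation gets equal weight
wls <- lm(Y ~ X, weights = 1 / sigma^2)  # GLS/WLS: weight w_i = 1 / sigma_i^2

summary(ols)$coefficients
summary(wls)$coefficients                # WLS typically gives smaller standard errors
```

Observations with a large $\sigma_i$ receive a small weight in the weighted fit, exactly as described above.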

GLS Method

The Generalized Least Squares method is a powerful tool for handling correlated and heteroscedastic errors. It is widely used in econometrics, finance, and other fields where regression analysis is applied to real-world data with complex error structures.

A summary of the key differences between the GLS and OLS methods is given below:

| Feature | GLS Method | OLS Method |
|---------|------------|------------|
| Assumptions | Can handle heteroscedasticity and autocorrelation | Requires homoscedasticity and independent error terms |
| Method | Minimizes a weighted sum of squared residuals | Minimizes the (unweighted) sum of squared residuals |
| Benefits | More efficient estimates (if assumptions are met) | Simpler to implement |
| Drawbacks | More complex; requires estimation of the error covariance matrix | Can be inefficient when assumptions are violated |

Remember, diagnosing issues (violations of assumptions) such as heteroscedasticity and autocorrelation is often performed after an initial OLS fit. This can help decide whether GLS or other robust regression techniques are necessary. Therefore, the choice between OLS and GLS depends on the data characteristics and the sample size.

Read about the Residual Plot


White Test of Heteroscedasticity Detection (2022)

The post is about the White test of heteroscedasticity.

One important assumption of regression is that the variance of the error term is constant across observations. If the errors have a constant variance, they are called homoscedastic; otherwise, they are heteroscedastic. In the case of heteroscedastic errors (non-constant variance), the standard estimation methods become inefficient. Typically, residuals are plotted to assess the assumption of homoscedasticity.
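For example, a quick residuals-versus-fitted plot in R (using simulated data, purely for illustration) often reveals a funnel shape when the error variance is not constant:

```r
set.seed(123)
x   <- runif(200, 1, 10)
y   <- 1 + 2 * x + rnorm(200, sd = x)  # error spread increases with x
fit <- lm(y ~ x)

plot(fitted(fit), resid(fit),
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs Fitted")     # a widening spread suggests heteroscedasticity
abline(h = 0, lty = 2)
```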

White test of Heteroscedasticity

Halbert White (1980) proposed a test that is very similar to the Breusch-Pagan test. The White test of heteroscedasticity is general because it does not rely on the normality assumption, and it is also easy to implement. Because of its generality, the White test may also identify specification bias. Both the White test of heteroscedasticity and the Breusch-Pagan test are based on the residuals of the fitted model.

To test the assumption of homoscedasticity, one can use auxiliary regression analysis by regressing the squared residuals from the original model on the set of original regressors, the cross-products of the regressors, and the squared regressors.

The step-by-step procedure for performing the White test of Heteroscedasticity is as follows:

Consider the following Linear Regression Model (assume there are two independent variables)
\[Y_i=\beta_0+\beta_1X_{1i}+\beta_2X_{2i}+e_i \tag{1} \]

For the given data, estimate the regression model, and obtain the residuals $e_i$’s.

Note that the regression of residuals can take linear or non-linear functional forms.

  1. Now regress the squared residuals from the original regression on the original set of independent variables, the squares of the independent variables, and the cross-product(s) of the independent variable(s), such as
    \[e_i^2=\beta_0+\beta_1X_1+\beta_2X_2+\beta_3X_1^2+\beta_4X_2^2+\beta_5X_1X_2+v_i \tag{2}\]
  2. Find the $R^2$ statistic from the auxiliary regression in step 1.
    You can also use higher-power regressors, such as cubes. Also, note that there will be a constant term in equation (2) even though the original regression model (1) may or may not have a constant term.
  3. Test the statistical significance of \[n \times R^2\sim\chi^2_{df},\tag{3}\] under the null hypothesis of homoscedasticity (no heteroscedasticity), where $df$ is the number of regressors in equation (2).
  4. If the calculated chi-square value obtained in (3) is greater than the critical chi-square value at the chosen level of significance, reject the null hypothesis of homoscedasticity in favor of heteroscedasticity (a worked sketch in R follows this list).
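The steps above can be carried out by hand in base R. The sketch below uses simulated data (the variable names and values are assumptions for illustration only), fits the auxiliary regression of equation (2), and computes the $n \times R^2$ statistic of equation (3).

```r
set.seed(42)
# Simulated data with heteroscedastic errors (for illustration only)
n  <- 200
X1 <- runif(n, 1, 10)
X2 <- runif(n, 1, 10)
Y  <- 1 + 2 * X1 - 1.5 * X2 + rnorm(n, sd = X1)

# Original regression and its residuals
fit <- lm(Y ~ X1 + X2)
e2  <- resid(fit)^2

# Step 1: auxiliary regression of squared residuals on the regressors,
#         their squares, and their cross-product (equation 2)
aux <- lm(e2 ~ X1 + X2 + I(X1^2) + I(X2^2) + I(X1 * X2))

# Steps 2-3: test statistic n * R^2 ~ chi-square with df = number of regressors
R2   <- summary(aux)$r.squared
stat <- n * R2
df   <- length(coef(aux)) - 1   # 5 regressors in the auxiliary model
pval <- pchisq(stat, df = df, lower.tail = FALSE)
c(statistic = stat, df = df, p.value = pval)
# Step 4: a small p-value -> reject homoscedasticity (or suspect specification error)
```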
Heteroscedasticity Patterns: White Test of Heteroscedasticity

For a model with several independent variables (regressors), introducing all the regressors, their squared or higher-order terms, and their cross-products consumes many degrees of freedom.

In cases where the White test statistic is statistically significant, heteroscedasticity may not necessarily be the cause; specification errors may be. In other words, the White test can be a test of heteroscedasticity, specification error, or both. If no cross-product terms are introduced in the White test procedure, it is a test of pure heteroscedasticity. If cross-product terms are introduced in the model, it is a test of both heteroscedasticity and specification bias.

White Test of Heteroscedasticity Detection

By employing the White test of heteroscedasticity, one can gain valuable insight into the presence of heteroscedasticity and decide on appropriate corrective measures (such as Weighted Least Squares (WLS)), if necessary, to ensure reliable standard errors and hypothesis tests in the regression analysis.

Summary

The White test of heteroscedasticity is a flexible approach that can be used to detect various patterns of heteroscedasticity. This test indicates the presence of heteroscedasticity but it does not pinpoint the specific cause (like model misspecification). The White test is relatively easy to implement in statistical software.

References

  • White, H. (1980). "A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity". Econometrica, Vol. 48, pp. 817-838.
  • https://en.wikipedia.org/wiki/White_test

Click Links to learn more about Tests of Heteroscedasticity: Regression Residuals Plot, Breusch-Pagan Test, Goldfeld-Quandt Test

See the Numerical Example of the White Test of Heteroscedasticity

Visit: https://gmstat.com