# Correlation and Regression Analysis

## Quiz Correlation & Regression

This post contains a quiz on correlation and regression analysis, covering multiple regression analysis, the coefficient of determination (explained variation), unexplained variation, model selection criteria, model assumptions, interpretation of results, intercept and slope, partial correlation, significance tests, OLS assumptions, multicollinearity, heteroscedasticity, autocorrelation, graphical representation of the relationship between variables, and related topics. Let us start with the quiz on correlation and regression analysis.

Correlation analysis is a statistical technique used to measure the strength and direction of the mutual relationship between two quantitative variables. The value of the correlation coefficient lies between $-1$ and $+1$. Regression analysis describes how an explanatory variable is numerically related to the dependent variable.

The formula to compute the correlation coefficient is:

$$r = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{[n\sum X_i^2 - (\sum X_i)^2][n\sum Y_i^2 - (\sum Y_i)^2]}}$$

The general regression equation is $Y_i = a + bX_i$. The slope coefficient and intercept of the regression model can be computed as

\begin{align*}
b &= \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - (\sum X_i)^2}\\
a &= \overline{Y} - b\overline{X}
\end{align*}
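The two formulas above can be translated directly into code. The sketch below computes $r$, $b$, and $a$ from the raw summations for a small made-up data set (the $X$ and $Y$ values are illustrative only, not from this post):

```python
# Sketch: correlation coefficient r, slope b, and intercept a
# computed from the summation formulas above (illustrative data).
import math

X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

sum_x, sum_y = sum(X), sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)
sum_y2 = sum(y * y for y in Y)

# Correlation coefficient r
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2)
)

# Regression slope b and intercept a (a = Ybar - b * Xbar)
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
a = sum_y / n - b * sum_x / n

print(round(r, 4), round(b, 2), round(a, 2))  # 0.7746 0.6 2.2
```

Note that the slope $b$ and the correlation $r$ share the same numerator, which is why both tools agree on the direction of the linear relationship.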

Both tools describe the linear relationship between two quantitative variables. The relationship can be inspected graphically (for example, with a scatter plot), and its strength can be quantified numerically using the computational formulas above.

Note that neither regression nor correlation analysis establishes a cause-and-effect relationship. Both correlation and regression indicate how, or to what extent, the variables under study are associated (or mutually related) with each other. The correlation coefficient measures only the degree (strength) and direction of the linear association between the two variables. Any conclusion about a cause-and-effect relationship must rest on the judgment of the analyst.

## The Spearman Rank Correlation Test (Numerical Example)

Consider the following data to illustrate the detection of heteroscedasticity using the Spearman rank correlation test.

The estimated multiple linear regression model is:

$$Y_i = -34.936 - 0.75X_{2i} + 7.611X_{3i}$$

The residuals $u_i$ for each observation, together with the data, are given in a table (not reproduced here).

We need to find the ranks of the absolute values of the residuals, $|u_i|$, and of the variable suspected of causing heteroscedasticity, $X_2$.

### Calculating the Spearman Rank correlation

\begin{align}
r_s&=1-\frac{6\sum d^2}{n(n^2-1)}\\
&=1-\frac{6\times 70.5}{10(100-1)}=0.5727
\end{align}
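The calculation above can be sketched in a few lines, plugging in the example's values ($\sum d^2 = 70.5$, $n = 10$):

```python
# Sketch: Spearman's rank correlation from the formula
# r_s = 1 - 6*sum(d^2) / (n*(n^2 - 1)),
# using the sum of squared rank differences from the example.
n = 10
sum_d2 = 70.5
r_s = 1 - (6 * sum_d2) / (n * (n**2 - 1))
print(round(r_s, 4))  # 0.5727
```

In practice the squared rank differences $d^2$ would first be computed by ranking $|u_i|$ and $X_2$ separately and differencing the two rank columns.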

Let us test the statistical significance of $r_s$ using a $t$-test:

\begin{align}
t&=\frac{r_s \sqrt{n-2}}{\sqrt{1-r_s^2}}\\
&=\frac{0.573\sqrt{8}}{\sqrt{1-(0.573)^2}}=1.977
\end{align}

The critical value of $t$ from the table at the 5% level of significance (two-tailed) with 8 degrees of freedom is 2.306.

Since $t_{cal} \ngtr t_{tab}$ ($1.977 < 2.306$), there is no evidence of a systematic relationship between the explanatory variable $X_2$ and the absolute values of the residuals ($|u_i|$), and hence no evidence of heteroscedasticity.

Since there is more than one regressor (the example is from a multiple regression model), the Spearman rank correlation test should be repeated for each of the explanatory variables.

As an assignment, perform the Spearman rank correlation between $|u_i|$ and $X_3$ for the data above, and test the statistical significance of the coefficient in the same manner to explore evidence of heteroscedasticity.

## MCQs Econometrics-1

This post is about MCQs on econometrics, covering regression analysis, correlation, dummy variables, multicollinearity, heteroscedasticity, autocorrelation, and many other topics. Let us start with the MCQs econometrics test.

MCQs about Multicollinearity, Dummy Variable, Selection of Variables, Error in Variables, Autocorrelation, Time Series, Heteroscedasticity, Simultaneous Equations, and Regression analysis

1. Autocorrelation may occur due to

2. Which of the following statements is true about autocorrelation?

3. Choose a true statement about the Durbin-Watson test

4. If the Durbin-Watson statistic takes a value close to zero, what will be the value of the first-order autocorrelation coefficient?

5. Which of the following actions does not make sense as a remedy for multicollinearity?

6. Which one is not a rule of thumb?

7. Heteroscedasticity can be detected by plotting the estimated $\hat{u}_i^2$ against

8. In a regression model with three explanatory variables, there will be _______ auxiliary regressions

9. Heteroscedasticity is more common in

10. For the presence and absence of first-order autocorrelation valid tests are

11. Negative autocorrelation can be indicated by which of the following?

12. The value of the Durbin-Watson $d$ statistic lies between

13. The AR(1) process is stationary if

14. Which assumption is not related to errors in explanatory variables?

15. When measurement errors are present in the explanatory variable(s), they make

The application of statistical methods to economic data to find empirical relationships between economic variables is called econometrics. In other words, econometrics is "the quantitative analysis of actual economic phenomena based on the concurrent development of theory and observation, related by appropriate methods of inference".
