## Multicollinearity

For a classical linear regression model with multiple regressors (explanatory variables), there should be no exact linear relationship between the explanatory variables. The collinearity or multicollinearity term is used if there is/are one or more linear relationship exists among the variables.

The term multicollinearity is considered as the violation of the assumption of “no exact linear relationship between the regressors.

Ragnar Frisch introduced this term, originally it means the existence of a “perfect” or “exact” linear relationship among some or all regressors of a regression model.

Consider a $k$-variable regression model involving explanatory variables $X_1, X_2, \cdots, X_k$. An exact linear relationship is said to exist if the following condition is satisfied.

\[\lambda_1 X_1 + \lambda_2 X_2 + \cdots + \lambda_k X_k=0,\]

where $\lambda_1, \lambda_2, \cdots, \lambda_k$ are constant and all of them all are non-zero, simultaneously, and $X_1=1$ for all observations for intercept term.

Now a day, multicollinearity term is not only being used for the case of perfect multicollinearity but also in case of not perfect collinearity (the case where the $X$ variables are intercorrelated but not perfectly). Therefore,

\[\lambda_1X_1 + \lambda_2X_2 + \cdots \lambda_kX_k + \upsilon_i,\]

where $\upsilon_i$ is a stochastic error term.

In case of a perfect linear relationship (correlation coefficient will be one in this case) among explanatory variables, the parameters become indeterminate (it is impossible to obtain values for each parameter separately) and the method of least square breaks down. However, if regressors are not intercorrelated at all, the variables are called orthogonal and there is no problem concerning the estimation of coefficients.

Note that

- Multicollinearity is not a condition that either exists or does not exist, but rather a phenomenon inherent in most relationships.
- Multicollinearity refers to the only a linear relationship among the $X$ variables. It does not rule out the non-linear relationships among them.

See use of mctest R package for diagnosing collinearity