## Regression Model Assumptions

#### Linear Regression Model Assumptions

The linear regression model (LRM) rests on a set of statistical assumptions. Some concern the distribution of the random error term $\mu_i$, some concern the relationship between the error term $\mu_i$ and the explanatory (independent) variables, the *X*'s, and some concern the explanatory variables themselves. The linear regression model assumptions can be classified into two categories:

- Stochastic Assumptions
- Non-Stochastic Assumptions

These linear regression model assumptions (or assumptions of the ordinary least squares method, OLS) are critical to interpreting the regression coefficients.
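To see why the assumptions matter for interpretation, here is a minimal simulation sketch. It assumes a simple model $y_i=\beta_1+\beta_2x_i+\mu_i$ with hypothetical true values $\beta_1=1$ and $\beta_2=2$, and compares the OLS slope when the error is unrelated to $x$ with the slope when the error is deliberately built to move with $x$. The helper `ols_slope`, the coefficient `0.8`, and the sample size are illustrative choices, not part of the original discussion.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
n = 5000
beta1, beta2 = 1.0, 2.0  # hypothetical true intercept and slope


def ols_slope(x, y):
    """Return the OLS slope from regressing y on x with an intercept."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


x = rng.normal(size=n)

# Case 1: the error is drawn independently of x, so Cov(mu, x) = 0.
mu_ok = rng.normal(size=n)
slope_ok = ols_slope(x, beta1 + beta2 * x + mu_ok)

# Case 2: the error is constructed to vary with x, so Cov(mu, x) != 0.
mu_bad = 0.8 * x + rng.normal(size=n)
slope_bad = ols_slope(x, beta1 + beta2 * x + mu_bad)

print(f"slope, error uncorrelated with x: {slope_ok:.3f}")
print(f"slope, error correlated with x:   {slope_bad:.3f}")
```

In the second case the estimated slope absorbs the part of the error that moves with $x$, so it settles near $\beta_2+0.8$ rather than $\beta_2$, and it can no longer be read as the effect of $x$ alone.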

- The error term ($\mu_i$) is a random real variable, i.e. $\mu_i$ may take any positive, negative, or zero value by chance. Each value has a certain probability; therefore, the error term is a random variable.
- The mean value of $\mu$ is zero, i.e. $E(\mu_i)=0$; more precisely, the mean of $\mu_i$ conditional on the given $X_i$ is zero. It means that for each value of the variable $X_i$, $\mu$ averages out to zero.
- The variance of $\mu_i$ is constant, i.e. for the given value of *X*, the variance of $\mu_i$ is the same for all observations: $E(\mu_i^2)=\sigma_{\mu}^2$. The values of the disturbance term ($\mu_i$) at all values of *X* show the same dispersion about their mean.
- The variable $\mu_i$ has a normal distribution, i.e. $\mu_i\sim N(0,\sigma_{\mu}^2)$. The value of $\mu$ (for each $X_i$) has a bell-shaped symmetrical distribution.
- The random terms of different observations ($\mu_i,\mu_j$) are independent, i.e. $E(\mu_i\mu_j)=0$ for $i\neq j$; there is no autocorrelation between the disturbances. It means that the random term in one period does not depend on its values in any other period.
- $\mu_i$ and $X_i$ have zero covariance between them, i.e. $\mu$ is uncorrelated with the explanatory variable: $E(\mu_i X_i)=0$, i.e. $Cov(\mu_i, X_i)=0$. The $\mu$'s and $X$'s do not tend to vary together. This assumption is automatically fulfilled if the *X* variable is nonrandom (non-stochastic) and the mean of the random term is zero.
- All the explanatory variables are measured without error. It means that we assume the regressors are error-free, while *y* (the dependent variable) may or may not include errors of measurement.
- The number of observations *n* must be greater than the number of parameters to be estimated or the number of observations must be greater than the number of explanatory (independent) variables.
- There should be variability in the *X* values. That is, the *X* values in a given sample must not all be the same. Statistically, $Var(X)$ must be a finite positive number.
- The regression model must be correctly specified, meaning there is no specification bias or error in the model used in empirical analysis.
- There is no perfect or near-perfect multicollinearity (collinearity) among two or more explanatory (independent) variables.
- Values taken by the regressors *X* are considered to be fixed in repeated sampling, i.e. *X* is assumed to be non-stochastic. Regression analysis is conditional on the given values of the regressor(s) *X*.
- The linear regression model is linear in the parameters, e.g. $y_i=\beta_1+\beta_2x_i +\mu_i$
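Several of the assumptions listed above can be examined numerically. The following is a rough sketch on simulated data, assuming the simple model $y_i=\beta_1+\beta_2x_i+\mu_i$ with made-up values ($\beta_1=3$, $\beta_2=0.5$, normal errors); the variable names and the informal diagnostics (a split-sample variance ratio and a lag-1 residual correlation) are illustrative choices rather than formal tests.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed so the sketch is repeatable
n = 200

# Simulated data that satisfies the assumptions by construction (hypothetical values).
x = rng.uniform(0, 10, size=n)
mu = rng.normal(0, 1.5, size=n)  # zero mean, constant variance, normal, independent
y = 3.0 + 0.5 * x + mu           # linear in the parameters

X = np.column_stack([np.ones(n), x])  # design matrix: intercept column plus x
k = X.shape[1]                        # number of parameters to estimate

# Conditions on the data that can be checked before fitting.
enough_obs = n > k                           # more observations than parameters
x_varies = np.var(x) > 0                     # X values are not all the same
full_rank = np.linalg.matrix_rank(X) == k    # no perfect multicollinearity

# OLS fit and residuals (the residuals stand in for the unobserved errors).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

# With an intercept, OLS forces these two quantities to zero by construction,
# so they confirm the fit but do not test E(mu) = 0 or Cov(mu, X) = 0.
resid_mean = resid.mean()
resid_x_cov = np.cov(resid, x)[0, 1]

# Informal constant-variance check: residual variance for low vs. high x.
order = np.argsort(x)
var_low = resid[order[: n // 2]].var(ddof=1)
var_high = resid[order[n // 2:]].var(ddof=1)
var_ratio = var_high / var_low  # values near 1 suggest similar dispersion

# Informal autocorrelation check: lag-1 correlation of the residuals.
# This is only meaningful when the observations have a natural order, e.g. time.
lag1_corr = np.corrcoef(resid[:-1], resid[1:])[0, 1]

print("n > k:", enough_obs, "| Var(X) > 0:", x_varies, "| full rank:", full_rank)
print("residual mean:", resid_mean, "| cov(resid, x):", resid_x_cov)
print("variance ratio (high/low x):", var_ratio, "| lag-1 correlation:", lag1_corr)
```

Assumptions such as correct specification, error-free regressors, and fixed *X* in repeated sampling concern how the data were generated, so they cannot be verified from a single sample in this way.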
