#### Linear Regression Model Assumptions

The linear regression model (LRM) is based on certain statistical assumptions. Some concern the distribution of the random error term $u_i$, some concern the relationship between the error term $u_i$ and the explanatory (independent) variables, the $X$'s, and some concern the independent variables themselves. The linear regression model assumptions can be classified into two categories:

- Stochastic Assumptions
- Non-Stochastic Assumptions

These linear regression model assumptions (or assumptions about the ordinary least squares (OLS) method) are extremely critical to interpreting the regression coefficients.

- The error term ($u_i$) is a random real number, i.e., $u_i$ may assume any positive, negative, or zero value by chance. Each value has a certain probability; therefore, the error term is a random variable.
- The mean value of $u_i$ is zero, i.e. $E(u_i|X_i)=0$: the mean of $u_i$ conditional on the given $X_i$ is zero. It means that for each value of $X_i$, the positive and negative values of $u$ cancel out on average.
- The variance of $u_i$ is constant (homoscedasticity), i.e. for the given values of $X$, the variance of $u_i$ is the same for all observations: $E(u_i^2)=\sigma^2$. The disturbance terms $u_i$ show the same dispersion about their mean at all values of $X$.
- The variable $u_i$ has a normal distribution, i.e. $u_i \sim N(0, \sigma_u^2)$. The values of $u$ (for each $X_i$) have a bell-shaped symmetrical distribution.
- The random terms of different observations ($u_i, u_j$) are independent, i.e. $E(u_i u_j)=0$ for $i \ne j$: there is no autocorrelation between the disturbances. The random term assumed in one period does not depend on its values in any other period.
- $u_i$ and $X_i$ have zero covariance, i.e. $u$ is independent of the explanatory variable: $E(u_i X_i)=0$, that is, $Cov(u_i, X_i)=0$. The disturbance term $u$ and the explanatory variable $X$ are uncorrelated; the $u$'s and $X$'s do not tend to vary together. This assumption is automatically fulfilled if the $X$ variable is non-random (non-stochastic) or if the mean of the random term is zero.
- All the explanatory variables are measured without error. The regressors are assumed to be error-free, while $y$ (the dependent variable) may or may not include measurement error.
- The number of observations $n$ must be greater than the number of parameters to be estimated or the number of observations must be greater than the number of explanatory (independent) variables.
- There should be variability in the $X$ values; the $X$ values in a given sample must not all be the same. Statistically, $Var(X)$ must be a finite positive number.
- The regression model must be correctly specified, meaning there is no specification bias or error in the model used in empirical analysis.
- There is no perfect or near-perfect multicollinearity (collinearity) among two or more explanatory (independent) variables.
- Values taken by the regressors $X$ are considered to be fixed in repeated sampling, i.e. $X$ is assumed to be non-stochastic. Regression analysis is conditional on the given values of the regressor(s) $X$.
- The linear regression model is linear in the parameters, e.g. $y_i=\beta_1+\beta_2x_i +u_i$
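Several of the assumptions above can be illustrated numerically. The following sketch (using `numpy`; the seed, sample size, and parameter values are illustrative choices, not from the text) simulates data that satisfies the assumptions, fits $y_i=\beta_1+\beta_2x_i+u_i$ by OLS, and checks that the residuals have zero mean and zero covariance with $X$:

```python
import numpy as np

rng = np.random.default_rng(42)  # illustrative seed for reproducibility

# Simulate data satisfying the assumptions:
n = 200
x = np.linspace(0.0, 10.0, n)          # fixed regressor with variability (Var(X) > 0)
u = rng.normal(0.0, 2.0, size=n)       # u_i ~ N(0, sigma^2), independent across observations
beta1, beta2 = 3.0, 1.5                # illustrative true parameters
y = beta1 + beta2 * x + u              # model linear in the parameters

# OLS estimates via least squares on the design matrix [1, x]
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residuals stand in for the unobservable u_i
resid = y - X @ beta_hat

print(beta_hat)                  # estimates close to the true (3.0, 1.5)
print(resid.mean())              # ~ 0: zero-mean property, E(u_i) = 0
print(np.cov(resid, x)[0, 1])    # ~ 0: Cov(u_i, X_i) = 0
```

Note that the zero mean and zero covariance of the *residuals* hold exactly by the OLS algebra; the assumptions themselves concern the unobservable disturbances $u_i$.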
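The no-perfect-multicollinearity assumption can also be checked numerically: with perfect collinearity the design matrix loses full column rank and $X'X$ becomes singular, so the OLS estimates are not uniquely defined. A minimal sketch (the data values are made up for illustration):

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = 2.0 * x1                                    # perfectly collinear with x1
x3 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])        # not a linear combination of 1 and x1

X_bad = np.column_stack([np.ones(5), x1, x2])    # rank-deficient design matrix
X_ok = np.column_stack([np.ones(5), x1, x3])     # full column rank

print(np.linalg.matrix_rank(X_bad))  # 2 < 3 columns: perfect collinearity
print(np.linalg.matrix_rank(X_ok))   # 3: OLS estimates exist and are unique
```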

