Bias in statistics is defined as the difference between the expected value of a statistic and the true value of the parameter it estimates. The bias is therefore a measure of the systematic error of an estimator: it indicates how far, on average, the estimator lies from the true value of the parameter. For example, if we average a large number of estimates obtained from an unbiased estimator, the result will be close to the true value.
In other words, bias is a systematic error in measurement or sampling; it tells how far off, on average, the model is from the truth.
Gauss, C.F. (1821) introduced the concept of an unbiased estimator in his work on the least-squares method.
The bias of an estimator of a parameter should not be confused with its degree of precision, because the degree of precision is a measure of the sampling (random) error. In the context of a study, bias is the favoring, intentional or unintentional, of one group or outcome over the other groups or outcomes available in the population under study. Random error can be reduced by increasing the sample size and averaging the outcomes. Bias is a more serious problem: because it is systematic, it does not in general shrink as more data are collected, and it has to be addressed through the study design or the choice of estimator.
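To illustrate the difference between random error and systematic error, here is a minimal simulation sketch. It assumes a hypothetical measuring instrument whose readings equal the true value plus a fixed offset plus Gaussian noise; the names `TRUE_VALUE`, `OFFSET`, `NOISE_SD`, `simulate_mean`, and `summarize` are illustrative choices, not part of any standard method.

```python
import random
import statistics

# Hypothetical instrument: every reading is TRUE_VALUE + OFFSET + noise.
TRUE_VALUE = 10.0   # the parameter we want to estimate
OFFSET = 0.5        # systematic error (the bias of each reading)
NOISE_SD = 2.0      # standard deviation of the random error


def simulate_mean(n, rng):
    """Return the mean of n simulated readings."""
    return statistics.fmean(
        TRUE_VALUE + OFFSET + rng.gauss(0.0, NOISE_SD) for _ in range(n)
    )


def summarize(n, repeats=500, seed=0):
    """Repeat the experiment and return (average error, spread) of the mean.

    The average error estimates the bias; the spread (standard deviation
    of the means across repeats) estimates the random error.
    """
    rng = random.Random(seed)
    means = [simulate_mean(n, rng) for _ in range(repeats)]
    avg_error = statistics.fmean(means) - TRUE_VALUE
    spread = statistics.stdev(means)
    return avg_error, spread


for n in (10, 100, 1000):
    avg_error, spread = summarize(n)
    print(f"n={n:5d}  average error={avg_error:+.3f}  spread={spread:.3f}")
```

Under these assumptions the spread should fall roughly in proportion to $1/\sqrt{n}$, while the average error should stay near the offset of 0.5 however large the sample becomes.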
There are several types of bias, which should not be considered mutually exclusive:
- Selection Bias (arises due to systematic differences between the groups compared)
- Exclusion Bias (arises due to the systematic exclusion of certain individuals from the study)
- Analytical Bias (arises due to the way that the results are evaluated)
Mathematically, bias can be defined as follows.
Let a statistic $T$ be used to estimate a parameter $\theta$. If $E(T)=\theta+bias(\theta)$, then $bias(\theta)$ is called the bias of the statistic $T$, where $E(T)$ represents the expected value of the statistic $T$. Equivalently, $bias(\theta)=E(T)-\theta$.
Note that if $bias(\theta)=0$, then $E(T)=\theta$, and $T$ is said to be an unbiased estimator of the true parameter $\theta$.
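As a concrete illustration of the definition, consider estimating the variance $\sigma^2$ of a population from a sample of size $n$. The estimator that divides the sum of squared deviations by $n$ has expected value $\frac{n-1}{n}\sigma^2$, so its bias is $-\sigma^2/n$, while the estimator that divides by $n-1$ is unbiased. The sketch below approximates $E(T)-\theta$ by Monte Carlo simulation; the function names, the normal population, and the values $n=5$ and $\sigma=2$ are assumptions chosen only for illustration.

```python
import random


def var_divisor_n(xs):
    """Variance estimator that divides by n (biased)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def var_divisor_n_minus_1(xs):
    """Variance estimator that divides by n - 1 (unbiased)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def estimate_bias(estimator, n=5, sigma=2.0, repeats=20000, seed=1):
    """Approximate bias = E(T) - theta by averaging T over many samples.

    Samples are drawn from a normal population with mean 0 and standard
    deviation sigma, so the true parameter is theta = sigma ** 2.
    """
    rng = random.Random(seed)
    true_var = sigma ** 2
    total = 0.0
    for _ in range(repeats):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        total += estimator(sample)
    return total / repeats - true_var


print("divisor n:    ", round(estimate_bias(var_divisor_n), 3))
print("divisor n - 1:", round(estimate_bias(var_divisor_n_minus_1), 3))
```

With $n=5$ and $\sigma^2=4$, the theoretical bias of the divisor-$n$ estimator is $-4/5=-0.8$, so the first printed value should be close to $-0.8$ and the second close to $0$, up to simulation noise.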
Gauss, C.F. (1821, 1823, 1826). Theoria Combinationis Observationum Erroribus Minimis Obnoxiae, Parts 1, 2 and suppl. Werke 4, 1-108.