Introduction (Properties of a Good Estimator)
This post discusses the properties of a good estimator. In statistics, an estimator is a function of sample data used to estimate an unknown population parameter. At a minimum, a good estimator is unbiased and efficient. More fully, an estimator is considered good if it satisfies the following properties:
- Unbiasedness
- Consistency
- Efficiency
- Sufficiency
- Invariance
Let us discuss these properties of a good estimator one by one.
Unbiasedness
An estimator is said to be unbiased if its expected value (that is, the mean of its sampling distribution) is equal to the true value of the population parameter. Let $\hat{\theta}$ be an estimator of the population parameter $\theta$. If $E(\hat{\theta}) = \theta$, then $\hat{\theta}$ is an unbiased estimator of $\theta$. If $E(\hat{\theta})\ne \theta$, then $\hat{\theta}$ is a biased estimator of $\theta$.
- If $E(\hat{\theta}) > \theta$, then $\hat{\theta}$ will be positively biased.
- If $E(\hat{\theta}) < \theta$, then $\hat{\theta}$ will be negatively biased.
Some examples of biased or unbiased estimators are:
- $\overline{X}$ is an unbiased estimator of $\mu$, that is, $E(\overline{X}) = \mu$
- The sample median $\widetilde{X}$ is also an unbiased estimator of $\mu$ when the population is normally distributed, that is, $E(\widetilde{X}) =\mu$
- The sample variance computed with divisor $n$, $S^2 = \frac{\Sigma (X_i-\overline{X})^2}{n}$, is a biased estimator of $\sigma^2$, since $E(S^2) = \frac{n-1}{n}\sigma^2 \ne \sigma^2$
- $\hat{p} = \frac{x}{n}$ is an unbiased estimator of $p$, that is, $E(\hat{p})=p$
Unbiasedness means that if the sampling process is repeated many times and the estimate is computed for each sample, the average of these estimates will be very close to the true population parameter.
An unbiased estimator does not systematically overestimate or underestimate the true parameter.
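The repeated-sampling idea can be checked with a small simulation. The following is a minimal sketch, not part of the original post: the population (normal with $\mu = 10$, $\sigma = 2$), the sample size, the number of repetitions, and the function name `simulate_estimators` are all illustrative assumptions.

```python
import random

def simulate_estimators(mu=10.0, sigma=2.0, n=5, reps=20000, seed=1):
    """Average three estimators over many samples from a normal population.

    All parameter values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    total_mean = total_var_n = total_var_n1 = 0.0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        ss = sum((x - xbar) ** 2 for x in sample)  # sum of squared deviations
        total_mean += xbar
        total_var_n += ss / n          # sample variance with divisor n
        total_var_n1 += ss / (n - 1)   # sample variance with divisor n - 1
    return {
        "mean": total_mean / reps,
        "var_divisor_n": total_var_n / reps,
        "var_divisor_n_minus_1": total_var_n1 / reps,
    }

result = simulate_estimators()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, the average of the sample means should settle near $\mu = 10$, and the divisor-$(n-1)$ variance near $\sigma^2 = 4$. The divisor-$n$ variance should settle near $\frac{n-1}{n}\sigma^2 = 3.2$, which illustrates a negative bias.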
Consistency
An estimator is said to be consistent if the statistic used as the estimator approaches the true population parameter value as the sample size increases.
Equivalently, an estimator $\hat{\theta}$ is called a consistent estimator of $\theta$ if the probability that $\hat{\theta}$ lies arbitrarily close to $\theta$ approaches unity as the sample size increases.
Symbolically, $\hat{\theta}$ is a consistent estimator of the parameter $\theta$ if, for any arbitrarily small positive quantity $\varepsilon$,
\begin{align*}
\lim\limits_{n\rightarrow \infty} P\left[|\hat{\theta}-\theta|\le \varepsilon\right] &= 1\\
\lim\limits_{n\rightarrow \infty} P\left[|\hat{\theta}-\theta|> \varepsilon\right] &= 0
\end{align*}
A consistent estimator may or may not be unbiased. For example, the sample variance computed with divisor $n$ is biased but consistent. The sample mean $\overline{X}=\frac{\Sigma X_i}{n}$ and sample proportion $\hat{p} = \frac{x}{n}$ are unbiased estimators of $\mu$ and $p$, respectively, and are also consistent.
It means that as one collects more and more data, the estimator becomes more and more accurate in approximating the true population value.
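The limit statement for consistency can be illustrated numerically. This is a rough sketch under assumed values (standard normal population, $\varepsilon = 0.2$, 1000 repetitions per sample size). The helper `miss_probability` is a hypothetical name introduced here, and it estimates $P[|\overline{X}-\mu| > \varepsilon]$ by simulation.

```python
import random

def miss_probability(n, mu=0.0, sigma=1.0, eps=0.2, reps=1000, seed=7):
    """Estimate P(|X-bar - mu| > eps) for samples of size n by simulation."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if abs(xbar - mu) > eps:
            misses += 1
    return misses / reps

sizes = [10, 100, 1000]
probs = [miss_probability(n) for n in sizes]
for n, p in zip(sizes, probs):
    print(f"n = {n:5d}  estimated P(|X-bar - mu| > 0.2) = {p:.3f}")
```

The estimated probability should shrink toward zero as $n$ grows, which is what the second limit in the definition requires.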
Efficiency
An unbiased estimator is said to be efficient if the variance of its sampling distribution is smaller than that of the sampling distribution of any other unbiased estimator of the same parameter. Suppose there are two unbiased estimators $T_1$ and $T_2$ of the same parameter $\theta$; then $T_1$ is said to be more efficient than $T_2$ if $Var(T_1) < Var(T_2)$. The relative efficiency of $T_1$ compared to $T_2$ is given by the ratio
$$E = \frac{Var(T_2)}{Var(T_1)} > 1$$
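Note that when the estimators being compared are biased, the mean squared error (MSE), $MSE(T) = Var(T) + [Bias(T)]^2$, is used for the comparison instead of the variance.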
A more efficient estimator has a smaller sampling error, meaning it is less likely to deviate significantly from the true population parameter.
An efficient estimator is less likely to produce extreme values, making it more reliable.
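To make the relative efficiency ratio concrete, the sketch below compares the sample mean ($T_1$) with the sample median ($T_2$) as estimators of the center of a normal population. The population, sample size, repetition count, and the function name `sampling_variances` are assumptions chosen for illustration only.

```python
import random
import statistics

def sampling_variances(n=25, reps=10000, seed=3):
    """Simulated variances of the sample mean and sample median (normal data)."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(sum(sample) / n)
        medians.append(statistics.median(sample))
    return statistics.pvariance(means), statistics.pvariance(medians)

var_mean, var_median = sampling_variances()
relative_efficiency = var_median / var_mean  # Var(T_2) / Var(T_1)
print(f"Var(mean)   = {var_mean:.4f}")
print(f"Var(median) = {var_median:.4f}")
print(f"relative efficiency of mean vs median = {relative_efficiency:.2f}")
```

For normal data the ratio should come out above 1, so the mean is the more efficient of the two. For large samples this ratio is known to approach $\pi/2 \approx 1.57$, and a simulation with $n = 25$ will only approximate that value.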
Sufficiency
An estimator is said to be sufficient if the statistic used as an estimator utilizes all the information about the parameter that is contained in the sample. Any statistic that is not computed from all values in the sample is not a sufficient estimator. The sample mean $\overline{X}=\frac{\Sigma X_i}{n}$ and sample proportion $\hat{p} = \frac{x}{n}$ are sufficient estimators of the population mean $\mu$ and population proportion $p$, respectively, but the median is not a sufficient estimator because it does not use all the information contained in the sample.
Because a sufficient estimator retains everything the sample says about the parameter, no other statistic computed from the same sample can add further information about that parameter.
A sufficient estimator captures all the useful information from the data without any loss.
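One way to see what "all the information" means for the sample proportion is to compare two samples of 0s and 1s that share the same number of successes $x$. The post does not spell out this argument; the sketch below is an illustration with made-up data, and `bernoulli_likelihood` is a hypothetical helper.

```python
import math

def bernoulli_likelihood(sample, p):
    """Likelihood of a 0/1 sample under success probability p."""
    value = 1.0
    for x in sample:
        value *= p if x == 1 else (1.0 - p)
    return value

# Two made-up samples: different order, same number of successes (x = 3, n = 8).
sample_a = [1, 1, 0, 0, 1, 0, 0, 0]
sample_b = [0, 0, 0, 1, 0, 1, 0, 1]

grid = [0.1, 0.3, 0.5, 0.7, 0.9]  # candidate values of p
same_likelihood = all(
    math.isclose(bernoulli_likelihood(sample_a, p), bernoulli_likelihood(sample_b, p))
    for p in grid
)
print("same number of successes:", sum(sample_a) == sum(sample_b))
print("same likelihood for every p on the grid:", same_likelihood)
```

Both samples give the same likelihood at every candidate $p$, so once $x$ (and hence $\hat{p} = \frac{x}{n}$) is known, the individual observations carry no further information about $p$.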
Invariance (Property of Love)
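If a function is applied to the parameter, applying the same function to the estimator gives an estimator of the transformed parameter. This property is known as invariance. For example, the variance $\sigma^2$ and functions of it are related as follows: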
\begin{align}
E(X-\mu)^2 &= \sigma^2 \\
\text{or } \sqrt{E(X-\mu)^2} &= \sigma\\
\text{or } [E(X-\mu)^2]^2 &= (\sigma^2)^2
\end{align}
The property states that if $\hat{\theta}$ is the MLE of $\theta$, then $\tau(\hat{\theta})$ is the MLE of $\tau(\theta)$ for any function $\tau$. Here tau ($\tau$) stands for a general function. For example, if $\hat{\theta}=\overline{X}$ is the MLE of $\theta$, then $\overline{X}^2$ is the MLE of $\theta^2$, and $\sqrt{\overline{X}}$ is the MLE of $\sqrt{\theta}$.
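The invariance property can be checked numerically for $\tau(\sigma^2) = \sqrt{\sigma^2} = \sigma$ in a normal model. The sketch below is only an illustration: the data are made up, $\mu$ is fixed at the sample mean, and a crude grid search (not an efficient optimizer) is used to maximize the likelihood directly in $\sigma$.

```python
import math

# Made-up data, used only for illustration.
data = [4.1, 5.3, 6.0, 4.8, 5.9, 5.2, 4.4, 6.3]
n = len(data)
mean = sum(data) / n
sum_sq = sum((x - mean) ** 2 for x in data)
var_mle = sum_sq / n  # MLE of sigma^2 for normal data (divisor n)

def log_likelihood_sigma(sigma):
    """Normal log-likelihood in sigma (constant term dropped, mu = sample mean)."""
    return -n * math.log(sigma) - sum_sq / (2 * sigma ** 2)

# Grid search over sigma from 0.100 to 3.000 in steps of 0.001.
grid = [k / 1000 for k in range(100, 3001)]
sigma_grid_max = max(grid, key=log_likelihood_sigma)

print(f"sqrt of MLE of sigma^2        = {math.sqrt(var_mle):.4f}")
print(f"direct maximizer over sigma   = {sigma_grid_max:.4f}")
```

Up to the grid resolution, maximizing the likelihood directly over $\sigma$ gives the same value as taking the square root of the MLE of $\sigma^2$, as the invariance property predicts.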
From the above diagrammatic representations, one can visualize the properties of a good estimator as described below.
- Unbiasedness: The estimator should be centered around the true value.
- Efficiency: The estimator should have a smaller spread (variance) around the true value.
- Consistency: As the sample size increases, the estimator should become more accurate.
- Sufficiency: The estimator should capture all relevant information from the sample.
In summary, a good estimator is unbiased, efficient, consistent, and ideally sufficient. It should also be robust to outliers and have a low mean squared error (MSE).