Properties of a Good Estimator

Introduction (Properties of a Good Estimator)

This post offers a comprehensive discussion of the properties of a good estimator. In statistics, an estimator is a function of sample data used to estimate an unknown population parameter. A good estimator is both efficient and unbiased. An estimator is considered a good estimator if it satisfies the following properties:

  • Unbiasedness
  • Consistency
  • Efficiency
  • Sufficiency
  • Invariance

Let us discuss these properties of a good estimator one by one.

Unbiasedness

An estimator is said to be unbiased if its expected value (that is, the mean of its sampling distribution) equals the true value of the population parameter. Let $\hat{\theta}$ be an estimator of the population parameter $\theta$. If $E(\hat{\theta}) = \theta$, the estimator $\hat{\theta}$ is unbiased. If $E(\hat{\theta})\ne \theta$, then $\hat{\theta}$ is a biased estimator of $\theta$.

  • If $E(\hat{\theta}) > \theta$, then $\hat{\theta}$ will be positively biased.
  • If $E(\hat{\theta}) < \theta$, then $\hat{\theta}$ will be negatively biased.

Some examples of biased or unbiased estimators are:

  • $\overline{X}$ is an unbiased estimator of $\mu$, that is, $E(\overline{X}) = \mu$
  • $\widetilde{X}$ is also an unbiased estimator when the population is normally distributed, that is, $E(\widetilde{X}) =\mu$
  • The sample variance with divisor $n$, $S^2=\frac{\Sigma (X_i-\overline{X})^2}{n}$, is a biased estimator of $\sigma^2$, that is, $E(S^2)\ne \sigma^2$ (the version with divisor $n-1$ is unbiased)
  • $\hat{p} = \frac{x}{n}$ is an unbiased estimator of $p$, that is, $E(\hat{p})=p$

It means that if the sampling process is repeated many times and the estimator is calculated for each sample, the average of these estimates will be very close to the true population parameter.

An unbiased estimator does not systematically overestimate or underestimate the true parameter.
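The behavior above can be checked with a small simulation. The sketch below uses hypothetical values ($\mu=10$, $\sigma=2$, $n=30$): across many repeated samples, the average of $\overline{X}$ settles near $\mu$, while the divisor-$n$ variance systematically underestimates $\sigma^2$.

```python
import random

# Hypothetical simulation: draw many samples from a population with known
# mean mu and variance sigma^2, then average each estimator across samples.
random.seed(1)
mu, sigma, n, reps = 10.0, 2.0, 30, 20000

mean_estimates = []
var_n_estimates = []    # variance with divisor n   (biased for sigma^2)
var_n1_estimates = []   # variance with divisor n-1 (unbiased for sigma^2)

for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    ss = sum((x - xbar) ** 2 for x in sample)
    mean_estimates.append(xbar)
    var_n_estimates.append(ss / n)
    var_n1_estimates.append(ss / (n - 1))

print(sum(mean_estimates) / reps)    # close to mu = 10
print(sum(var_n_estimates) / reps)   # systematically below sigma^2 = 4
print(sum(var_n1_estimates) / reps)  # close to sigma^2 = 4
```

The averaged divisor-$n$ variance lands near $\frac{n-1}{n}\sigma^2$ rather than $\sigma^2$, which is exactly what bias means here.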

Consistency

An estimator is said to be consistent if the statistic used as an estimator approaches the true population parameter value as the sample size increases. Equivalently, an estimator $\hat{\theta}$ is called a consistent estimator of $\theta$ if the probability that $\hat{\theta}$ lies close to $\theta$ approaches unity as the sample size increases.

Symbolically, $\hat{\theta}$ is a consistent estimator of the parameter $\theta$ if, for any arbitrarily small positive quantity $\varepsilon$,

\begin{align*}
\lim\limits_{n\rightarrow \infty} P\left[|\hat{\theta}-\theta|\le \varepsilon\right] &= 1\\
\lim\limits_{n\rightarrow \infty} P\left[|\hat{\theta}-\theta|> \varepsilon\right] &= 0
\end{align*}

A consistent estimator may or may not be unbiased. The sample mean $\overline{X}=\frac{\Sigma X_i}{n}$ and sample proportion $\hat{p} = \frac{x}{n}$ are unbiased estimators of $\mu$ and $p$, respectively, and are also consistent.

It means that as one collects more and more data, the estimator becomes more and more accurate in approximating the true population value.
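The limit $\lim_{n\to\infty} P[|\hat{\theta}-\theta|\le \varepsilon] = 1$ can be illustrated numerically. The sketch below (with hypothetical $\mu=5$, $\sigma=3$, and $\varepsilon=0.5$) estimates this probability for the sample mean at increasing sample sizes:

```python
import random

# Hypothetical sketch: estimate P(|Xbar - mu| <= eps) by simulation
# and watch it approach 1 as the sample size n grows.
random.seed(2)
mu, sigma, eps, reps = 5.0, 3.0, 0.5, 1000

def coverage(n):
    """Fraction of simulated samples whose mean lands within eps of mu."""
    hits = 0
    for _ in range(reps):
        xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
        hits += abs(xbar - mu) <= eps
    return hits / reps

for n in (10, 100, 1000):
    print(n, coverage(n))  # the fraction rises toward 1
```

With $n=10$ the sample mean often misses the $\pm\varepsilon$ band, while by $n=1000$ virtually every sample mean falls inside it, which is consistency in action.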


Efficiency

An unbiased estimator is said to be efficient if the variance of its sampling distribution is smaller than that of the sampling distribution of any other unbiased estimator of the same parameter. Suppose there are two unbiased estimators $T_1$ and $T_2$ of the same parameter $\theta$; then $T_1$ is said to be more efficient than $T_2$ if $Var(T_1) < Var(T_2)$. The relative efficiency of $T_1$ compared to $T_2$ is given by the ratio

$$E = \frac{Var(T_2)}{Var(T_1)} > 1$$

Note that when the two estimators are biased, the mean squared error (MSE) is used to compare them instead of the variance.

A more efficient estimator has a smaller sampling error, meaning it is less likely to deviate significantly from the true population parameter.

An efficient estimator is less likely to produce extreme values, making it more reliable.
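As a concrete illustration: for normally distributed data both the sample mean and the sample median are unbiased for $\mu$, yet the mean is the more efficient of the two. A small simulation sketch (hypothetical parameters) estimates their sampling variances and the relative efficiency:

```python
import random
import statistics

# Hypothetical comparison: for normal data, both the sample mean and the
# sample median estimate mu without bias, but the mean varies less.
random.seed(3)
mu, sigma, n, reps = 0.0, 1.0, 25, 20000

means, medians = [], []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

var_mean = statistics.pvariance(means)
var_median = statistics.pvariance(medians)
print(var_mean, var_median)   # var_mean < var_median
print(var_median / var_mean)  # relative efficiency E = Var(T2)/Var(T1) > 1
```

The ratio approaches $\pi/2 \approx 1.57$ in large samples, the classical asymptotic relative efficiency of the mean over the median under normality.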

Sufficiency

An estimator is said to be sufficient if the statistic used as an estimator utilizes all the information contained in the sample. Any statistic that is not computed from all values in the sample is not a sufficient estimator. The sample mean $\overline{X}=\frac{\Sigma X_i}{n}$ and sample proportion $\hat{p} = \frac{x}{n}$ are sufficient estimators of the population mean $\mu$ and population proportion $p$, respectively, but the median is not a sufficient estimator because it does not use all the information contained in the sample.

A sufficient estimator provides maximum information about the parameter because it makes use of every observation in the sample.

A sufficient estimator captures all the useful information from the data without any loss.


Invariance

If the parameter is transformed by some function, the estimator transforms in the same way: applying the function to the estimator yields an estimator of the transformed parameter. This property is known as invariance.

\begin{align}
E(X-\mu)^2 &= \sigma^2 \\
\text{or } \sqrt{E(X-\mu)^2} &= \sigma\\
\text{or } [E(X-\mu)^2]^2 &= (\sigma^2)^2
\end{align}

The property states that if $\hat{\theta}$ is the MLE of $\theta$, then $\tau(\hat{\theta})$ is the MLE of $\tau(\theta)$ for any function $\tau$. Here tau ($\tau$) stands for a general function of the parameter. For example, if $\hat{\theta}=\overline{X}$ is the MLE of $\theta$, then $\overline{X}^2$ is the MLE of $\theta^2$, and $\sqrt{\overline{X}}$ is the MLE of $\sqrt{\theta}$.
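The invariance property can be sketched in code (with hypothetical normal data): once the MLE of a parameter is computed, MLEs of functions of that parameter follow by direct substitution, with no re-maximization of the likelihood.

```python
import math
import random

# Hypothetical illustration of invariance: if the MLE of mu is Xbar,
# then the MLE of tau(mu) = mu^2 is simply Xbar^2.
random.seed(4)
sample = [random.gauss(3.0, 1.0) for _ in range(500)]

n = len(sample)
mu_hat = sum(sample) / n          # MLE of mu for normal data
tau_hat = mu_hat ** 2             # MLE of mu^2, by invariance
print(mu_hat, tau_hat)

# The same rule gives the MLE of sigma from the MLE of sigma^2:
sigma2_hat = sum((x - mu_hat) ** 2 for x in sample) / n  # MLE of sigma^2
sigma_hat = math.sqrt(sigma2_hat)                        # MLE of sigma
print(sigma2_hat, sigma_hat)
```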

Properties of a Good Estimator

The properties of a good estimator can be summarized as follows.

  • Unbiasedness: The estimator should be centered around the true value.
  • Efficiency: The estimator should have a smaller spread (variance) around the true value.
  • Consistency: As the sample size increases, the estimator should become more accurate.
  • Sufficiency: The estimator should capture all relevant information from the sample.

In summary, a good estimator is unbiased, efficient, consistent, and ideally sufficient. It should also be robust to outliers and have a low mean squared error (MSE).


https://rfaqs.com, https://gmstat.com

Best Online Estimation MCQs 1

Online Estimation MCQs for preparation of the PPSC and FPSC Statistics Lecturer posts. There are 20 multiple-choice questions covering topics related to the properties of a good estimator (unbiasedness, efficiency, sufficiency, consistency, and invariance), expectation, point estimates, and interval estimates. Let us start with the Online Estimation MCQs Quiz.

Online MCQs about Estimate and Estimation for Preparation of PPSC and FPSC Statistics Lecturer Post

1. Let $X_1,X_2,\cdots,X_n$ be a random sample from the density $f(x;\theta)$, where $\theta$ may be a vector. If the conditional distribution of $X_1,X_2,\cdots,X_n$ given $S=s$ does not depend on $\theta$ for any value $s$ of $S$, then the statistic $S$ is called

 
 
 
 

2. If $f(X_1, X_2,\cdots, X_n;\theta)$ is the joint density of $n$ random variables $X_1,X_2,\cdots, X_n$, considered as a function of $\theta$, then $L(\theta; X_1,X_2,\cdots, X_n)$ is called

 
 
 
 

3. In statistical inference, the best asymptotically normal estimator is denoted by

 
 
 
 

4. Let $L(\theta;X_1,X_2,\cdots,X_n)$ be the likelihood function for a sample $X_1,X_2,\cdots, X_n$ having joint density $f(x_1,x_2,\cdots,x_n;\theta)$, where $\theta$ belongs to the parameter space $\Theta$. Then a test defined as $\lambda=\lambda_n=\lambda(x_1,x_2,\cdots,x_n)=\frac{\sup_{\theta\in \Theta_0}L(\theta;x_1,x_2,\cdots,x_n)}{\sup_{\theta\in \Theta}L(\theta;x_1,x_2,\cdots,x_n)}$

 
 
 
 

5. For two estimators $T_1=t_1(X_1,X_2,\cdots,X_n)$ and $T_2=t_2(X_1,X_2,\cdots,X_n)$, the estimator $t_1$ is defined to be $R_{t_1}(\theta)\leq R_{t_2}(\theta)$ for all $\theta$ in $\Theta$

 
 
 
 

6. If $Var(T_2) < Var(T_1)$, then $T_2$ is

 
 
 
 

7. A test is said to be the most powerful test of size $\alpha$, if

 
 
 
 

8. If $Var(\hat{\theta})\rightarrow 0$ as $n \rightarrow \infty$, then $\hat{\theta}$ is said to be

 
 
 
 

9. $Var_\theta (T) \geq \frac{[\tau'(\theta)]^2}{nE\left[\left(\frac{\partial}{\partial \theta}\log f(X;\theta)\right)^2\right]}$, where $T=t(X_1,X_2,\cdots, X_n)$ is an unbiased estimator of $\tau(\theta)$. The above inequality is called

 
 
 
 

10. A set of jointly sufficient statistics is defined to be minimal sufficient if and only if

 
 
 
 

11. Which of the following statements describes an interval estimate?

 
 
 
 

12. Let $Z_1,Z_2,\cdots,Z_N$ be independently and identically distributed random variables satisfying $E[|Z_t|]<\infty$. Let $N$ be an integer-valued random variable whose value $n$ depends only on the values of the first $n$ $Z_i$'s. Suppose $E(N)<\infty$; then $E(Z_1+Z_2+\cdots+Z_N)=E(N)E(Z_i)$ is called

 
 
 
 

13. What is the maximum expected difference between a population parameter and a sample estimate?

 
 
 
 

14. Which of the following assumptions are required to show the consistency, unbiasedness, and efficiency of the OLS estimator?

  1. $E(\mu_t)=0$
  2. $Var(\mu_t)=\sigma^2$
  3. $Cov(\mu_t,\mu_{t-j})=0;t\neq t-j$
  4. $\mu_t \sim N(0,\sigma^2)$
 
 
 
 

15. If the conditional distribution of $X_1, X_2,\cdots,X_n$ given $S=s$ does not depend on $\theta$ for any value $s$ of $S$, the statistic $S=s(X_1,X_2,\cdots,X_n)$ is called

 
 
 
 

16. What are the main components of a confidence interval?

 
 
 
 

17. If $E(\hat{\theta})=\theta$, then $\hat{\theta}$ is said to be

 
 
 
 

18. Let $X_1,X_2,\cdots,X_n$ be a random sample from a density $f(x|\theta)$, where $\theta$ is a value of the random variable $\Theta$ with known density $g_\Theta(\theta)$. Then the estimator $\tau(\theta)$ with respect to the prior $g_\Theta(\theta)$ is defined as $E[\tau(\theta)|X_1,X_2,\cdots,X_n]$ is called

 
 
 
 

19. For a biased estimator $\hat{\theta}$ of $\theta$, which one is correct

 
 
 
 

20. If $f(x_1,x_2,\cdots,x_n;\theta)=g(\hat{\theta};\theta)h(x_1,x_2,\cdots,x_n)$, then $\hat{\theta}$ is

 
 
 
 



Important Estimation MCQs with Answers 2

This post presents Estimation MCQs with Answers. The MCQs are about statistical inference and cover the topics of estimation, estimators, point estimates, interval estimates, properties of a good estimator, unbiasedness, efficiency, sufficiency, large-sample estimation, and sample estimation. There are 20 multiple-choice questions from the estimation section. Let us start with the Estimation MCQs with Answers.


Estimation MCQs with Answers
  • If $E(\hat{\theta})=\theta$ then $\hat{\theta}$ is called
  • A statistic $\hat{\theta}$ is said to be an unbiased estimator of $\theta$, if
  • The following statistics are unbiased
  • The following is an unbiased estimator of the population variance $\sigma^2$
  • In point estimation we get
  • The formula used to estimate a parameter is called
  • A specific value calculated from a sample is called
  • A function that is used to estimate a parameter is called
  • $1-\alpha$ is called
  • The level of confidence is denoted by
  • The other name of the significance level is
  • What will be the confidence level if the level of significance is 5% (0.05)
  • The probability that the confidence interval does not contain the population parameter is denoted by
  • The probability that the confidence interval does contain the parameter is denoted by
  • The way of finding the unknown value of the population parameter from the sample values by using a formula is called ——–.
  • There are four steps involved with constructing a confidence interval. What is typically the first one?
  • What happens as a sample size gets larger?
  • After identifying a sample statistic, what is the proper order of the next three steps of constructing a confidence interval?
  • Testing of hypothesis may be replaced by?
  • A point estimate is often insufficient. Why?


Estimation of Population Parameters

Introduction to Estimation of Population Parameters

In statistics, estimating population parameters is important because it allows a researcher to draw conclusions about a population (the whole group) by analyzing a small part of it. Estimation of population parameters is used when the population under study is large. For example, instead of performing a census, one can draw a random sample from the population and calculate the required sample statistic(s) to draw conclusions about the population.

Important Terminologies

The following are some important terminologies to understand the concept of estimating the population parameters.

  • Population: The entire collection of individuals or items one is interested in studying. For instance, all the people living in a particular country.
  • Sample: A subgroup (or small portion) chosen from the population that represents the larger group.
  • Parameter: A characteristic that describes the entire population, such as the population mean, median, or standard deviation.
  • Statistic: A value calculated from the sample data and used to estimate a population parameter. For example, the sample mean is an estimate of the population mean. A statistic is a characteristic of the sample under study.

Various statistical methods are used to estimate population parameters with different levels of accuracy. The accuracy of the estimate depends on the size of the sample and how well the sample represents the population.

We use statistics calculated from the sample data as estimates for the population parameters.

The following sample statistics are used to estimate the corresponding population parameters:
  • Sample mean: used to estimate the population mean. It is calculated by averaging the values of all observations in the sample, that is, the sum of all data values divided by the total number of observations.
  • Sample proportion: is used to estimate the population proportion (percentage). It represents the number of successes (events of interest) divided by the total sample size.
  • Sample standard deviation: is used to estimate the population standard deviation. It reflects how spread out the data points are in the sample.
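As a small worked sketch (all data values below are made up for illustration), these sample statistics can be computed directly:

```python
import statistics

# Hypothetical sample of heights (inches); the statistics computed from it
# serve as point estimates of the corresponding population parameters.
heights = [64.1, 66.3, 65.0, 63.8, 67.2, 65.5, 64.9, 66.0]

sample_mean = statistics.fmean(heights)  # estimates the population mean
sample_sd = statistics.stdev(heights)    # estimates the population SD (divisor n-1)

# Hypothetical survey counts: 130 "successes" out of 200 respondents.
supporters, sample_size = 130, 200
sample_proportion = supporters / sample_size  # estimates the population proportion

print(sample_mean, sample_sd, sample_proportion)
```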

Types of Estimates

There are two types of estimates:

  • Point Estimate: A single value used to estimate the population parameter. Examples of point estimates are:
    • The mean/average height of Boys in Colleges is 65 inches.
    • 65% of Lahore residents support a ban on cell phone use while driving.
  • Interval Estimate: A range of values (an interval) that is expected to contain the population parameter. Examples of interval estimates are:
    • The mean height of Boys in Colleges lies between 63.5 and 66.5 inches.
    • 65% ($\pm 3\%$) of Lahore residents support a ban on cell phone use while driving.
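A point estimate can be turned into an interval estimate with the familiar large-sample formula $\overline{X} \pm 1.96\, s/\sqrt{n}$. The sketch below uses made-up height data to produce both kinds of estimate:

```python
import statistics

# Hypothetical sample of heights (inches).
heights = [65.2, 63.9, 66.8, 64.4, 65.1, 66.2, 63.5, 65.9, 64.8, 66.5]

n = len(heights)
xbar = statistics.fmean(heights)  # point estimate of the population mean
s = statistics.stdev(heights)     # sample standard deviation
margin = 1.96 * s / n ** 0.5      # margin of error (large-sample 95% formula)

print(xbar)                             # point estimate
print((xbar - margin, xbar + margin))   # interval estimate
```

With $n=10$ a t-multiplier would normally replace 1.96; the z-value is used here only to keep the sketch minimal.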

Some Examples

Estimation of population parameters is widely used in various fields of life. For example,

  • a company might estimate customer satisfaction through a sample survey,
  • a biologist might estimate the average wingspan of a specific bird species by capturing and measuring a small group.
