Consistency: A Property of a Good Estimator

Consistency refers to the property of an estimator that as the sample size increases, the estimator converges in probability to the true value of the parameter being estimated. In other words, a consistent estimator will yield results that become more accurate and stable as more data points are collected.

Characteristics of a Consistent Estimator

A consistent estimator has some important characteristics:

  • Convergence: The estimator will produce values that get closer to the true parameter value with larger samples.
  • Reliability: Provides reassurance that the estimates will be valid as more data is accounted for.

Examples of Consistent Estimators

  1. Sample Mean ($\overline{x}$): The sample mean is a consistent estimator of the population mean ($\mu$). The mean of a larger sample from a population tends to lie closer to the true population mean than the mean of a smaller sample.
  2. Sample Proportion ($\hat{p}$): The sample proportion is also a consistent estimator of the true population proportion. As the number of observations increases, the sample proportion gets closer to the true population proportion.
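Both examples are easy to see in a short simulation. The sketch below is illustrative only: the population mean $\mu = 5$, standard deviation 2, and proportion $p = 0.3$ are values chosen here for demonstration, not taken from any dataset.

```python
import random

random.seed(42)

# As n grows, the sample mean approaches mu = 5 and the sample
# proportion approaches p = 0.3 (illustrative parameters).
mu, p = 5.0, 0.3
mean_err, prop_err = [], []
for n in (10, 1_000, 100_000):
    xs = [random.gauss(mu, 2.0) for _ in range(n)]
    mean_err.append(abs(sum(xs) / n - mu))
    hits = sum(random.random() < p for _ in range(n))
    prop_err.append(abs(hits / n - p))

print(mean_err)   # with high probability, errors shrink as n grows
print(prop_err)
```

Running the sketch with different seeds shows the same pattern: the estimation error at $n = 100{,}000$ is typically a tiny fraction of the error at $n = 10$.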

Question: $\hat{\theta}$ is a consistent estimator of the parameter $\theta$ of a given population if

  1. $\hat{\theta}$ is unbiased, and
  2. $var(\hat{\theta}) \rightarrow 0$ when $n\rightarrow \infty$

Answer: Suppose $X$ is a random variable with mean $\mu$ and variance $\sigma^2$. If $X_1,X_2,\cdots,X_n$ is a random sample from $X$, then

\begin{align*}
E(\overline{X}) &= \mu\\
Var(\overline{X}) & = \frac{\sigma^2}{n}
\end{align*}

That is, $\overline{X}$ is unbiased, and $\lim\limits_{n\rightarrow\infty} Var(\overline{X}) = \lim\limits_{n\rightarrow\infty} \frac{\sigma^2}{n} =0$.
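These two conditions can be checked numerically. The following sketch (an illustration with arbitrary values $\mu = 10$, $\sigma = 3$, not from the text) repeatedly draws samples of size $n$ and compares the empirical mean and variance of $\overline{X}$ with $\mu$ and $\sigma^2/n$:

```python
import random
import statistics

random.seed(1)

# Check the two conditions for the sample mean:
# E(X-bar) = mu (unbiasedness) and Var(X-bar) = sigma^2 / n.
mu, sigma, n, reps = 10.0, 3.0, 50, 20_000

means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(reps)]

emp_mean = statistics.fmean(means)    # should be close to mu = 10
emp_var = statistics.variance(means)  # should be close to sigma^2 / n = 0.18
print(emp_mean, emp_var, sigma**2 / n)
```

Increasing `n` shrinks the empirical variance of $\overline{X}$ toward zero, which is exactly the second condition in action.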

Question: Show that the sample mean $\overline{X}$ of a random sample of size $n$ from the density function $f(x; \theta) = \frac{1}{\theta} e^{-\frac{x}{\theta}}, \qquad 0<x<\infty$ is a consistent estimator of the parameter $\theta$.

Answer: First, we need to check that $E(\overline{X})=\theta$; that is, that the sample mean $\overline{X}$ is unbiased.

\begin{align*}
E(X) &= \mu = \int x\cdot f(x; \theta)\,dx = \int\limits_{0}^{\infty}x\cdot \frac{1}{\theta} e^{-\frac{x}{\theta}}\,dx\\
&= \frac{1}{\theta} \int\limits_{0}^{\infty} xe^{-\frac{x}{\theta}}\,dx\\
&= \frac{1}{\theta} \left[ \Big(-\theta x e^{-\frac{x}{\theta}}\Big)\Big|_{0}^{\infty} + \theta \int\limits_{0}^{\infty} e^{-\frac{x}{\theta}}\,dx \right]\\
&= \frac{1}{\theta} \left[0 + \theta\Big(-\theta e^{-\frac{x}{\theta}}\Big)\Big|_0^{\infty} \right] = \frac{1}{\theta}\,\theta^2 = \theta\\
E(X^2) &= \int x^2 f(x; \theta)\,dx = \int\limits_{0}^{\infty}x^2\, \frac{1}{\theta} e^{-\frac{x}{\theta}}\,dx\\
&= \frac{1}{\theta}\left[ \Big(-\theta x^2 e^{-\frac{x}{\theta}}\Big)\Big|_{0}^{\infty} + \int\limits_0^\infty 2\theta x e^{-\frac{x}{\theta}}\,dx \right]\\
&= \frac{1}{\theta} \left[ 0 + 2\theta^2 \int\limits_0^\infty \frac{x}{\theta} e^{-\frac{x}{\theta}}\,dx\right]
\end{align*}

The remaining integral is exactly the expression for $E(X)$, which equals $\theta$. Thus

\begin{align*}
E(X^2) &= \frac{1}{\theta}\, 2\theta^2 \cdot \theta = 2\theta^2\\
Var(X) &= E(X^2) - [E(X)]^2 = 2\theta^2 - \theta^2 = \theta^2\\
\text{and}\quad Var(\overline{X}) &= \frac{\sigma^2}{n} = \frac{\theta^2}{n}\\
\lim\limits_{n\rightarrow \infty} Var(\overline{X}) &= \lim\limits_{n\rightarrow \infty} \frac{\theta^2}{n} = 0
\end{align*}

Since $\overline{X}$ is unbiased and $Var(\overline{X})$ approaches 0 as $n\rightarrow \infty$, $\overline{X}$ is a consistent estimator of $\theta$.
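A quick simulation illustrates the result. The density above is the exponential distribution with mean $\theta$, so `random.expovariate(1/theta)` draws from it; the value $\theta = 2$ below is an arbitrary choice for illustration.

```python
import random

random.seed(7)

# f(x; theta) = (1/theta) exp(-x/theta) is Exponential with mean theta;
# expovariate takes the rate 1/theta. theta = 2 is illustrative.
theta = 2.0
results = {}
for n in (100, 100_000):
    results[n] = sum(random.expovariate(1 / theta) for _ in range(n)) / n

print(results)   # the sample mean approaches theta = 2 as n grows
```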

Importance of Consistency in Statistics

The following are a few key points about the importance of consistency in statistics:

Reliable Inferences: Consistent estimators ensure that, as the sample size increases, the estimates converge to the true population parameter. This helps researchers and statisticians make sound inferences about a population based on sample data.

Foundation for Hypothesis Testing: Most statistical methods rely on consistent estimators. Consistency helps validate the conclusions drawn from statistical tests, leading to confidence in decision-making.

Improved Accuracy: As the sample size increases and more data points become available, the estimates converge more closely to the true value. This leads to more accurate statistical models, which improves analysis and prediction.

Mitigating Sampling Error: Consistent estimators help to reduce the impact of random sampling error. As sample sizes increase, the variability in estimates tends to decrease, leading to more dependable conclusions.

Building Statistical Theory: Consistency is a fundamental concept in the development of statistical theory. It provides a rigorous foundation for designing and validating statistical methods and procedures.

Trust in Results: Consistency builds trust in the findings of statistical analyses. Because the results are stable and reliable across large samples, decision-makers are more likely to accept and act upon them.

Framework for Model Development: In statistics and data science, building models on consistent estimators yields more accurate models.

Long-Term Decision Making: Consistency in data interpretation supports long-term planning, risk assessment, and resource allocation, since businesses and organizations often make strategic decisions based on statistical analyses.


MCQs Estimation Quiz 8

This MCQs Estimation Quiz from Statistical Inference covers Estimation (Confidence Intervals) and the Bayes Factor. It is useful for preparing for exams and for statistical job tests in government, semi-government, and private organizations, as well as for college and university admission tests, and it will help learners understand and reinforce the related concepts.

Online MCQs Estimation Quiz with Answers

1. Suppose that a research article indicates a value of $p = 0.30$ in the results section ($\alpha = 0.05$). You have absolutely proven the null hypothesis (that is, you have proven that there is no difference between the population means).

 
 
 

2. The probability of finding a significant result when there is no true effect is called ————–. The probability of finding a significant result when there is a true effect is called —————.

 
 
 
 

3. Suppose that a research article indicates a value of $p = 0.30$ in the results section ($\alpha = 0.05$). The probability that the given study’s results are replicable is not equal to $1-p$.

 
 
 

4. Suppose that a research article indicates a value of $p = 0.001$ in the results section ($\alpha = 0.05$). The value $p = 0.001$ does not directly confirm that the effect size was large.

 
 
 

5. If two 95% confidence intervals around the means overlap, then the difference between the two estimates is necessarily non-significant ($\alpha = 0.05$).

 
 
 

6. When a Bayesian t-test yields a $BF = 10$, it is ten times more likely that there is an effect than that there is no effect.

 
 
 

7. Suppose a research article indicates a value of $p = 0.001$ in the results section ($\alpha = 0.05$). The probability that the given study’s results are replicable is not equal to $1-p$.

 
 
 

8. Suppose that a research article indicates a value of $p = 0.001$ in the results section ($\alpha = 0.05$). The p-value gives the probability of obtaining a significant result whenever a given experiment is replicated.

 
 
 

9. The likelihood ratio of the two hypotheses gives information about ————–, but not about —————-.

 
 
 
 

10. Two researchers are investigating if people can see in the future. Person A believes there is no effect, which would mean that p-values are distributed as a —————-. B finds a test statistic at the very far end of the distribution, which means that —————-.

 
 
 
 

11. A Bayes Factor close to 1 (inconclusive evidence) means that the effect size is small.

 
 
 

12. When a Bayesian t-test yields a $BF = 0.1$, it is ten times more likely that there is no effect than that there is an effect.

 
 
 

13. Suppose, the Bayesian method is used to estimate a population mean of 10 with a 95% credible interval from 8 to 12, which means ————–. This interval depends on —————.

 
 
 
 

14. An observed 95% confidence interval does not predict that 95% of the estimates from future studies will fall inside the observed interval.

 
 
 

15. To conclude that the difference between the two estimates is non-significant ($\alpha = 0.05$), the two 95% confidence intervals around the means do not overlap.

 
 
 

16. A Bayes Factor that provides strong evidence for the null model does not mean the null hypothesis is true.

 
 
 

17. Suppose that a research article indicates a value of $p = 0.001$ in the results section ($\alpha = 0.05$). The p-value of a statistical test is the probability of the observed result or a more extreme result, assuming the null hypothesis is true.

 
 
 

18. How are the three paths to statistical inference (frequentist, likelihood, Bayesian) related to each other?

 
 
 
 

19. A Bayes Factor that provides strong evidence for the alternative model does not mean the alternative hypothesis is true.

 
 
 

20. The specific 95% confidence interval observed in a study has a 95% chance of containing the true effect size.

 
 
 


Statistical inference is a branch of statistics in which we draw conclusions about a population parameter using sample information. Statistical inference can be further divided into the estimation of population parameters and hypothesis testing.

Estimation is a way of finding the unknown value of a population parameter from sample information by using an estimator (a statistical formula). One can estimate a population parameter using two approaches: (i) point estimation and (ii) interval estimation.
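The two approaches can be sketched in a few lines. The example below uses hypothetical data (200 draws from a population with mean 50), and a large-sample 95% z-interval is assumed for simplicity:

```python
import math
import random
import statistics

random.seed(9)

# Hypothetical sample of 200 observations from a population with mean 50.
sample = [random.gauss(50.0, 10.0) for _ in range(200)]

# (i) Point estimation: a single value for the parameter.
point = statistics.fmean(sample)

# (ii) Interval estimation: a range likely to contain the parameter
# (large-sample 95% z-interval, assumed here for simplicity).
se = statistics.stdev(sample) / math.sqrt(len(sample))
interval = (point - 1.96 * se, point + 1.96 * se)
print(point, interval)
```

For small samples or unknown population variance, a t-interval would be the more careful choice; the z-interval is used here only to keep the sketch short.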


Unbiasedness

Unbiasedness is a statistical concept that describes the accuracy of an estimator. An estimator is said to be an unbiased estimator if its expected value (or average value over many samples) equals the corresponding population parameter, that is, $E(\hat{\theta}) = \theta$.

If the expected value of an estimator $\hat{\theta}$ is not equal to the corresponding parameter, then the estimator is biased. The bias of an estimator $\hat{\theta}$ is defined as

$$Bias = E(\hat{\theta}) - \theta$$
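The bias formula can be estimated by simulation. As an illustration (an example chosen here, not from the text), the sample maximum of a $Uniform(0, b)$ sample is a biased estimator of the upper endpoint $b$, since $E(\hat{\theta}) = \frac{n}{n+1}\, b$:

```python
import random

random.seed(3)

# The sample maximum underestimates the upper endpoint b of Uniform(0, b):
# E(max) = n/(n+1) * b, so Bias = E(max) - b = -b/(n+1).
b, n, reps = 1.0, 4, 50_000
estimates = [max(random.uniform(0, b) for _ in range(n)) for _ in range(reps)]
bias = sum(estimates) / reps - b
print(bias)   # theoretical bias is -b/(n+1) = -0.2
```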

Note that $\overline{X}$ is an unbiased estimator of the mean of a population. Therefore,

  • $\overline{X}$ is an unbiased estimator of the parameter $\mu$ in Normal distribution.
  • $\overline{X}$ is an unbiased estimator of the parameter $p$ in the Bernoulli distribution.
  • $\overline{X}$ is an unbiased estimator of the parameter $\lambda$ in the Poisson distribution.

However, the expected value of the sample variance $S^2=\frac{\sum\limits_{i=1}^n (X_i - \overline{X})^2 }{n}$ is not equal to the population variance; that is, $E(S^2) \neq \sigma^2$ (in fact, $E(S^2) = \frac{n-1}{n}\sigma^2$).

Therefore, sample variance is not an unbiased estimator of the population variance $\sigma^2$.
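This bias is easy to demonstrate numerically. The sketch below (with illustrative values $\sigma^2 = 4$ and $n = 5$) compares the $n$-divisor sample variance with the $(n-1)$-divisor version (Bessel's correction) over many samples:

```python
import random
import statistics

random.seed(11)

# Dividing by n gives E(S^2) = (n-1)/n * sigma^2 (biased);
# dividing by n - 1 (Bessel's correction) restores unbiasedness.
sigma2, n, reps = 4.0, 5, 100_000
biased, corrected = [], []
for _ in range(reps):
    s = [random.gauss(0.0, 2.0) for _ in range(n)]
    m = statistics.fmean(s)
    ss = sum((x - m) ** 2 for x in s)
    biased.append(ss / n)
    corrected.append(ss / (n - 1))

print(statistics.fmean(biased))     # near (n-1)/n * sigma^2 = 3.2
print(statistics.fmean(corrected))  # near sigma^2 = 4.0
```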

Note that it is possible to have more than one unbiased estimator for an unknown parameter. For example, the sample mean and sample median are both unbiased estimators of the population mean $\mu$ if the population distribution is symmetrical.
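The claim about multiple unbiased estimators can also be checked by simulation. The following sketch uses an arbitrary symmetric population, Normal($\mu = 4$, 1), and shows that the long-run averages of both the sample mean and the sample median land near $\mu$:

```python
import random
import statistics

random.seed(5)

# For a symmetric population, both the sample mean and the sample median
# are unbiased for mu, though their variances differ.
mu, n, reps = 4.0, 25, 20_000
means, medians = [], []
for _ in range(reps):
    s = [random.gauss(mu, 1.0) for _ in range(n)]
    means.append(statistics.fmean(s))
    medians.append(statistics.median(s))

print(statistics.fmean(means), statistics.fmean(medians))  # both near 4
```

Although both are unbiased here, the sample mean has the smaller variance for normal data, which is why it is usually preferred.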

Question: Show that the sample mean is an unbiased estimator of the population mean.

Solution:

Let $X_1, X_2, \cdots, X_n$ be a random sample of size $n$ from a population with mean $\mu$. The sample mean $\overline{X}$ is

$$\overline{X} = \frac{1}{n} \sum\limits_{i=1}^n X_i$$

We must show that $E(\overline{X})=\mu$, therefore, taking the expectation on both sides,

\begin{align*}
E(\overline{X}) &= E\left[\frac{1}{n} \Sigma X_i \right]\\
&= \frac{1}{n} E\left(\Sigma X_i\right) = \frac{1}{n} E(X_1 + X_2 + \cdots + X_n)\\
&= \frac{1}{n} \left[E(X_1) + E(X_2) + \cdots + E(X_n) \right]
\end{align*}

Since, in a random sample, the random variables $X_1, X_2, \cdots, X_n$ are all independent and each has the same distribution as the population, $E(X_1)=E(X_2)=\cdots=E(X_n)=\mu$. So,

$$E(\overline{X}) = \frac{1}{n}(\mu+\mu+\cdots + \mu) = \frac{1}{n}(n\mu) = \mu$$

Why Unbiasedness is Important

  • Accuracy: Unbiasedness is a measure of accuracy, not precision. Unbiased estimators provide accurate estimates on average, reducing the risk of systematic errors. However, an unbiased estimator can still have a large variance, meaning its individual estimates can be far from the true value.
  • Consistency: An unbiased estimator is not necessarily consistent. Consistency refers to the tendency of an estimator to converge to the true value as the sample size increases.
  • Foundation for Further Analysis: Unbiased estimators are often used as building blocks for more complex statistical procedures.

Unbiasedness Example

Imagine you’re trying to estimate the average height of students in your university. If you randomly sample 100 students and calculate their average height, this average is an estimator of the true average height of all students in that university. If, on average over repeated samples, this estimator equals the true average height of the entire student population, then the estimator is unbiased.

Unbiasedness is the state of being free from bias, prejudice, or favoritism. It can also mean being able to judge fairly without being influenced by one’s own opinions. In statistics, it also refers to (i) A sample that is not affected by extraneous factors or selectivity (ii) An estimator that has an expected value that is equal to the parameter being estimated.

Applications and Uses of Unbiasedness

  • Parameter Estimation:
    • Mean: The sample mean is an unbiased estimator of the population mean.
    • Variance: The sample variance, with a slight adjustment (Bessel’s correction), is an unbiased estimator of the population variance.
    • Regression Coefficients: In linear regression, the ordinary least squares (OLS) estimators of the regression coefficients are unbiased under certain assumptions.
  • Hypothesis Testing:
    • Unbiased estimators are often used in hypothesis tests to make inferences about population parameters. For example, the t-test for comparing means relies on the assumption that the sample means are unbiased estimators of the population means.
  • Machine Learning: In some machine learning algorithms, unbiased estimators are preferred for model parameters to avoid systematic errors.
  • Survey Sampling: Unbiased sampling techniques, such as simple random sampling, are used to ensure that the sample is representative of the population and that the estimates obtained from the sample are unbiased.
