Sampling Distribution of Means

Suppose we have a population of size $N$ with mean $\mu$ and variance $\sigma^2$. We draw all possible samples of size $n$ from this population, with or without replacement, and compute the mean of each sample, denoted by $\overline{x}$. Classifying these means into a frequency table gives the frequency distribution of means; the corresponding probability distribution is called the sampling distribution of means.

Sampling Distribution

A sampling distribution is the probability distribution of the values of a sample statistic (such as the mean, standard deviation, proportion, or difference between means) computed from all possible samples of size $n$ drawn from a population. Some of the important sampling distributions are:

  • Sampling Distribution of Means
  • Sampling Distribution of the Difference Between Means
  • Sampling Distribution of the Proportions
  • Sampling Distribution of the Difference Between Proportions
  • Sampling Distribution of Variances

Notations of Sampling Distribution of Means

The following notations are used for sampling distribution of means:

$\mu$: Population mean
$\sigma^2$: Population Variance
$\sigma$: Population Standard Deviation
$\mu_{\overline{X}}$: Mean of the Sampling Distribution of Means
$\sigma^2_{\overline{X}}$: Variance of Sampling Distribution of Means
$\sigma_{\overline{X}}$: Standard Deviation of the Sampling Distribution of Means

Formulas for Sampling Distribution of Means

The following formulas can be used to compute the mean, variance, and standard deviation of the sampling distribution of means:

\begin{align*}
\mu_{\overline{X}} &= E(\overline{X}) = \Sigma \left[\overline{X}P(\overline{X})\right]\\
\sigma^2_{\overline{X}} &= E(\overline{X}^2) - [E(\overline{X})]^2\\
\text{where } E(\overline{X}^2) &= \Sigma \left[\overline{X}^2P(\overline{X})\right]\\
\sigma_{\overline{X}} &= \sqrt{E(\overline{X}^2) - [E(\overline{X})]^2}
\end{align*}
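As a rough illustration of these formulas, the Python sketch below computes the mean, variance, and standard deviation of any discrete distribution supplied as a mapping from values to probabilities. The function name `moments` and the small three-point distribution `toy` are hypothetical and chosen only for illustration; they are not part of the example that follows.

```python
from fractions import Fraction
from math import sqrt


def moments(dist):
    """Return (mean, variance, sd) for a discrete distribution {value: probability}."""
    mean = sum(x * p for x, p in dist.items())               # E(X)
    second_moment = sum(x**2 * p for x, p in dist.items())   # E(X^2)
    variance = second_moment - mean**2                       # E(X^2) - [E(X)]^2
    return mean, variance, sqrt(variance)


# Hypothetical three-point distribution, used only to exercise the function.
toy = {1: Fraction(1, 4), 2: Fraction(1, 2), 3: Fraction(1, 4)}
mean, variance, sd = moments(toy)
print(mean, variance, sd)
```

Using `Fraction` keeps the mean and variance exact, so they can be compared with hand calculations without rounding noise; only the standard deviation is a floating-point value.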

Numerical Example: Sampling Distribution of Means

A population of size $N=5$ has the values 2, 4, 6, 8, and 10. Draw all possible samples of size $n=2$ from this population with and without replacement, and construct the sampling distribution of sample means in each case. Find the mean, variance, and standard deviation of the population and verify the following:

| Sr. No. | Sampling with Replacement | Sampling without Replacement |
|---|---|---|
| 1) | $\mu_{\overline{X}} = \mu$ | $\mu_{\overline{X}} = \mu$ |
| 2) | $\sigma^2_{\overline{X}}=\frac{\sigma^2}{n}$ | $\sigma^2_{\overline{X}}=\frac{\sigma^2}{n}\frac{N-n}{N-1}$ |
| 3) | $\sigma_{\overline{X}} = \frac{\sigma}{\sqrt{n}}$ | $\sigma_{\overline{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$ |

Solution

The solution to the above example is as follows:

Sampling with Replacement (Mean, Variance, and Standard Deviation)

The number of possible samples is $N^n = 5^2 = 25$.

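| Samples | $\overline{X}$ | Samples | $\overline{X}$ | Samples | $\overline{X}$ |
|---|---|---|---|---|---|
| 2, 2 | 2 | 4, 10 | 7 | 8, 8 | 8 |
| 2, 4 | 3 | 6, 2 | 4 | 8, 10 | 9 |
| 2, 6 | 4 | 6, 4 | 5 | 10, 2 | 6 |
| 2, 8 | 5 | 6, 6 | 6 | 10, 4 | 7 |
| 2, 10 | 6 | 6, 8 | 7 | 10, 6 | 8 |
| 4, 2 | 3 | 6, 10 | 8 | 10, 8 | 9 |
| 4, 4 | 4 | 8, 2 | 5 | 10, 10 | 10 |
| 4, 6 | 5 | 8, 4 | 6 | | |
| 4, 8 | 6 | 8, 6 | 7 | | |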

The sampling distribution of sample means will be

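| $\overline{X}$ | Freq | $P(\overline{X})$ | $\overline{X}P(\overline{X})$ | $\overline{X}^2$ | $\overline{X}^2P(\overline{X})$ |
|---|---|---|---|---|---|
| 2 | 1 | 1/25 | 2/25 | 4 | 4/25 |
| 3 | 2 | 2/25 | 6/25 | 9 | 18/25 |
| 4 | 3 | 3/25 | 12/25 | 16 | 48/25 |
| 5 | 4 | 4/25 | 20/25 | 25 | 100/25 |
| 6 | 5 | 5/25 | 30/25 | 36 | 180/25 |
| 7 | 4 | 4/25 | 28/25 | 49 | 196/25 |
| 8 | 3 | 3/25 | 24/25 | 64 | 192/25 |
| 9 | 2 | 2/25 | 18/25 | 81 | 162/25 |
| 10 | 1 | 1/25 | 10/25 | 100 | 100/25 |
| Total | 25 | 25/25 = 1 | 150/25 = 6 | | 1000/25 = 40 |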

\begin{align*}
\mu_{\overline{X}} &= E(\overline{X}) = \Sigma \left[\overline{X}P(\overline{X})\right] = \frac{150}{25}=6\\
\sigma^2_{\overline{X}} &= E(\overline{X}^2) - [E(\overline{X})]^2=\Sigma \left[\overline{X}^2P(\overline{X})\right] - \left[\Sigma \left[\overline{X}P(\overline{X})\right]\right]^2\\
&= 40 - 6^2 = 4\\
\sigma_{\overline{X}} &= \sqrt{4}=2
\end{align*}
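The enumeration above can be reproduced with a short Python sketch. It assumes ordered samples (so that 2, 4 and 4, 2 count separately, as in the table) and uses `itertools.product` to list them; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8, 10]
n = 2

# All ordered samples of size n drawn with replacement: N**n of them.
samples = list(product(population, repeat=n))
means = [Fraction(sum(s), n) for s in samples]

# Frequency of each distinct sample mean, converted to a probability.
freq = Counter(means)
dist = {m: Fraction(f, len(samples)) for m, f in sorted(freq.items())}

mean_xbar = sum(m * p for m, p in dist.items())
var_xbar = sum(m**2 * p for m, p in dist.items()) - mean_xbar**2

print(len(samples), mean_xbar, var_xbar)  # prints: 25 6 4
```

Exact fractions are used so that the results match the hand-computed totals 150/25 and 1000/25 without rounding.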

Mean, Variance, and Standard Deviation for Population

The following are computations for population values.

| $X$ | $X^2$ |
|---|---|
| 2 | 4 |
| 4 | 16 |
| 6 | 36 |
| 8 | 64 |
| 10 | 100 |
| Total: 30 | Total: 220 |

\begin{align*}
\mu &= \frac{\Sigma X}{N} = \frac{30}{5} = 6\\
\sigma^2 &= \frac{\Sigma X^2}{N} - \left(\frac{\Sigma X}{N} \right)^2\\
&=\frac{220}{5} - (6)^2 = 8\\
\sigma&= \sqrt{8} \approx 2.83
\end{align*}

Verifications:

  1. Mean: $\mu_{\overline{X}} = \mu \Rightarrow 6=6$
  2. Variance: $\sigma^2_{\overline{X}} = \frac{\sigma^2}{n} \Rightarrow 4=\frac{8}{2}$
  3. Standard Deviation: $\sigma_{\overline{X}}=\frac{\sigma}{\sqrt{n}} \Rightarrow 2=\frac{\sqrt{8}}{\sqrt{2}}=2$
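These checks can also be run numerically. The sketch below computes the population parameters directly from the five values and then applies the with-replacement formulas; it is only a small illustration, and the variable names are assumptions made here.

```python
from fractions import Fraction
from math import isclose, sqrt

population = [2, 4, 6, 8, 10]
N, n = len(population), 2

# Population mean and variance (dividing by N, not N - 1).
mu = Fraction(sum(population), N)
sigma2 = Fraction(sum(x**2 for x in population), N) - mu**2
sigma = sqrt(sigma2)

# Values predicted for the sampling distribution under sampling with replacement.
var_xbar = sigma2 / n
sd_xbar = sigma / sqrt(n)

print(mu, sigma2, var_xbar)  # prints: 6 8 4
print(isclose(sd_xbar, 2.0))  # prints: True
```

The standard deviation is compared with `isclose` rather than `==` because `sqrt` returns a floating-point value.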

Sampling without Replacement

The number of possible samples for sampling without replacement is $\binom{N}{n}=\binom{5}{2}=10$.

| Samples | $\overline{X}$ | Samples | $\overline{X}$ |
|---|---|---|---|
| 2, 4 | 3 | 4, 8 | 6 |
| 2, 6 | 4 | 4, 10 | 7 |
| 2, 8 | 5 | 6, 8 | 7 |
| 2, 10 | 6 | 6, 10 | 8 |
| 4, 6 | 5 | 8, 10 | 9 |

The sampling distribution of sample means for sampling without replacement is

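| $\overline{X}$ | Freq | $P(\overline{X})$ | $\overline{X}P(\overline{X})$ | $\overline{X}^2$ | $\overline{X}^2P(\overline{X})$ |
|---|---|---|---|---|---|
| 3 | 1 | 1/10 | 3/10 | 9 | 9/10 |
| 4 | 1 | 1/10 | 4/10 | 16 | 16/10 |
| 5 | 2 | 2/10 | 10/10 | 25 | 50/10 |
| 6 | 2 | 2/10 | 12/10 | 36 | 72/10 |
| 7 | 2 | 2/10 | 14/10 | 49 | 98/10 |
| 8 | 1 | 1/10 | 8/10 | 64 | 64/10 |
| 9 | 1 | 1/10 | 9/10 | 81 | 81/10 |
| Total | 10 | 10/10 = 1 | 60/10 = 6 | | 390/10 = 39 |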

\begin{align*}
\mu_{\overline{X}} &= E(\overline{X}) = \Sigma \left[\overline{X}P(\overline{X})\right] = \frac{60}{10}=6\\
\sigma^2_{\overline{X}} &= E(\overline{X}^2) - [E(\overline{X})]^2=\Sigma \left[\overline{X}^2P(\overline{X})\right] - \left[\Sigma \left[\overline{X}P(\overline{X})\right]\right]^2\\
&= 39 - 6^2 = 3\\
\sigma_{\overline{X}} &= \sqrt{3}\approx 1.73
\end{align*}

Verifications:

  1. Mean: $\mu_{\overline{X}} = \mu \Rightarrow 6=6$
  2. Variance: $\sigma^2_{\overline{X}} = \frac{\sigma^2}{n}\cdot \left(\frac{N-n}{N-1}\right) \Rightarrow 3=\frac{8}{2}\cdot\left(\frac{5-2}{5-1}\right)=3$
  3. Standard Deviation: $\sigma_{\overline{X}}=\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \Rightarrow 1.73 \approx \frac{\sqrt{8}}{\sqrt{2}}\sqrt{\frac{5-2}{5-1}}=\sqrt{3}$
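The without-replacement case can be checked the same way. The sketch below assumes unordered samples, listed with `itertools.combinations`, and compares the variance of the sample means with the formula that includes the finite population correction factor $\frac{N-n}{N-1}$. Because every sample is equally likely, a plain average over the samples equals the probability-weighted sum used in the table.

```python
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6, 8, 10]
N, n = len(population), 2

# All unordered samples of size n drawn without replacement: C(N, n) of them.
samples = list(combinations(population, n))
means = [Fraction(sum(s), n) for s in samples]

# Each sample is equally likely, so simple averages give E(X-bar) and E(X-bar^2).
mean_xbar = sum(means) / len(means)
var_xbar = sum(m**2 for m in means) / len(means) - mean_xbar**2

# Population parameters and the finite population correction factor.
mu = Fraction(sum(population), N)
sigma2 = Fraction(sum(x**2 for x in population), N) - mu**2
fpc = Fraction(N - n, N - 1)

print(len(samples), mean_xbar, var_xbar, sigma2 / n * fpc)  # prints: 10 6 3 3
```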

Why is Sampling Distribution Important?

  • Inference: Sampling distribution of means allows users to make inferences about the population mean based on sample data.
  • Hypothesis Testing: It is crucial for hypothesis testing, where the researcher compares sample statistics to population parameters.
  • Confidence Intervals: It helps construct confidence intervals, which provide a range of values likely to contain the population mean.

Note that the sampling distribution of means provides a framework for understanding how sample means vary from sample to sample and how they relate to the population mean. This understanding is fundamental to statistical inference and decision-making.
