Suppose we have a population of size $N$ with mean $\mu$ and variance $\sigma^2$. We draw all possible samples of size $n$ from this population, with or without replacement, and compute the mean $\overline{x}$ of each sample. Classifying these means into a frequency table gives the frequency distribution of means; the corresponding probability distribution is called the sampling distribution of means.
Sampling Distribution
A sampling distribution is the probability distribution of the values of a sample statistic (such as the mean, standard deviation, proportion, or difference between means) computed from all possible samples of size $n$ from a population. Some important sampling distributions are:
- Sampling Distribution of Means
- Sampling Distribution of the Difference Between Means
- Sampling Distribution of the Proportions
- Sampling Distribution of the Difference between Proportions
- Sampling Distribution of Variances
Notations of Sampling Distribution of Means
The following notation is used for the sampling distribution of means:
$\mu$: Population mean
$\sigma^2$: Population Variance
$\sigma$: Population Standard Deviation
$\mu_{\overline{X}}$: Mean of the Sampling Distribution of Means
$\sigma^2_{\overline{X}}$: Variance of Sampling Distribution of Means
$\sigma_{\overline{X}}$: Standard Deviation of the Sampling Distribution of Means
Formulas for Sampling Distribution of Means
The following formulas can be used to compute the mean, variance, and standard deviation of the sampling distribution of means:
\begin{align*}
\mu_{\overline{X}} &= E(\overline{X}) = \Sigma \left[\overline{X}P(\overline{X})\right]\\
\sigma^2_{\overline{X}} &= E(\overline{X}^2) - [E(\overline{X})]^2\\
\text{where}\\
E(\overline{X}^2) &= \Sigma \left[\overline{X}^2P(\overline{X})\right]\\
\sigma_{\overline{X}} &= \sqrt{E(\overline{X}^2) - [E(\overline{X})]^2}
\end{align*}
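The formulas above translate directly into code. The following is a minimal Python sketch, not part of the original text: the function name `moments` and the two-point distribution `toy` are hypothetical and chosen only for illustration. It assumes the distribution is given as a mapping from each value of $\overline{X}$ to its probability.

```python
from fractions import Fraction
from math import sqrt


def moments(dist):
    """Return (mean, variance, sd) of a discrete distribution.

    `dist` maps each possible value of the sample mean to its probability.
    """
    mean = sum(x * p for x, p in dist.items())          # E(X-bar)
    mean_sq = sum(x ** 2 * p for x, p in dist.items())  # E(X-bar^2)
    variance = mean_sq - mean ** 2                      # E(X-bar^2) - [E(X-bar)]^2
    return mean, variance, sqrt(variance)


# Illustrative distribution (an assumption, not from the text):
# the values 1 and 3, each with probability 1/2.
toy = {1: Fraction(1, 2), 3: Fraction(1, 2)}
mean, variance, sd = moments(toy)
print(mean, variance, sd)
```

Using `Fraction` keeps the mean and variance exact, which makes it easy to compare them with hand-computed table totals.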
Numerical Example: Sampling Distribution of Means
A population of size $N=5$ has the values 2, 4, 6, 8, and 10. Draw all possible samples of size $n=2$ from this population with and without replacement. Construct the sampling distribution of sample means. Find the mean, variance, and standard deviation of the population and verify the following:
Sr. No. | Sampling with Replacement | Sampling without Replacement |
---|---|---|
1) | $\mu_{\overline{X}} = \mu$ | $\mu_{\overline{X}} = \mu$ |
2) | $\sigma^2_{\overline{X}}=\frac{\sigma^2}{n}$ | $\sigma^2_{\overline{X}}=\frac{\sigma^2}{n}\frac{N-n}{N-1}$ |
3) | $\sigma_{\overline{X}} = \frac{\sigma}{\sqrt{n}}$ | $\sigma_{\overline{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$ |
Solution
The solution to the above example is as follows:
Sampling with Replacement (Mean, Variance, and Standard Deviation)
The number of possible samples is $N^n = 5^2 = 25$.
Samples | $\overline{X}$ | Samples | $\overline{X}$ | Samples | $\overline{X}$ |
---|---|---|---|---|---|
2, 2 | 2 | 4, 10 | 7 | 8, 8 | 8 |
2, 4 | 3 | 6, 2 | 4 | 8, 10 | 9 |
2, 6 | 4 | 6, 4 | 5 | 10, 2 | 6 |
2, 8 | 5 | 6, 6 | 6 | 10, 4 | 7 |
2, 10 | 6 | 6, 8 | 7 | 10, 6 | 8 |
4, 2 | 3 | 6, 10 | 8 | 10, 8 | 9 |
4, 4 | 4 | 8, 2 | 5 | 10, 10 | 10 |
4, 6 | 5 | 8, 4 | 6 | | |
4, 8 | 6 | 8, 6 | 7 | | |
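The 25 ordered samples in the table above can be enumerated mechanically. A minimal Python sketch, assuming the standard-library `itertools.product` is an acceptable way to model sampling with replacement (the variable names are illustrative):

```python
from itertools import product

population = [2, 4, 6, 8, 10]
n = 2

# Ordered samples drawn with replacement: N**n = 5**2 = 25 of them.
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

# Print each sample next to its mean, in the same order as the table.
for s, m in zip(samples, sample_means):
    print(s, m)
```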
The sampling distribution of sample means will be
$\overline{X}$ | Freq | $P(\overline{X})$ | $\overline{X}P(\overline{X})$ | $\overline{X}^2$ | $\overline{X}^2P(\overline{X})$ |
---|---|---|---|---|---|
2 | 1 | 1/25 | 2/25 | 4 | 4/25 |
3 | 2 | 2/25 | 6/25 | 9 | 18/25 |
4 | 3 | 3/25 | 12/25 | 16 | 48/25 |
5 | 4 | 4/25 | 20/25 | 25 | 100/25 |
6 | 5 | 5/25 | 30/25 | 36 | 180/25 |
7 | 4 | 4/25 | 28/25 | 49 | 196/25 |
8 | 3 | 3/25 | 24/25 | 64 | 192/25 |
9 | 2 | 2/25 | 18/25 | 81 | 162/25 |
10 | 1 | 1/25 | 10/25 | 100 | 100/25 |
Total | – | 25/25=1 | 150/25 = 6 | – | 1000/25=40 |
\begin{align*}
\mu_{\overline{X}} &= E(\overline{X}) = \Sigma \left[\overline{X}P(\overline{X})\right] = \frac{150}{25}=6\\
\sigma^2_{\overline{X}} &= E(\overline{X}^2) - [E(\overline{X})]^2=\Sigma [\overline{X}^2P(\overline{X})] - \left[\Sigma [\overline{X}P(\overline{X})]\right]^2\\
&= 40 - 6^2 = 4\\
\sigma_{\overline{X}} &= \sqrt{4}=2
\end{align*}
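As a cross-check on the table totals and the computation above, the sampling distribution can be built by tallying the means of all 25 samples. This is a sketch under the assumption that exact rational arithmetic (`fractions.Fraction`) is wanted; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import sqrt

population = [2, 4, 6, 8, 10]
n = 2

# Mean of every ordered sample drawn with replacement.
means = [Fraction(sum(s), n) for s in product(population, repeat=n)]
freq = Counter(means)  # frequency of each distinct mean
total = len(means)     # 25 samples

# Probability distribution of the sample mean.
dist = {m: Fraction(f, total) for m, f in sorted(freq.items())}

mu_xbar = sum(m * p for m, p in dist.items())
var_xbar = sum(m ** 2 * p for m, p in dist.items()) - mu_xbar ** 2
sd_xbar = sqrt(var_xbar)

print(mu_xbar, var_xbar, sd_xbar)
```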
Mean, Variance, and Standard Deviation for Population
The following are computations for population values.
Quantity | | | | | | Total |
---|---|---|---|---|---|---|
$X$ | 2 | 4 | 6 | 8 | 10 | 30 |
$X^2$ | 4 | 16 | 36 | 64 | 100 | 220 |
\begin{align*}
\mu &= \frac{\Sigma X}{N} = \frac{30}{5} = 6\\
\sigma^2 &= \frac{\Sigma X^2}{N} - \left(\frac{\Sigma X}{N} \right)^2\\
&=\frac{220}{5} - (6)^2 = 8\\
\sigma&= \sqrt{8} \approx 2.83
\end{align*}
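The population values can also be checked with Python's standard `statistics` module. Note the assumption here: `pvariance` and `pstdev` divide by $N$ (population formulas), which matches the computation above, whereas `variance` and `stdev` would divide by $N-1$.

```python
import statistics

population = [2, 4, 6, 8, 10]

mu = statistics.mean(population)           # population mean
sigma2 = statistics.pvariance(population)  # population variance (divides by N)
sigma = statistics.pstdev(population)      # population standard deviation

print(mu, sigma2, round(sigma, 2))
```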
Verifications:
- Mean: $\mu_{\overline{X}} = \mu \Rightarrow 6=6$
- Variance: $\sigma^2_{\overline{X}} = \frac{\sigma^2}{n} \Rightarrow 4=\frac{8}{2}$
- Standard Deviation: $\sigma_{\overline{X}}=\frac{\sigma}{\sqrt{n}} \Rightarrow 2=\frac{\sqrt{8}}{\sqrt{2}}=2$
Sampling without Replacement
The number of possible samples for sampling without replacement is $\binom{N}{n}=\binom{5}{2}=10$.
Samples | $\overline{x}$ | Samples | $\overline{x}$ |
---|---|---|---|
2, 4 | 3 | 4, 8 | 6 |
2, 6 | 4 | 4, 10 | 7 |
2, 8 | 5 | 6, 8 | 7 |
2, 10 | 6 | 6, 10 | 8 |
4, 6 | 5 | 8, 10 | 9 |
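The enumeration above can be reproduced with a short Python sketch. It assumes that unordered samples without replacement are modelled by the standard-library `itertools.combinations`; tallying the sample means with `Counter` gives the frequencies needed for the sampling distribution.

```python
from collections import Counter
from itertools import combinations

population = [2, 4, 6, 8, 10]
n = 2

# Unordered samples drawn without replacement: C(5, 2) = 10 of them.
samples = list(combinations(population, n))

# Frequency of each distinct sample mean.
freq = Counter(sum(s) / n for s in samples)

# One row per distinct mean: value, frequency, probability.
for m in sorted(freq):
    print(m, freq[m], f"{freq[m]}/{len(samples)}")
```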
The sampling distribution of sample means for sampling without replacement is
$\overline{x}$ | Freq | $P(\overline{x})$ | $\overline{x}P(\overline{x})$ | $\overline{x}^2$ | $\overline{x}^2P(\overline{x})$ |
---|---|---|---|---|---|
3 | 1 | 1/10 | 3/10 | 9 | 9/10 |
4 | 1 | 1/10 | 4/10 | 16 | 16/10 |
5 | 2 | 2/10 | 10/10 | 25 | 50/10 |
6 | 2 | 2/10 | 12/10 | 36 | 72/10 |
7 | 2 | 2/10 | 14/10 | 49 | 98/10 |
8 | 1 | 1/10 | 8/10 | 64 | 64/10 |
9 | 1 | 1/10 | 9/10 | 81 | 81/10 |
Total | – | 10/10=1 | 60/10=6 | – | 390/10 = 39 |
\begin{align*}
\mu_{\overline{X}} &= E(\overline{X}) = \Sigma \left[\overline{X}P(\overline{X})\right] = \frac{60}{10}=6\\
\sigma^2_{\overline{X}} &= E(\overline{X}^2) - [E(\overline{X})]^2=\Sigma [\overline{X}^2P(\overline{X})] - \left[\Sigma [\overline{X}P(\overline{X})]\right]^2\\
&= 39 - 6^2 = 3\\
\sigma_{\overline{X}} &= \sqrt{3}\approx 1.73
\end{align*}
Verifications:
- Mean: $\mu_{\overline{X}} = \mu \Rightarrow 6=6$
- Variance: $\sigma^2_{\overline{X}} = \frac{\sigma^2}{n}\cdot \left(\frac{N-n}{N-1}\right) \Rightarrow 3=\frac{8}{2}\cdot\left(\frac{5-2}{5-1}\right)=3$
- Standard Deviation: $\sigma_{\overline{X}}=\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \Rightarrow 1.73 \approx \frac{\sqrt{8}}{\sqrt{2}}\sqrt{\frac{5-2}{5-1}}=\sqrt{3}$
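The without-replacement verification can be sketched end to end in Python. The code below is an illustration rather than a general proof: it computes the variance of the sample mean over all $\binom{5}{2}$ samples and compares it with $\frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}$. The variable names (`fpc`, `var_theory`) are illustrative.

```python
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6, 8, 10]
N, n = len(population), 2

# Population mean and variance (divide by N).
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N

# Exact mean and variance of the sample mean over all C(N, n) samples.
means = [Fraction(sum(s), n) for s in combinations(population, n)]
mu_xbar = sum(means) / len(means)
var_xbar = sum((m - mu_xbar) ** 2 for m in means) / len(means)

# Theoretical variance with the finite population correction (N - n)/(N - 1).
fpc = Fraction(N - n, N - 1)
var_theory = sigma2 / n * fpc

print(mu_xbar, var_xbar, var_theory)
```

Here the variance is computed as the mean squared deviation, which is algebraically equivalent to the $E(\overline{X}^2) - [E(\overline{X})]^2$ form used in the tables.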
Why is Sampling Distribution Important?
- Inference: Sampling distribution of means allows users to make inferences about the population mean based on sample data.
- Hypothesis Testing: It is crucial for hypothesis testing, where the researcher compares sample statistics to population parameters.
- Confidence Intervals: It helps construct confidence intervals, which provide a range of values likely to contain the population mean.
Note that the sampling distribution of means provides a framework for understanding how sample means vary from sample to sample and how they relate to the population mean. This understanding is fundamental to statistical inference and decision-making.