## F Distribution: Ratios of Two Independent Estimators

The *F*-distribution (also known as Snedecor's *F* distribution or the Fisher-Snedecor distribution) is a continuous probability distribution named in honor of **R.A. Fisher** and **George W. Snedecor**. It arises frequently as the null distribution of a test statistic in hypothesis testing, is used to construct confidence intervals, and underlies the analysis of variance for comparing several population means.

If $s_1^2$ and $s_2^2$ are two unbiased estimates of the population variance $\sigma^2$ obtained from independent samples of sizes $n_1$ and $n_2$ respectively from the same normal population, then the *F*-ratio is defined as

\[F=\frac{s_1^2}{s_2^2}=\frac{\frac{(n_1-1)\frac{s_1^2}{\sigma^2}}{v_1}}{\frac{(n_2-1)\frac{s_2^2}{\sigma^2}}{v_2}}\]

where $v_1=n_1-1$ and $v_2=n_2-1$. Since $\chi_1^2=(n_1-1)\frac{s_1^2}{\sigma^2}$ and $\chi_2^2=(n_2-1)\frac{s_2^2}{\sigma^2}$ are distributed independently as $\chi^2$ with $v_1$ and $v_2$ degrees of freedom respectively, we have

\[F=\frac{\frac{\chi_1^2}{v_1}}{\frac{\chi_2^2}{v_2}}\]

So, the *F* distribution is the distribution of the ratio of two independent chi-square ($\chi^2$) statistics, each divided by its respective degrees of freedom.
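The construction above can be checked with a small simulation. The following is only a sketch: the helper names (`sample_variance`, `simulate_f_ratios`), the sample sizes, the value of `sigma`, and the seed are all illustrative assumptions, not part of the original text. It draws pairs of independent normal samples from the same population, forms $s_1^2/s_2^2$ each time, and compares the average ratio with the known mean $v_2/(v_2-2)$ of the *F* distribution.

```python
import random

def sample_variance(xs):
    """Unbiased sample variance (divides by n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def simulate_f_ratios(n1, n2, sigma=2.0, reps=20000, seed=0):
    """Draw `reps` ratios s1^2 / s2^2 from two independent normal samples
    taken from the same N(0, sigma^2) population (hypothetical settings)."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(reps):
        sample1 = [rng.gauss(0.0, sigma) for _ in range(n1)]
        sample2 = [rng.gauss(0.0, sigma) for _ in range(n2)]
        ratios.append(sample_variance(sample1) / sample_variance(sample2))
    return ratios

n1, n2 = 11, 21                 # so v1 = 10 and v2 = 20
v2 = n2 - 1
ratios = simulate_f_ratios(n1, n2)
simulated_mean = sum(ratios) / len(ratios)

print("smallest simulated F:", min(ratios))      # every ratio is positive
print("simulated mean of F :", simulated_mean)
print("theoretical mean    :", v2 / (v2 - 2))
```

Because $\sigma^2$ cancels in the ratio, changing `sigma` should not change the behavior of the simulated ratios, only the individual sample variances.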

**Properties**

- *F* takes only non-negative values, since the numerator and denominator of the *F*-ratio are squared quantities.
- The range of *F* values is from 0 to infinity.
- The shape of the *F*-curve depends on the parameters $v_1$ and $v_2$ (its numerator and denominator df). It is a non-symmetrical distribution, skewed to the right (positively skewed). It tends to become more and more symmetric as one or both of the parameter values ($v_1$, $v_2$) increase, as shown in the following figure.
- It is asymptotic. As the *X* values increase, the *F*-curve approaches the X-axis but never touches or crosses it (behavior similar to that of the normal probability distribution).
- *F* has a unique mode at the value \[\tilde{F}=\frac{v_2(v_1-2)}{v_1(v_2+2)},\quad (v_1>2)\] which is always less than unity.
- The mean of *F* is $\mu=\frac{v_2}{v_2-2},\quad (v_2>2)$
- The variance of *F* is \[\sigma^2=\frac{2v_2^2(v_1+v_2-2)}{v_1(v_2-2)^2(v_2-4)},\quad (v_2>4)\]
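As a quick numeric illustration of the mode, mean, and variance, the sketch below evaluates the standard closed forms for an *F* distribution with $v_1$ and $v_2$ degrees of freedom. The function names and the example values $v_1=5$, $v_2=10$ are assumptions chosen for illustration.

```python
def f_mode(v1, v2):
    """Mode of F(v1, v2); defined here only for v1 > 2."""
    if v1 <= 2:
        raise ValueError("this mode formula requires v1 > 2")
    return (v1 - 2) / v1 * v2 / (v2 + 2)

def f_mean(v1, v2):
    """Mean of F(v1, v2); exists only for v2 > 2."""
    if v2 <= 2:
        raise ValueError("the mean requires v2 > 2")
    return v2 / (v2 - 2)

def f_variance(v1, v2):
    """Variance of F(v1, v2); exists only for v2 > 4."""
    if v2 <= 4:
        raise ValueError("the variance requires v2 > 4")
    return 2 * v2**2 * (v1 + v2 - 2) / (v1 * (v2 - 2) ** 2 * (v2 - 4))

v1, v2 = 5, 10   # illustrative degrees of freedom
print("mode    :", f_mode(v1, v2))      # (3/5) * (10/12) = 0.5
print("mean    :", f_mean(v1, v2))      # 10/8 = 1.25
print("variance:", f_variance(v1, v2))  # 2600/1920, about 1.354
```

The mode is the product of two factors, $(v_1-2)/v_1$ and $v_2/(v_2+2)$, each smaller than 1, which is why it always lies below unity while the mean always lies above it.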

*Assumptions of F-distribution*

The statistical procedure for comparing the variances of two populations makes the following assumptions:

- The two populations (from which the samples are drawn) follow the Normal distribution.
- The two samples are random samples drawn independently from their respective populations.

The statistical procedure for comparing the means of three or more populations makes the following assumptions:

- The populations follow the Normal distribution.
- The populations have equal standard deviations $\sigma$.
- The populations are independent of each other.

*Note*

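Tests based on this distribution are relatively insensitive to moderate violations of the assumption of normality of the parent populations or of the assumption of equal variances. This robustness applies mainly to the comparison of means; the *F*-test for comparing two variances is known to be sensitive to departures from normality.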

*Use of F-Distribution Table*

For a given (specified) level of significance $\alpha$, the symbol $F_\alpha(v_1,v_2)$ represents the upper (right-hand side) $100\alpha\%$ point of an *F* distribution having $v_1$ and $v_2$ df.

The lower (left-hand side) percentage point can be found by taking the reciprocal of the *F*-value corresponding to the upper (right-hand side) percentage point, with the numbers of df interchanged, i.e. \[F_{1-\alpha}(v_1,v_2)=\frac{1}{F_\alpha(v_2,v_1)}\]
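The reciprocal rule can be checked numerically without a printed table. The sketch below is one possible approach using only the Python standard library, not a reference implementation. It relies on the fact that if $F$ has an *F* distribution with $v_1$ and $v_2$ df, then $v_1F/(v_1F+v_2)$ follows a Beta distribution with parameters $v_1/2$ and $v_2/2$. The function names `beta_cdf` and `f_upper_point` are hypothetical, and the simple Simpson integration assumes $v_1 \ge 2$ and $v_2 \ge 2$.

```python
import math

def beta_cdf(t, a, b, steps=2000):
    """P(T <= t) for T ~ Beta(a, b) by Simpson's rule.
    Assumes a >= 1, b >= 1 and an even number of steps."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    norm = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))

    def g(u):
        # Unnormalized Beta(a, b) density.
        return u ** (a - 1) * (1 - u) ** (b - 1)

    h = t / steps
    total = g(0.0) + g(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return norm * total * h / 3

def f_upper_point(alpha, v1, v2):
    """F_alpha(v1, v2): the value exceeded with probability alpha."""
    a, b = v1 / 2, v2 / 2
    lo, hi = 0.0, 1.0            # bisection on the Beta scale
    for _ in range(50):
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    t = (lo + hi) / 2
    return v2 * t / (v1 * (1 - t))   # map the Beta value back to the F scale

alpha, v1, v2 = 0.05, 5, 10          # illustrative choices
upper = f_upper_point(alpha, v1, v2)
lower_direct = f_upper_point(1 - alpha, v1, v2)
lower_via_reciprocal = 1 / f_upper_point(alpha, v2, v1)

print("upper 5% point F_0.05(5, 10)   :", upper)
print("lower point, computed directly :", lower_direct)
print("lower point, via the reciprocal:", lower_via_reciprocal)
```

The two lower-point values should agree up to numerical error, and the upper point can be compared with the entry for 5 and 10 df in a standard 5% *F* table (about 3.33).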

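The density curve of the variable *F* is given by

\[Y=k\,F^{\frac{v_1}{2}-1}\left(1+\frac{v_1F}{v_2}\right)^{-\frac{v_1+v_2}{2}}\]

where $k$ is a constant depending on $v_1$ and $v_2$, chosen so that the total area under the curve is 1.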
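To make the density concrete, the sketch below evaluates it in the standard parameterization, where the normalizing constant is $k=(v_1/v_2)^{v_1/2}/B(v_1/2,\,v_2/2)$ and $B$ is the beta function. The function name `f_pdf`, the integration grid, and the example degrees of freedom are assumptions made for illustration. The code checks two things numerically: that the area under the curve is close to 1, and that the curve peaks near the closed-form mode $\frac{v_2(v_1-2)}{v_1(v_2+2)}$.

```python
import math

def f_pdf(x, v1, v2):
    """Density of F(v1, v2) at x, in the standard parameterization.
    Returns 0 for x <= 0 (at x = 0 this is exact only when v1 > 2)."""
    if x <= 0:
        return 0.0
    a, b = v1 / 2, v2 / 2
    # log of the constant k = (v1/v2)^(v1/2) / B(v1/2, v2/2)
    log_k = a * math.log(v1 / v2) + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_k + (a - 1) * math.log(x) - (a + b) * math.log1p(v1 * x / v2))

v1, v2 = 5, 10        # illustrative degrees of freedom

# Trapezoidal rule on [0, 50]; the tail beyond 50 is negligible for these df.
h = 0.01
n = 5000
area = h * (0.5 * f_pdf(0.0, v1, v2)
            + sum(f_pdf(i * h, v1, v2) for i in range(1, n))
            + 0.5 * f_pdf(n * h, v1, v2))

# Locate the peak on a fine grid and compare with the closed-form mode.
grid = [i * 0.001 for i in range(1, 5001)]
grid_mode = max(grid, key=lambda x: f_pdf(x, v1, v2))
formula_mode = (v1 - 2) / v1 * v2 / (v2 + 2)

print("area under the curve:", area)
print("mode found on grid  :", grid_mode)
print("mode from formula   :", formula_mode)
```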
