The *F*-distribution (also known as Snedecor’s *F* distribution or the Fisher-Snedecor distribution) is a continuous probability distribution named in honor of **R.A. Fisher** and **George W. Snedecor**. It arises frequently as the null distribution of a test statistic in hypothesis testing, is used to construct confidence intervals, and appears in the analysis of variance for comparing several population means.


If $s_1^2$ and $s_2^2$ are two unbiased estimates of the population variance $\sigma^2$ obtained from independent samples of size $n_1$ and $n_2$ respectively from the same normal population, then the *F*-ratio is mathematically defined as

\[F=\frac{s_1^2}{s_2^2}=\frac{\frac{(n_1-1)\frac{s_1^2}{\sigma^2}}{v_1}}{\frac{(n_2-1)\frac{s_2^2}{\sigma^2}}{v_2}}\]

where $v_1=n_1-1$ and $v_2=n_2-1$. Since $\chi_1^2=(n_1-1)\frac{s_1^2}{\sigma^2}$ and $\chi_2^2=(n_2-1)\frac{s_2^2}{\sigma^2}$ are distributed independently as $\chi^2$ with $v_1$ and $v_2$ degrees of freedom respectively, we have

\[F=\frac{\frac{\chi_1^2}{v_1}}{\frac{\chi_2^2}{v_2}}\]

So, the *F* distribution is the distribution of the ratio of two independent chi-square ($\chi^2$) statistics, each divided by its respective degrees of freedom.
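The definition above can be illustrated with a small simulation. The following is a minimal sketch using only the Python standard library; the function names `f_ratio` and `simulate_f_ratios`, the sample sizes, and the seed are illustrative assumptions, not part of the original text.

```python
import random
import statistics


def f_ratio(sample1, sample2):
    """Ratio of the two unbiased sample variances, s1^2 / s2^2."""
    return statistics.variance(sample1) / statistics.variance(sample2)


def simulate_f_ratios(n1, n2, reps, sigma=1.0, seed=0):
    """Draw `reps` F-ratios from pairs of independent normal samples.

    Both samples come from the same normal population (mean 0, sd `sigma`),
    so each ratio behaves like an F variable with v1 = n1 - 1 and
    v2 = n2 - 1 degrees of freedom.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    ratios = []
    for _ in range(reps):
        sample1 = [rng.gauss(0.0, sigma) for _ in range(n1)]
        sample2 = [rng.gauss(0.0, sigma) for _ in range(n2)]
        ratios.append(f_ratio(sample1, sample2))
    return ratios


# Hypothetical sample sizes: v1 = 5 and v2 = 10 degrees of freedom.
ratios = simulate_f_ratios(n1=6, n2=11, reps=5000)
print(min(ratios) > 0)          # every simulated F-ratio is positive
print(statistics.mean(ratios))  # average of the simulated F-ratios
```

Because $\sigma^2$ cancels in the ratio, changing `sigma` should not change the distribution of the simulated values.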

### F Distribution Properties

- It takes only non-negative values, since the numerator and denominator of the *F*-ratio are squared quantities.
- The range of *F* values is from 0 to infinity.
- The shape of the *F*-curve depends on the parameters $v_1$ and $v_2$ (its numerator and denominator df). It is a non-symmetrical distribution, skewed to the right (positively skewed). It tends to become more and more symmetric as one or both of the parameter values ($v_1$, $v_2$) increase, as shown in the following figure.
- It is asymptotic. As the values of *F* increase, the *F*-curve approaches the horizontal axis but never touches or crosses it (similar behavior to the normal probability distribution).
- *F* has a unique mode at the value \[\tilde{F}=\frac{v_2(v_1-2)}{v_1(v_2+2)},\quad (v_1>2)\] which is always less than unity.
- The mean of *F* is $\mu=\frac{v_2}{v_2-2},\quad (v_2>2)$.
- The variance of *F* is \[\sigma^2=\frac{2v_2^2(v_1+v_2-2)}{v_1(v_2-2)^2(v_2-4)},\quad (v_2>4)\]
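As a quick numerical check of these properties, the sketch below evaluates the standard closed-form expressions for the mode, mean, and variance of an *F* distribution. The function name `f_summary` and the example degrees of freedom are assumptions made for illustration; `None` is returned wherever the corresponding formula does not apply.

```python
def f_summary(v1, v2):
    """Mode, mean and variance of F(v1, v2) from the closed-form formulas.

    Each entry is None when its formula does not apply:
    mode needs v1 > 2, mean needs v2 > 2, variance needs v2 > 4.
    """
    mode = v2 * (v1 - 2) / (v1 * (v2 + 2)) if v1 > 2 else None
    mean = v2 / (v2 - 2) if v2 > 2 else None
    if v2 > 4:
        variance = 2 * v2**2 * (v1 + v2 - 2) / (v1 * (v2 - 2) ** 2 * (v2 - 4))
    else:
        variance = None
    return {"mode": mode, "mean": mean, "variance": variance}


# Hypothetical example: v1 = 5 numerator df, v2 = 10 denominator df.
print(f_summary(5, 10))
```

For $v_1=5$ and $v_2=10$ the mode (0.5) lies below 1 while the mean (1.25) lies above it, which is consistent with the right skew described above.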

### Assumptions of F Distribution

The statistical procedure for comparing the variances of two populations rests on the following assumptions:

- The two populations (from which the samples are drawn) follow a normal distribution.
- The two samples are random samples drawn independently from their respective populations.

The statistical procedure for comparing the means of three or more populations rests on the following assumptions:

- Each population follows a normal distribution.
- The populations have equal standard deviations ($\sigma$).
- The populations are independent of each other.

### Note

The *F*-test used in the analysis of variance is relatively insensitive to moderate violations of the assumption of normality of the parent populations or the assumption of equal variances.

### Use of F Distribution Table

For a given (specified) level of significance $\alpha$, the symbol $F_\alpha(v_1,v_2)$ is used to represent the upper (right-hand side) $100\alpha\%$ point of an *F* distribution having $v_1$ and $v_2$ df.

The lower (left-hand side) percentage point can be found by taking the reciprocal of the *F*-value corresponding to the upper (right-hand side) percentage point, with the degrees of freedom interchanged, i.e. \[F_{1-\alpha}(v_1,v_2)=\frac{1}{F_\alpha(v_2,v_1)}\]

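The probability density function of the variable *F* is given by

\[Y=k\,F^{\frac{v_1}{2}-1}\left(1+\frac{v_1F}{v_2}\right)^{-\frac{v_1+v_2}{2}}\]

where $k$ is a constant that depends on $v_1$ and $v_2$, chosen so that the total area under the curve is unity.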
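The density and the percentage points described above can be tied together numerically. The sketch below is one possible approach, not a replacement for published tables: it uses the standard form of the density, with $v_1F/v_2$ inside the bracket and $k=\left(\frac{v_1}{v_2}\right)^{v_1/2}\big/B\!\left(\frac{v_1}{2},\frac{v_2}{2}\right)$, integrates it with Simpson's rule, and finds the upper $100\alpha\%$ point by bisection. The function names, the step count `n`, and the restriction to $v_1>2$ (so the density is zero and well behaved at the origin) are assumptions of this sketch.

```python
import math


def f_pdf(x, v1, v2):
    """Density of F(v1, v2) at x, assuming v1 > 2 (so the density is 0 at x = 0)."""
    if x <= 0:
        return 0.0
    # log of the normalising constant k = (v1/v2)^(v1/2) / B(v1/2, v2/2)
    log_k = (math.lgamma((v1 + v2) / 2) - math.lgamma(v1 / 2)
             - math.lgamma(v2 / 2) + (v1 / 2) * math.log(v1 / v2))
    log_density = (log_k + (v1 / 2 - 1) * math.log(x)
                   - ((v1 + v2) / 2) * math.log(1 + v1 * x / v2))
    return math.exp(log_density)


def f_cdf(x, v1, v2, n=2000):
    """P(F <= x) by composite Simpson's rule on [0, x]; n must be even."""
    if x <= 0:
        return 0.0
    h = x / n
    total = f_pdf(0.0, v1, v2) + f_pdf(x, v1, v2)
    for i in range(1, n):
        weight = 4 if i % 2 else 2  # Simpson weights: 4 at odd nodes, 2 at even
        total += weight * f_pdf(i * h, v1, v2)
    return total * h / 3


def upper_point(alpha, v1, v2):
    """Value f with P(F > f) = alpha, found by bisection on the CDF."""
    target = 1.0 - alpha
    lo, hi = 0.0, 1.0
    while f_cdf(hi, v1, v2) < target:  # widen the bracket until it contains f
        hi *= 2.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if f_cdf(mid, v1, v2) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Upper 5% point for v1 = 4, v2 = 10 (the tabulated value is 3.48).
print(round(upper_point(0.05, 4, 10), 2))
```

Calling `upper_point(0.95, 4, 10)` and `1 / upper_point(0.05, 10, 4)` should give nearly the same number, which is the reciprocal relation for lower percentage points in numerical form.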

