This article covers the one-way Analysis of Variance (ANOVA). In the analysis of variance, the total variation in the sample data is split into meaningful components that measure different sources of variation. Each component yields an estimate of the population variance, and these estimates are tested for homogeneity using the F-distribution.
One Way Classification (Single Factor Experiments)
The classification of observations based on a single criterion or factor is called a one-way classification.
In single-factor experiments, independent samples are selected from $k$ populations, each with $n$ observations. Each sample is called a treatment, and each treatment has $n$ repetitions or replications. A treatment may be, for example, a fertilizer applied to the fields, a variety of a crop sown, or the temperature and humidity to which an item is subjected in a production process. The collected data, consisting of $kn$ observations ($k$ samples of $n$ observations each), can be arranged in a table with one column per treatment and one row per replication,
where
$X_{ij}$ is the $i$th observation receiving the $j$th treatment
$X_{\cdot j}=\sum\limits_{i=1}^n X_{ij}$ is the total of the observations receiving the $j$th treatment
$\overline{X}_{\cdot j}=\frac{X_{\cdot j}}{n}$ is the mean of the observations receiving the $j$th treatment
$X_{\cdot \cdot}=\sum\limits_{j=1}^k X_{\cdot j} = \sum\limits_{j=1}^k \sum\limits_{i=1}^n X_{ij}$ is the total of all observations
$\overline{\overline{X}} = \frac{X_{\cdot \cdot}}{kn}$ is the mean of all observations.
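The totals and means defined above can be computed directly from a data table. The following is a minimal sketch using only the Python standard library; the function name `summarize` and the data values are hypothetical and chosen only for illustration, with `data[j][i]` playing the role of $X_{ij}$.

```python
def summarize(data):
    """Return totals and means for a balanced one-way layout.

    data[j][i] holds X_ij: observation i under treatment j.
    """
    k = len(data)      # number of treatments
    n = len(data[0])   # replications per treatment (assumed equal)
    treatment_totals = [sum(sample) for sample in data]          # X_.j
    treatment_means = [total / n for total in treatment_totals]  # mean of treatment j
    grand_total = sum(treatment_totals)                          # X_..
    grand_mean = grand_total / (k * n)                           # mean of all kn observations
    return treatment_totals, treatment_means, grand_total, grand_mean


# Hypothetical data: k = 3 treatments, n = 4 replications each.
data = [
    [20, 22, 19, 23],
    [25, 27, 26, 30],
    [18, 17, 21, 20],
]
totals, means, grand_total, grand_mean = summarize(data)
print(totals, means, grand_total, grand_mean)
```

For this made-up table the treatment totals are 84, 108, and 76, so the treatment means are 21, 27, and 19, and the grand total is 268.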
The $k$ treatments are assumed to be homogeneous; that is, the $k$ random samples are assumed to come from the same approximately normal parent population with mean $\mu$ and variance $\sigma^2$.
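In this notation, the splitting of the total variation can be written as the standard sum-of-squares identity for a balanced design. The labels "total", "between treatments", and "within treatments" are conventional names used here for illustration and do not appear in the text above.

$$\underbrace{\sum_{j=1}^k \sum_{i=1}^n \left(X_{ij}-\overline{\overline{X}}\right)^2}_{\text{total}} = \underbrace{n\sum_{j=1}^k \left(\overline{X}_{\cdot j}-\overline{\overline{X}}\right)^2}_{\text{between treatments}} + \underbrace{\sum_{j=1}^k \sum_{i=1}^n \left(X_{ij}-\overline{X}_{\cdot j}\right)^2}_{\text{within treatments}}$$

The corresponding degrees of freedom split the same way, $kn-1 = (k-1) + k(n-1)$. Dividing the between and within sums of squares by $k-1$ and $k(n-1)$, respectively, gives two estimates of $\sigma^2$ when the treatments are homogeneous, and their ratio is the statistic compared against the F-distribution with $(k-1,\ k(n-1))$ degrees of freedom.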
One Way Analysis of Variance Model
The linear model on which the one-way analysis of variance is based is
$$X_{ij} = \mu + \alpha_j + e_{ij}, \quad\quad i=1,2,\cdots, n; \quad j=1,2,\cdots, k$$
where $X_{ij}$ is the $i$th observation in the $j$th treatment, $\mu$ is the overall mean for all treatments, $\alpha_j$ is the effect of the $j$th treatment, and $e_{ij}$ is the random error associated with the $i$th observation in the $j$th treatment.
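To make the model concrete, one can generate artificial data from it. The sketch below assumes normally distributed errors with a common standard deviation; the function name `simulate_one_way` and all parameter values are hypothetical.

```python
import random


def simulate_one_way(mu, alphas, sigma, n, seed=0):
    """Draw X_ij = mu + alpha_j + e_ij with e_ij ~ N(0, sigma^2).

    Returns a list of k samples (one per treatment), each with n observations.
    """
    rng = random.Random(seed)  # seeded so the illustration is reproducible
    return [
        [mu + alpha + rng.gauss(0.0, sigma) for _ in range(n)]
        for alpha in alphas
    ]


# Hypothetical parameters: overall mean 50, three treatment effects summing to zero.
samples = simulate_one_way(mu=50.0, alphas=[-2.0, 0.0, 2.0], sigma=1.5, n=5)
for j, sample in enumerate(samples, start=1):
    print(f"treatment {j}: mean = {sum(sample) / len(sample):.2f}")
```

Each simulated treatment mean should fall near $\mu + \alpha_j$, with the scatter around it governed by the error term.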
The One Way Analysis of Variance model is based on the following assumptions:
- The model assumes that each observation $X_{ij}$ is the sum of three linear components:
  - The true mean effect $\mu$
  - The true effect of the $j$th treatment $\alpha_j$
  - The random error $e_{ij}$ associated with the $i$th observation in the $j$th treatment
- The observations to which the $k$ treatments are applied are homogeneous.
- Each of the $k$ samples is selected randomly and independently from a normal population with mean $\mu$ and variance $\sigma^2_e$.
- The random error $e_{ij}$ is a normally distributed random variable with $E(e_{ij})=0$ and $Var(e_{ij})=\sigma^2_e$.
- The $k$ treatment effects must sum to zero $(\sum\limits_{j=1}^k \alpha_j =0)$.
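The zero-sum constraint on the treatment effects can be checked numerically. In a balanced design, the usual least-squares estimates are the grand mean for $\mu$ and the deviation of each treatment mean from the grand mean for $\alpha_j$. The sketch below uses hypothetical data and a hypothetical function name, `estimate_effects`.

```python
def estimate_effects(data):
    """Estimate mu and the alpha_j for a balanced one-way layout.

    data[j][i] is observation i under treatment j.
    Returns (grand_mean, effects), where effects[j] = treatment mean - grand mean.
    """
    k = len(data)
    n = len(data[0])  # equal replications assumed
    treatment_means = [sum(sample) / n for sample in data]
    grand_mean = sum(sum(sample) for sample in data) / (k * n)
    effects = [mean - grand_mean for mean in treatment_means]
    return grand_mean, effects


# Hypothetical data: k = 3 treatments, n = 4 replications each.
data = [
    [20, 22, 19, 23],
    [25, 27, 26, 30],
    [18, 17, 21, 20],
]
grand_mean, effects = estimate_effects(data)
print(effects, sum(effects))  # the sum is zero up to floating-point rounding
```

With equal sample sizes the estimated effects always sum to zero, mirroring the constraint placed on the true effects.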
Suppose you are comparing the yields of crops fertilized with three different mixtures. The yield (numerical) is the dependent variable, and the fertilizer type (categorical with 3 levels) is the independent variable. ANOVA helps you determine whether the fertilizer mixtures have a statistically significant effect on the average yield.
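The fertilizer comparison can be sketched numerically as follows. The yields are invented for illustration, and the function name `one_way_anova` is hypothetical. The code computes the between-treatment and within-treatment sums of squares and the resulting F ratio for a balanced design, using only the standard library.

```python
def one_way_anova(data):
    """Compute the one-way ANOVA F ratio for a balanced layout.

    data[j][i] is the yield of plot i under fertilizer mixture j.
    """
    k = len(data)
    n = len(data[0])  # equal replications assumed
    means = [sum(sample) / n for sample in data]
    grand_mean = sum(sum(sample) for sample in data) / (k * n)

    # Variation of treatment means around the grand mean.
    ss_between = n * sum((m - grand_mean) ** 2 for m in means)
    # Variation of observations around their own treatment mean.
    ss_within = sum((x - m) ** 2 for sample, m in zip(data, means) for x in sample)

    df_between = k - 1
    df_within = k * (n - 1)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "f": f_ratio,
    }


# Hypothetical yields for 3 fertilizer mixtures, 4 plots each.
yields = [
    [20, 22, 19, 23],
    [25, 27, 26, 30],
    [18, 17, 21, 20],
]
result = one_way_anova(yields)
print(result)
```

For these made-up yields the F ratio is about 18.35 on (2, 9) degrees of freedom, well above the tabulated 5% critical value of roughly 4.26, so the hypothesis of equal mean yields would be rejected at that level.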