The degrees of freedom (df) refer to the number of observations in a sample minus the number of (population) parameters being estimated from the sample data. The degrees of freedom are therefore a function of both the sample size and the number of parameters estimated (in a regression setting, this depends on the number of independent variables). In other words, they are the number of independent observations out of a total of $n$ observations.
Degrees of Freedom
In statistics, the degrees of freedom are the number of values in a study that are free to vary. A real-life example: if you have to take ten different courses to graduate, and only ten different courses are offered, then you have nine degrees of freedom. In nine semesters you can choose which class to take; in the tenth semester only one class is left, so there is no choice if you want to graduate. This is the concept of degrees of freedom (df) in statistics.
Let a random sample of size $n$ be taken from a population with an unknown mean $\mu$, and let $\overline{X}$ denote the sample mean. The sum of the deviations from the sample mean is always equal to zero, i.e. $\sum_{i=1}^n (X_i-\overline{X})=0$. This places one constraint on the deviations $X_i-\overline{X}$ used when calculating the sample variance:
\[S^2 =\frac{\sum_{i=1}^n (X_i-\overline{X})^2 }{n-1}\]
This constraint (restriction) implies that $n-1$ deviations completely determine the $n$th deviation. The $n$ deviations (and therefore the sum of their squares and the sample variance $S^2$) have $n-1$ degrees of freedom.
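The constraint can be checked numerically. The following is a minimal sketch in Python, using made-up sample values that are not from any real data set; it shows that the deviations sum to zero, that the last deviation is fixed by the other $n-1$, and that dividing the sum of squares by $n-1$ matches the standard library's `statistics.variance`.

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
sample = [4.0, 7.0, 1.0, 8.0, 5.0]
n = len(sample)
mean = sum(sample) / n

# Deviations of each observation from the sample mean.
deviations = [x - mean for x in sample]

# Because the deviations sum to zero, the last one is forced
# by the first n - 1: it must equal minus their sum.
last_from_constraint = -sum(deviations[:-1])

# Sample variance: sum of squared deviations divided by n - 1.
sum_sq = sum(d ** 2 for d in deviations)
s2 = sum_sq / (n - 1)

print("sum of deviations:", sum(deviations))
print("last deviation:", deviations[-1], "recovered:", last_from_constraint)
print("S^2 with n - 1 divisor:", s2)
print("statistics.variance:", statistics.variance(sample))
```

Only $n-1$ of the deviations can be chosen freely here; once they are known, the remaining one carries no new information.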
A common way to think of df is the number of independent pieces of information available to estimate another piece of information. More concretely, the number of degrees of freedom is the number of independent observations in a sample of data that are available to estimate a parameter of the population from which that sample is drawn. For example, if we have two observations, when calculating the mean we have two independent observations; however, when calculating the variance, we have only one independent observation, since the two observations are equally distant from the mean.
Single sample: For $n$ observations, one parameter (the mean) needs to be estimated, which leaves $n-1$ degrees of freedom for estimating variability (dispersion).
Two samples: There are a total of $n_1+n_2$ observations ($n_1$ for group 1 and $n_2$ for group 2) and two means need to be estimated, which leaves $n_1+n_2-2$ degrees of freedom for estimating variability.
Regression with $p$ predictors: There are $n$ observations and $p+1$ parameters that need to be estimated (a regression coefficient for each predictor, plus the intercept). This leaves $n-p-1$ degrees of freedom for error, which is the error degrees of freedom reported in the ANOVA table.
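The three counts above can be written out as a short Python sketch. The function names and the two small groups of values are hypothetical and chosen only for illustration. The two-sample case is shown through the pooled variance, which divides the combined sum of squared deviations by $n_1+n_2-2$.

```python
def df_single_sample(n):
    """One mean estimated from n observations."""
    return n - 1

def df_two_samples(n1, n2):
    """Two group means estimated from n1 + n2 observations."""
    return n1 + n2 - 2

def df_regression_error(n, p):
    """p slopes plus one intercept estimated from n observations."""
    return n - p - 1

def sum_sq_dev(values):
    """Sum of squared deviations from the group's own mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

# Hypothetical data for two groups.
group1 = [2.0, 4.0, 6.0]
group2 = [1.0, 3.0, 5.0, 7.0]

pooled_df = df_two_samples(len(group1), len(group2))
pooled_variance = (sum_sq_dev(group1) + sum_sq_dev(group2)) / pooled_df

print("single sample, n = 10:", df_single_sample(10))
print("two samples:", pooled_df, "pooled variance:", pooled_variance)
print("regression, n = 20, p = 3:", df_regression_error(20, 3))
```

In each case the pattern is the same: start from the number of observations and subtract one for every parameter estimated from them.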
Several commonly encountered statistical distributions (Student’s t, Chi-Squared, F) have parameters that are commonly referred to as degrees of freedom. This terminology simply reflects that in many applications where these distributions occur, the parameter corresponds to the degrees of freedom of an underlying random vector. If $X_i,\ i=1,2,\cdots, n$, are independent normal $(\mu, \sigma^2)$ random variables, the statistic $\frac{\sum_{i=1}^n (X_i-\overline{X})^2}{\sigma^2}$ follows a chi-squared distribution with $n-1$ degrees of freedom. Here, the degrees of freedom arise from the residual sum of squares in the numerator, and in turn from the $n-1$ degrees of freedom of the underlying residual vector $(X_1-\overline{X}, \cdots, X_n-\overline{X})$.
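A small simulation can make the $n-1$ plausible. The sketch below uses only Python's standard library, with an assumed sample size, mean, and standard deviation picked for illustration. It relies on the fact that a chi-squared variable with $k$ degrees of freedom has mean $k$, so the average of the simulated statistic should land near $n-1$ rather than $n$. This is a sanity check on the mean only, not a full test of the distribution.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

n = 5                  # sample size (illustrative)
mu, sigma = 10.0, 2.0  # assumed population mean and standard deviation
reps = 20000           # number of simulated samples

def scaled_sum_of_squares(sample, sigma):
    """Return sum((x - xbar)^2) / sigma^2 for one sample."""
    xbar = sum(sample) / len(sample)
    return sum((x - xbar) ** 2 for x in sample) / sigma ** 2

# Draw many normal samples and compute the statistic for each.
stats = [
    scaled_sum_of_squares([random.gauss(mu, sigma) for _ in range(n)], sigma)
    for _ in range(reps)
]

average = sum(stats) / reps
print(f"average statistic: {average:.3f} (n - 1 = {n - 1})")
```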