# Coefficient of Determination Formula

Coefficient of Determination as a Link between Regression and Correlation Analysis

### Coefficient of Determination Formula in Statistics

R-squared ($r^2$, the square of the correlation coefficient) measures the proportion of the total variation in the dependent variable ($Y$) that is explained by the independent (explanatory) variable ($X$). For this reason, $r^2$ is called the coefficient of determination.

Since the correlation coefficient is

$r=\frac{\sum x_i y_i}{\sqrt{\sum x_i^2} \sqrt{\sum y_i^2}},$

the coefficient of determination is obtained by squaring it:

\begin{align*}
r^2&=\frac{(\sum x_iy_i)^2}{(\sum x_i^2)(\sum y_i^2)}=\frac{\sum \hat{y}^2}{\sum y^2}\\
&=\frac{\text{Explained Variation}}{\text{Total Variation}}
\end{align*}

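where $r$ measures the degree to which $X$ and $Y$ vary together. Note that the formulas used here are in deviation form, that is, $x=X-\bar{X}$ and $y=Y-\bar{Y}$; $\hat{y}$ is the fitted value of $y$ from the regression line and $e=y-\hat{y}$ is the residual.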
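As a quick numerical check of the formula above, the following is a minimal sketch in plain Python. The function names `r_squared_from_sums` and `explained_over_total` and the five sample observations are illustrative assumptions, not part of the original text. It computes $r^2$ once from the deviation-form sums and once as explained variation over total variation.

```python
def deviations(values):
    """Return the values in deviation form (each value minus the sample mean)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def r_squared_from_sums(xs, ys):
    """r^2 = (sum xy)^2 / (sum x^2 * sum y^2), with x and y in deviation form."""
    x, y = deviations(xs), deviations(ys)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return sxy ** 2 / (sxx * syy)


def explained_over_total(xs, ys):
    """r^2 = sum(y_hat^2) / sum(y^2), where y_hat = slope * x in deviation form."""
    x, y = deviations(xs), deviations(ys)
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    y_hat = [slope * a for a in x]  # fitted values, deviation form
    return sum(v * v for v in y_hat) / sum(b * b for b in y)


# Hypothetical sample data, chosen only for illustration.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

print(r_squared_from_sums(X, Y))
print(explained_over_total(X, Y))
```

For this sample both ratios come out to 0.6 (up to floating-point rounding), so about 60% of the variation in $Y$ is explained by $X$.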

The role of $r^2$ as a link between regression and correlation analysis can be seen from the following three cases.

• If all the observations lie on the regression line, the estimated regression line explains the total variation of $Y$ completely, and there is no scatter of the data points around the line (no unexplained variation). That is,
$\frac{\sum e^2}{\sum y^2}=\frac{\text{Unexplained Variation}}{\text{Total Variation}}=0$
Hence, $r^2=1$ and $r=\pm 1$.
• If the regression line explains only part of the variation in variable $Y$, then there will be some unexplained variation, that is,
$\frac{\sum e^2}{\sum y^2}=\frac{\text{Unexplained Variation}}{\text{Total Variation}}>0$
and $r^2$ will be smaller than 1.
• If the regression line does not explain any part of the variation of variable $Y$, that is,
$\frac{\sum e^2}{\sum y^2}=\frac{\text{Unexplained Variation}}{\text{Total Variation}}=1\Rightarrow \sum y^2 = \sum e^2$
then $r^2=0$.

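These results follow because the total variation splits into explained and unexplained parts, $\sum y^2=\sum \hat{y}^2+\sum e^2$, so that $r^2=1-\frac{\text{Unexplained Variation}}{\text{Total Variation}}$.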
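The three cases can also be checked numerically. The sketch below is illustrative only: the helper `variation_ratios` and the three small data sets are assumptions made for this example. It fits a least-squares line in deviation form and returns both $r^2$ and the ratio of unexplained to total variation.

```python
def variation_ratios(xs, ys):
    """Return (r_squared, unexplained / total) for the least-squares line of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    x = [v - x_bar for v in xs]  # deviation form of X
    y = [v - y_bar for v in ys]  # deviation form of Y
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    residuals = [b - slope * a for a, b in zip(x, y)]  # e = y - y_hat
    total = sum(b * b for b in y)                      # sum of y^2
    unexplained = sum(e * e for e in residuals)        # sum of e^2
    ratio = unexplained / total
    return 1 - ratio, ratio


# Hypothetical data sets, one per case.
xs = [1, 2, 3, 4, 5]
perfect = variation_ratios(xs, [3, 5, 7, 9, 11])  # all points on a line
partial = variation_ratios(xs, [2, 4, 5, 4, 5])   # some scatter around the line
none = variation_ratios(xs, [1, -1, 0, -1, 1])    # no linear relationship

for label, (r2, ratio) in [("perfect", perfect), ("partial", partial), ("none", none)]:
    print(label, round(r2, 4), round(ratio, 4))
```

With these inputs the unexplained share is 0 for the perfect fit, strictly between 0 and 1 for the partial fit, and 1 when the line explains nothing, which matches $r^2=1$, $0<r^2<1$, and $r^2=0$ respectively.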