Coefficient of Determination Formula

Coefficient of Determination as a Link between Regression and Correlation Analysis


R-squared ($r^2$, the square of the correlation coefficient) gives the proportion of the total variation in the dependent variable ($Y$) that is explained by the independent (explanatory) variable ($X$). For this reason, $r^2$ is also called the coefficient of determination.

Since

\[r=\frac{\sum x_i y_i}{\sqrt{\sum x_i^2} \sqrt{\sum y_i^2}},\]

squaring both sides gives

\begin{align*}
r^2&=\frac{(\sum x_iy_i)^2}{(\sum x_i^2)(\sum y_i^2)}=\frac{\sum \hat{y}^2}{\sum y^2}\\
&=\frac{\text{Explained Variation}}{\text{Total Variation}}
\end{align*}

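where $r$ measures the degree of covariability of $X$ and $Y$. Note that the formulas used here are in deviation form, that is, $x=X-\bar{X}$ and $y=Y-\bar{Y}$, where $\bar{X}$ and $\bar{Y}$ are the means of $X$ and $Y$, and $\hat{y}$ denotes the fitted values from the regression of $y$ on $x$.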
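The equality of the two forms of $r^2$ can be checked numerically. With the least-squares slope $b=\sum xy/\sum x^2$ and fitted values $\hat{y}=bx$, we get $\sum \hat{y}^2=b^2\sum x^2=(\sum xy)^2/\sum x^2$, which is the numerator of the first form divided by $\sum x^2$. The following is a minimal Python sketch of that check; the data values and variable names are made up for illustration and are not taken from the article.

```python
import numpy as np

# Illustrative data (assumed for this sketch, not from the article)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Deviation form: subtract each variable's own mean
x = X - X.mean()
y = Y - Y.mean()

# Correlation coefficient in deviation form
r = np.sum(x * y) / (np.sqrt(np.sum(x**2)) * np.sqrt(np.sum(y**2)))

# Least-squares slope and fitted values, also in deviation form
b = np.sum(x * y) / np.sum(x**2)
y_hat = b * x

# Two routes to the coefficient of determination
r2_from_r = r**2                                      # square of r
r2_from_variation = np.sum(y_hat**2) / np.sum(y**2)   # explained / total

print(r2_from_r, r2_from_variation)
```

Both printed values should agree up to floating-point rounding, and `r` should match the off-diagonal entry of `np.corrcoef(X, Y)`.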

The role of $r^2$ as a link between regression and correlation analysis can be seen from the following cases.

  • If all the observations lie on the regression line, there is no scatter around it. In other words, the total variation of $Y$ is explained completely by the estimated regression line, so there is no unexplained variation. That is,
    \[\frac{\sum e^2}{\sum y^2}=\frac{\text{Unexplained Variation}}{\text{Total Variation}}=0\]
    Hence, $r^2=1$ (and $r=\pm 1$).
  • If the regression line explains only part of the variation in $Y$, there will be some unexplained variation, that is,
    \[\frac{\sum e^2}{\sum y^2}=\frac{\text{Unexplained Variation}}{\text{Total Variation}}>0\]
    and $r^2$ will be smaller than 1.
  • If the regression line does not explain any part of the variation of variable $Y$, that is,
    \[\frac{\sum e^2}{\sum y^2}=\frac{\text{Unexplained Variation}}{\text{Total Variation}}=1\Rightarrow \sum e^2 = \sum y^2\]
    then, $r^2=0$.

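All three cases follow because the total variation splits into explained and unexplained parts, $\sum y^2=\sum \hat{y}^2+\sum e^2$, so that $r^2=1-\frac{\sum e^2}{\sum y^2}=1-\frac{\text{unexplained variation}}{\text{total variation}}$.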
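The three cases can be illustrated with a small Python sketch that computes $r^2$ as one minus the ratio of unexplained to total variation. The helper name `r_squared` and the three data sets are assumptions made for this illustration only: an exact line, the same line with some added scatter, and a symmetric curve with no linear trend.

```python
import numpy as np

def r_squared(X, Y):
    """r^2 = 1 - (unexplained variation / total variation), in deviation form."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    x = X - X.mean()
    y = Y - Y.mean()
    b = np.sum(x * y) / np.sum(x**2)   # least-squares slope
    e = y - b * x                      # residuals from the fitted line
    return 1.0 - np.sum(e**2) / np.sum(y**2)

# Illustrative data (assumed, not from the article)
X = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
scatter = np.array([0.5, -0.5, 0.3, -0.4, 0.1])

perfect = r_squared(X, 2 * X + 1)             # all points on the line
partial = r_squared(X, 2 * X + 1 + scatter)   # line explains only part
none = r_squared(X, X**2)                     # no linear relationship

print(perfect, partial, none)
```

The first value should be 1, the second strictly between 0 and 1, and the third 0 (up to rounding), matching the three bullet points. The last case also shows that $r^2=0$ rules out only a linear relationship, since $Y=X^2$ depends on $X$ exactly.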
