## Why Correlation Coefficient Values Lie between +1 and -1

We know that the ratio of the explained variation to the total variation is called the coefficient of determination, which is the square of the correlation coefficient. This ratio is non-negative and is therefore denoted by $r^2$. Thus,

\begin{align*}
r^2&=\frac{\text{Explained Variation}}{\text{Total Variation}}\\
&=\frac{\sum (\hat{Y}-\overline{Y})^2}{\sum (Y-\overline{Y})^2}
\end{align*}

It can be seen that if all of the total variation is explained, the ratio $r^2$ (the coefficient of determination) is one, and if all of the total variation is unexplained, then the explained variation, and hence the ratio $r^2$, is zero.
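The ratio can be checked numerically. The following is a minimal sketch, assuming a simple least-squares line $\hat{Y}=a+bX$ and a small made-up data set; the helper names `fit_line` and `r_squared` are illustrative and not part of any library.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for y_hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def r_squared(x, y):
    """Explained variation divided by total variation."""
    a, b = fit_line(x, y)
    y_bar = sum(y) / len(y)
    y_hat = [a + b * xi for xi in x]
    explained = sum((yh - y_bar) ** 2 for yh in y_hat)
    total = sum((yi - y_bar) ** 2 for yi in y)
    return explained / total


# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r2 = r_squared(x, y)
print(r2)  # explained = 3.6, total = 6, so the ratio is about 0.6
```

Because the explained variation can never exceed the total variation for a least-squares fit, the printed ratio always falls in the interval from 0 to 1.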

The square root of the coefficient of determination is called the correlation coefficient, given by

\begin{align*}
r&=\pm\sqrt{ \frac{\text{Explained Variation}}{\text{Total Variation}} }\\
&=\pm \sqrt{\frac{\sum (\hat{Y}-\overline{Y})^2}{\sum (Y-\overline{Y})^2}}
\end{align*}

and

\[\sum (\hat{Y}-\overline{Y})^2=\sum(Y-\overline{Y})^2-\sum (Y-\hat{Y})^2\]

therefore

\begin{align*}
r&=\pm\sqrt{ \frac{\sum(Y-\overline{Y})^2-\sum (Y-\hat{Y})^2} {\sum(Y-\overline{Y})^2} }\\
&=\pm\sqrt{1-\frac{\sum (Y-\hat{Y})^2}{\sum(Y-\overline{Y})^2}}\\
&=\pm\sqrt{1-\frac{\text{Unexplained Variation}}{\text{Total Variation}}}=\pm\sqrt{1-\frac{s_{y.x}^2}{s_y^2}}
\end{align*}

where $s_{y.x}^2=\frac{1}{n} \sum (Y-\hat{Y})^2$ and $s_y^2=\frac{1}{n} \sum (Y-\overline{Y})^2$.

\begin{align*}
\Rightarrow r^2&=1-\frac{s_{y.x}^2}{s_y^2}\\
\Rightarrow s_{y.x}^2&=s_y^2(1-r^2)
\end{align*}
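The identity $s_{y.x}^2=s_y^2(1-r^2)$ can be verified on a small example. This is a sketch under the assumption of a least-squares line and the same $\frac{1}{n}$ divisor used above; the function names and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Correlation coefficient from sums of cross-products and squares."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def variances(x, y):
    """Return (s_yx2, s_y2): residual and total variance, divisor n."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    s_yx2 = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / n
    s_y2 = sum((yi - y_bar) ** 2 for yi in y) / n
    return s_yx2, s_y2


# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
s_yx2, s_y2 = variances(x, y)
print(s_yx2, s_y2 * (1 - r ** 2))  # the two values agree up to rounding
```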

Since variances are non-negative (and $s_y^2>0$ whenever $Y$ is not constant),

\[\frac{s_{y.x}^2}{s_y^2}=1-r^2 \geq 0\]

Solving the inequality, we have

\begin{align*}
1-r^2 & \geq 0\\
\Rightarrow r^2 &\leq 1 \quad \text{or} \quad |r| \leq 1\\
\Rightarrow -1 &\leq r\leq 1
\end{align*}

**Alternative Proof**

Since $\rho(X,Y)=\rho(X^*,Y^*)$, where $X^*=\frac{X-\mu_X}{\sigma_X}$ and $Y^*=\frac{Y-\mu_Y}{\sigma_Y}$,

and as covariance is bilinear and $X^*$, $Y^*$ have zero mean and variance 1, therefore

\begin{align*}
\rho(X^*,Y^*)&=Cov(X^*,Y^*)=Cov\left\{\frac{X-\mu_X}{\sigma_X},\frac{Y-\mu_Y}{\sigma_Y}\right\}\\
&=\frac{Cov(X-\mu_X,Y-\mu_Y)}{\sigma_X\sigma_Y}\\
&=\frac{Cov(X,Y)}{\sigma_X \sigma_Y}=\rho(X,Y)
\end{align*}
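The effect of standardization can be illustrated on sample data. The sketch below treats a small made-up data set as the whole population (divisor $n$), so that the sample moments play the role of $\mu$ and $\sigma$; the helpers `mean`, `cov`, and `standardize` are illustrative names.

```python
import math


def mean(v):
    return sum(v) / len(v)


def cov(u, v):
    """Population covariance (divisor n); cov(v, v) is the variance."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v)) / len(u)


def standardize(v):
    """Subtract the mean and divide by the standard deviation."""
    m = mean(v)
    sd = math.sqrt(cov(v, v))
    return [(a - m) / sd for a in v]


# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

rho = cov(x, y) / math.sqrt(cov(x, x) * cov(y, y))
x_star, y_star = standardize(x), standardize(y)
cov_star = cov(x_star, y_star)

print(rho, cov_star)  # the correlation equals the covariance of the standardized values
```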

We also know that the variance of any random variable is non-negative; it is zero, i.e. $Var(X)=0$, if and only if $X$ is a constant (almost surely). Therefore $Var(X^* \pm Y^*)\geq 0$, where

\[Var(X^* \pm Y^*)=Var(X^*)+Var(Y^*)\pm2Cov(X^*,Y^*)\]

As $Var(X^*)=1$ and $Var(Y^*)=1$, the right-hand side equals $2\pm 2Cov(X^*,Y^*)$, which would be negative if $Cov(X^*,Y^*)$ were either greater than 1 or less than -1. Hence

\[1\geq \rho(X,Y)=\rho(X^*,Y^*)\geq -1\]
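The variance argument can be sketched numerically: for standardized variables, $Var(X^*\pm Y^*)$ reduces to $2\pm 2\rho$, and both quantities must be non-negative. The code below assumes simulated data from a seeded generator and population-style moments (divisor $n$); all names are illustrative.

```python
import math
import random


def mean(v):
    return sum(v) / len(v)


def cov(u, v):
    """Population covariance (divisor n); cov(v, v) is the variance."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v)) / len(u)


def standardize(v):
    m, sd = mean(v), math.sqrt(cov(v, v))
    return [(a - m) / sd for a in v]


rng = random.Random(0)  # fixed seed so the run is repeatable
results = []
for _ in range(5):
    # Hypothetical data: y depends partly on x, partly on noise.
    x = [rng.gauss(0, 1) for _ in range(50)]
    y = [0.5 * xi + rng.gauss(0, 1) for xi in x]
    xs, ys = standardize(x), standardize(y)
    rho = cov(xs, ys)
    total = [a + b for a, b in zip(xs, ys)]
    diff = [a - b for a, b in zip(xs, ys)]
    results.append((rho, cov(total, total), cov(diff, diff)))

for rho, var_sum, var_diff in results:
    # var_sum should match 2 + 2*rho and var_diff should match 2 - 2*rho
    print(round(rho, 4), round(var_sum, 4), round(var_diff, 4))
```

Since neither variance can drop below zero, $2+2\rho\geq 0$ and $2-2\rho\geq 0$, which is exactly the bound $-1\leq\rho\leq 1$.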

If $\rho(X,Y)=Cov(X^*,Y^*)=1$ then $Var(X^*-Y^*)=0$, making $X^*=Y^*$ almost surely. Similarly, if $\rho(X,Y)=Cov(X^*,Y^*)=-1$ then $Var(X^*+Y^*)=0$, making $X^*=-Y^*$ almost surely. In either case, $Y$ would be a linear function of $X$ almost surely.
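The boundary cases can be illustrated with exact linear relationships. This is a small sketch with made-up data; `pearson_r` is an illustrative helper, and floating-point rounding means the results are compared to $\pm 1$ only approximately.

```python
import math


def pearson_r(x, y):
    """Correlation coefficient from sums of cross-products and squares."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


x = [0.0, 1.0, 2.0, 3.0, 4.0]

# Exact increasing and decreasing linear functions of x.
r_pos = pearson_r(x, [3 * xi + 2 for xi in x])
r_neg = pearson_r(x, [-3 * xi + 2 for xi in x])

# Hypothetical data that is close to, but not exactly, linear.
r_noisy = pearson_r(x, [2.0, 4.5, 8.5, 10.0, 15.0])

print(r_pos, r_neg, r_noisy)  # about +1, about -1, and a value strictly inside (-1, 1)
```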

The same bound can also be proved using the Cauchy-Schwarz inequality.

We can see that the Correlation Coefficient values lie between -1 and +1.
