Partial Correlation Coefficient (2012)

The partial correlation coefficient measures the relationship between two variables while all other variables are held constant, that is, after controlling for (removing the influence of) the other variables. It isolates the variance that the two variables share which is not accounted for by a third variable. The partial correlation technique is commonly used in “causal” modeling with a small number of variables. The coefficient is computed from the simple correlation coefficients among the variables involved in the multiple relationship.

Assumptions for computing the Partial Correlation Coefficient

The assumptions for partial correlation are the usual assumptions of Pearson correlation:

  1. Linearity of relationships
  2. The same level of relationship throughout the range of the independent variable i.e. homoscedasticity
  3. Interval or near-interval data, and
  4. Data whose range is not truncated.

We typically conduct correlation analysis on all variables so that we can see whether there are significant relationships among the variables, including any “third variables” that may have a significant relationship to the variables under investigation.

This type of analysis helps to find spurious correlations (i.e. correlations that are explained by the effect of some other variables) and to reveal hidden correlations, i.e. correlations masked by the impact of other variables. The partial correlation coefficient $r_{xy.z}$ can also be defined as the correlation coefficient between the residuals $dx$ and $dy$ obtained by regressing $x$ and $y$, respectively, on $z$.

Partial Correlation Formula

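Suppose we have a sample of $n$ observations $(x_{11},x_{21},x_{31}), (x_{12},x_{22},x_{32}), \cdots, (x_{1n},x_{2n},x_{3n})$ from an unknown joint distribution of three random variables $X_1$, $X_2$, and $X_3$. We want to find the coefficient of partial correlation between $X_1$ and $X_2$ keeping $X_3$ constant. It is denoted by $r_{12.3}$ and defined as the correlation between the residuals $x_{1.3}$ and $x_{2.3}$, that is, what remains of $X_1$ and $X_2$ after each is regressed on $X_3$. Because one variable is held constant, $r_{12.3}$ is a partial correlation of the 1st order. It can be computed from the simple correlations $r_{12}$, $r_{13}$, and $r_{23}$: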

\[r_{12.3}=\frac{r_{12}-r_{13} r_{23}}{\sqrt{1-r_{13}^2 } \sqrt{1-r_{23}^2 } }\]
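The formula above translates almost directly into code. The following is a minimal sketch in Python; the function name `partial_corr_first_order` and the example correlations are illustrative assumptions, not values from any data set discussed here.

```python
import math

def partial_corr_first_order(r12, r13, r23):
    """First-order partial correlation r_12.3 from the three simple correlations."""
    denom = math.sqrt(1 - r13 ** 2) * math.sqrt(1 - r23 ** 2)
    if denom == 0:
        # If X3 is perfectly correlated with X1 or X2, nothing is left to correlate.
        raise ValueError("r13 or r23 equals +/-1, so r_12.3 is undefined")
    return (r12 - r13 * r23) / denom

# Made-up correlations, for illustration only.
print(round(partial_corr_first_order(0.5, 0.3, 0.4), 4))  # about 0.4346
```

Note that when $r_{12}=r_{13}r_{23}$ the numerator vanishes, so the whole of the simple correlation between the first two variables is accounted for by the third.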

Partial Correlation Coefficient

The coefficient of partial correlation between $X$ and $Y$, controlling for a third random variable $Z$, is denoted by $r_{xy.z}$. It can also be defined as the coefficient of correlation between the residuals $x_i-\hat{x}_i$ and $y_i-\hat{y}_i$ with
\begin{align*}
\hat{x}_i&=\hat{\beta}_{0x}+\hat{\beta}_{1x}z_i\\
\hat{y}_i&=\hat{\beta}_{0y}+\hat{\beta}_{1y}z_i
\end{align*}
where $\hat{\beta}_{0x}$ and $\hat{\beta}_{1x}$ are the least squares estimators obtained by regressing $x_i$ on $z_i$, and $\hat{\beta}_{0y}$ and $\hat{\beta}_{1y}$ are the least squares estimators obtained by regressing $y_i$ on $z_i$. Since least squares residuals have mean zero, the partial correlation between $x$ and $y$ controlling for $z$ is, by definition,
\[r_{xy.z}=\frac{\sum(x_i-\hat{x}_i)(y_i-\hat{y}_i)}{\sqrt{\sum(x_i-\hat{x}_i)^2}\sqrt{\sum(y_i-\hat{y}_i)^2}}\]
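The residual-based definition can be checked numerically. The sketch below uses simulated data (an assumption made purely for illustration) in which $x$ and $y$ are both driven by $z$ and are otherwise unrelated. It computes the partial correlation two ways: as the correlation between the residuals of the two regressions on $z$, and from the first-order formula in terms of simple correlations. The helper names `pearson` and `residuals` are hypothetical.

```python
import math
import random

def mean(v):
    return sum(v) / len(v)

def pearson(a, b):
    """Simple (Pearson) correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)

def residuals(y, z):
    """Residuals from the least squares regression of y on z (with intercept)."""
    my, mz = mean(y), mean(z)
    slope = (sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
             / sum((zi - mz) ** 2 for zi in z))
    intercept = my - slope * mz
    return [yi - (intercept + slope * zi) for yi, zi in zip(y, z)]

# Simulated data: x and y share the common cause z but nothing else.
random.seed(0)
n = 2000
z = [random.gauss(0, 1) for _ in range(n)]
x = [zi + random.gauss(0, 1) for zi in z]
y = [zi + random.gauss(0, 1) for zi in z]

r_xy = pearson(x, y)                                  # simple correlation
r_resid = pearson(residuals(x, z), residuals(y, z))   # residual definition
r_xz, r_yz = pearson(x, z), pearson(y, z)
r_formula = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

print(f"simple r_xy      = {r_xy:.3f}")
print(f"partial (resid.) = {r_resid:.3f}")
print(f"partial (formula)= {r_formula:.3f}")
```

In a setup like this the simple correlation between $x$ and $y$ is substantial, while both versions of the partial correlation agree with each other and sit close to zero, which is the pattern expected of a spurious correlation.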

Partial Correlation Analysis

Partial correlation analysis is a very helpful statistical tool for understanding the underlying relationships between variables, especially when potentially confounding factors are present. The partial correlation coefficients are determined in terms of the simple correlation coefficients among the various variables involved in a multiple relationship.

Reference

Yule, G. U. (1926). Why do we sometimes get non-sense correlations between time series? A study in sampling and the nature of time series. J. Roy. Stat. Soc. (2) 89, 1-64.

Pearson’s Correlation Coefficient SPSS (2012)

Pearson’s correlation coefficient (or the correlation coefficient, or simply correlation) is used to find the degree of linear relationship between two continuous variables. Its value lies between −1.00 (perfect negative correlation) and +1.00 (perfect positive correlation), with 0.00 indicating no linear correlation. Generally, correlations above 0.80 in absolute value are considered pretty high.

Remember:

  1. Correlation is the interdependence of continuous variables; it does not refer to cause and effect.
  2. Correlation is used to determine the linear relationship between variables.
  3. Draw a scatter plot before calculating the correlation (to check the assumption of linearity).
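The points above can be illustrated with a few lines of Python. This is a minimal sketch with made-up numbers; the function name `pearson_r` is a hypothetical helper, not an SPSS or library routine. The last case shows why a scatter plot is worth drawing first: a strong but curved relationship can give a coefficient of zero.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up data, for illustration only.
increasing = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])      # perfect positive line
decreasing = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])      # perfect negative line
curved = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])  # y = x**2, no linear trend

print(increasing, decreasing, curved)
```

The coefficient only summarizes the linear part of a relationship, so the curved case scores zero even though $y$ is completely determined by $x$.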

How to Perform Pearson’s Correlation Coefficient SPSS

The command for correlation is found at Analyze –> Correlate –> Bivariate.

The Bivariate Correlations dialog box will appear.

Select one of the variables that you want to correlate in the left-hand pane of the Bivariate Correlations dialog box and move it into the Variables pane on the right by clicking the arrow button. Then click the other variable that you want to correlate in the left-hand pane and move it into the Variables pane by clicking the arrow button.

Correlation Coefficient SPSS Output

The Correlations table in the output gives the values of the requested correlations, such as Pearson’s correlation. Each row of the table corresponds to one of the variables, and each column likewise corresponds to one of the variables.

Interpreting Correlation Coefficient

For example, the cell in the bottom row of the right column represents the correlation of depression with depression, which is equal to 1.0. Likewise, the cell in the middle row of the middle column represents the correlation of anxiety with anxiety, which is also 1.0. In both cases a variable is being correlated with itself, so the relationship is perfect.

The cell in the middle row and right column (or the cell in the bottom row at the middle column) is more interesting. This cell represents the correlation between anxiety and depression (or depression with anxiety). There are three numbers in these cells.

  1. The top number is the correlation coefficient value which is 0.310.
  2. The middle number is the significance of this correlation which is 0.018.
  3. The bottom number, 46, is the number of observations that were used to calculate the correlation coefficient between the variables of the study.

Note that the significance tells us how likely we would be to see a correlation this large purely due to chance factors if there were no actual relation. In this case, with a significance of 0.018, it is improbable that we would get an r (correlation coefficient) this big if there were no relation between the variables.
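The significance value can be reproduced approximately by hand. The sketch below assumes the standard test of a zero population correlation, in which $t = r\sqrt{(n-2)/(1-r^2)}$ follows Student's t distribution with $n-2$ degrees of freedom. It uses only the Python standard library, so the tail probability is obtained by numerical integration; the function names are hypothetical, and the inputs are the rounded values reported above ($r = 0.310$, $n = 46$).

```python
import math

def t_statistic(r, n):
    """t statistic for testing a zero correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c) * (1 + t * t / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T > |t|): 0.5 minus the Simpson's-rule integral of the density on [0, |t|]."""
    t = abs(t)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 - total * h / 3

r, n = 0.310, 46  # values read from the output described above
t = t_statistic(r, n)
one_tailed = upper_tail(t, n - 2)
two_tailed = 2 * one_tailed
print(f"t = {t:.3f}, one-tailed p = {one_tailed:.3f}, two-tailed p = {two_tailed:.3f}")
```

Under these assumptions the one-tailed probability comes out at about 0.018, in line with the reported significance, while the two-tailed probability is twice that. This suggests, although the output as described does not state it, that a one-tailed test was selected in the dialog box.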
