The sum of Squared Deviations from Mean (2015)

In statistics, the sum of squared deviations is a measure of the total variability (Measure of spread or variation) within a data set. In other words, the sum of squares is a measure of deviation or variation from the mean (average) value of the given data set. A sum of squares is calculated by first computing the differences between each data point (observation) and the mean of the data set, i.e. $x=X-\overline{X}$. The computed $x$ is known as the deviation score for the given data set. Squaring each of these deviation scores and then adding these squared deviation scores gave us the sum of squared deviation (SS), which is represented mathematically as


Note that the small letter $x$ usually represents the deviation of each observation from the mean value, while the capital letter $X$ represents the variable of interest in statistics.

The Sum of Squared Deviations Example

Consider the following data set {5, 6, 7, 10, 12}. To compute the sum of squares of this data set, follow these steps

  • Calculate the average of the given data by summing all the values in the data set and then divide this sum of numbers by the total number of observations in the data set. Mathematically, it is $\frac{\sum X_i}{n}=\frac{40}{5}=8$, where 40 is the sum of all numbers $5+6+7+10+12$ and there are 5 observations in number.
  • Calculate the difference of each observation in the data set from the average computed in step 1, for the given data. The differences are
    $5 – 8 = –3$; $6 – 8 = –2$; $7 – 8 = –1$; $10 – 8 =2$ and $12 – 8 = 4$
    Note that the sum of these differences should be zero. $(–3 + –2 + –1 + 2 +4 = 0)$
  • Now square each of the differences obtained in step 2. The square of these differences are
    9, 4, 1, 4 and 16
  • Now add the squared number obtained in step 3. The sum of these squared quantities will be $9 + 4 + 1 + 4 + 16 = 34$, which is the sum of the square of the given data set.
Sum of Squared Deviations

In statistics, the sum of squares occurs in different contexts such as

  • Partitioning of Variance (Partition of Sums of Squares)
  • The sum of Squared Deviations (Least Squares)
  • The sum of Squared Differences (Mean Squared Error)
  • The sum of Squared Error (Residual Sum of Squares)
  • The sum of Squares due to Lack of Fit (Lack of Fit Sum of Squares)
  • The sum of Squares for Model Predictions (Explained Sum of Squares)
  • The sum of Squares for Observations (Total Sum of Squares)
  • The sum of Squared Deviation (Squared Deviations)
  • Modeling involving the Sum of Squares (Analysis of Variance)
  • Multivariate Generalization of Sum of Square (Multivariate Analysis of Variance)

As previously discussed, the Sum of Squares is a measure of the Total Variability of a set of scores around a specific number.

Online MCQs Test Website

R Faqs

Leave a Comment

Discover more from Statistics for Data Analyst

Subscribe now to keep reading and get access to the full archive.

Continue reading