Error and Residual in Regression 12 - Learn Basic Statistics

Error and Residual in Regression

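In statistics and optimization, statistical errors and residuals are two closely related and easily confused measures of the deviation of an observed value from a reference value. An error measures the deviation from the true (unobservable) population value, whereas a residual measures the deviation from the estimated value.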

Errors

The term "error" is a misnomer, since it does not refer to a mistake; an error is the amount by which an observation differs from its expected value. The errors $e$ are unobservable random variables, assumed to have zero mean and uncorrelated elements, each with common variance $\sigma^2$.

Residual

A Residual, on the other hand, is an observable estimate of the unobservable error. The residuals $\hat{e}$ are computed quantities with mean ${E(\hat{e})=0}$ and variance ${V(\hat{e})=\sigma^2 (I-H)}$, where $I$ is the identity matrix and $H$ is the hat (projection) matrix of the fitted model.
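The following is a minimal sketch of how residuals and their theoretical variances might be computed with NumPy. The "true" model ($y = 2 + 3x + e$), the sample size, the value of $\sigma$, and all variable names are assumptions made only for illustration; the errors are visible here only because the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Hypothetical "true" model, known only because we simulate it:
# y = 2 + 3x + e, with e ~ N(0, sigma^2)
n, sigma = 20, 1.5
x = np.linspace(0, 10, n)
errors = rng.normal(0.0, sigma, size=n)  # unobservable in real data
y = 2.0 + 3.0 * x + errors

# Design matrix with an intercept column
X = np.column_stack([np.ones(n), x])

# Least-squares fit and the (observable) residuals
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat

# Hat matrix H = X (X'X)^{-1} X'
H = X @ np.linalg.solve(X.T @ X, X.T)

# Theoretical variance of each residual: sigma^2 * (1 - h_ii),
# i.e. the diagonal of sigma^2 * (I - H)
resid_var = sigma**2 * (1.0 - np.diag(H))

print("first three errors:   ", errors[:3])
print("first three residuals:", residuals[:3])
print("residual variances range from", resid_var.min(), "to", resid_var.max())
```

In this sketch the residual variances differ from point to point and are all smaller than $\sigma^2$, because each diagonal element $h_{ii}$ of the hat matrix is positive.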

Like the errors, each of the residuals has zero mean, but each residual may have a different variance. Unlike the errors, the residuals are correlated. The residuals are linear combinations of the errors. If the errors are normally distributed, so are the residuals.
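A short derivation makes these properties explicit. It assumes the standard linear model $y = X\beta + e$ with least-squares estimate $\hat{\beta}$; the symbols $X$, $\beta$, $\hat{\beta}$, $I$, and $h_{ij}$ are notation introduced here for illustration. With the hat matrix $H = X(X^\top X)^{-1}X^\top$, which satisfies $HX = X$:

$$
\hat{e} = y - X\hat{\beta} = (I - H)\,y = (I - H)(X\beta + e) = (I - H)\,e
$$

So each residual is a linear combination of the errors. Because $I - H$ is symmetric and idempotent, and $V(e) = \sigma^2 I$:

$$
V(\hat{e}) = (I - H)\,\sigma^2 I\,(I - H)^\top = \sigma^2 (I - H)
$$

The diagonal entries $\sigma^2(1 - h_{ii})$ generally differ from one observation to another, and the off-diagonal entries $-\sigma^2 h_{ij}$ are generally nonzero, which is why the residuals have unequal variances and are correlated.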

Note that, when the model includes an intercept term, the sum of the residuals is necessarily zero, and thus the residuals are necessarily not independent. The sum of the errors need not be zero; the errors are independent random variables if the individuals are chosen from the population independently.
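This contrast can be checked numerically. The sketch below simulates data from a hypothetical model ($y = 1 + 0.5x + e$, chosen only for illustration), fits a least-squares line with an intercept, and compares the two sums; all names and parameter values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the run is repeatable

# Simulated data from a hypothetical true model y = 1 + 0.5x + e
n = 50
x = rng.uniform(0, 10, size=n)
errors = rng.normal(0.0, 2.0, size=n)  # known only because we simulate
y = 1.0 + 0.5 * x + errors

# Least-squares fit with an intercept column
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat

sum_errors = errors.sum()        # generally not zero
sum_residuals = residuals.sum()  # zero up to floating-point rounding

print("sum of errors:   ", sum_errors)
print("sum of residuals:", sum_residuals)
```

The residuals sum to zero (up to rounding) because the least-squares fit forces them to be orthogonal to every column of the design matrix, including the intercept column of ones. No such constraint applies to the errors.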

Differences between Errors and Residuals in Regression

Sr. No.	Errors	Residuals
1)	Error represents the unobservable difference between an actual value $y$ of the dependent variable and its true population mean.	Residuals represent the observable difference between an actual value $y$ of the dependent variable and its predicted value according to the regression model.
2)	Error is a theoretical concept because the true population mean is usually unknown.	One can calculate residuals because we have the data and the fitted model.
3)	Errors are assumed to be random and independent, with a mean of zero.	Residuals are considered estimates of the errors for each data point.

Residuals are used in various ways to evaluate the regression model, including:

Residual plots: The residual plots are used to visualize the residuals versus the independent variable or predicted values.
Mean Squared Error (MSE): The MSE statistic is the average of the squared residuals; smaller values indicate that the fitted values lie closer to the observed data.
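Both diagnostics can be sketched in a few lines of NumPy. The data-generating model and variable names below are hypothetical. The version of the MSE that divides by $n - p$ (with $p$ the number of fitted coefficients) is a common alternative used to estimate $\sigma^2$ and is shown only for comparison.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed so the run is repeatable

# Simulated data from a hypothetical model y = 4 - 1.2x + e
n = 40
x = np.linspace(0, 5, n)
y = 4.0 - 1.2 * x + rng.normal(0.0, 0.8, size=n)

# Least-squares fit with an intercept
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta_hat
residuals = y - fitted

# MSE as the plain average of the squared residuals
mse = np.mean(residuals**2)

# Alternative that divides by the residual degrees of freedom (n - p)
p = X.shape[1]
mse_unbiased = np.sum(residuals**2) / (n - p)

# A residual plot would show `fitted` on the horizontal axis and
# `residuals` on the vertical axis; for a least-squares fit with an
# intercept, the two are uncorrelated.
corr = np.corrcoef(fitted, residuals)[0, 1]

print("MSE (mean of squared residuals):", mse)
print("MSE (divided by n - p):         ", mse_unbiased)
print("correlation of fitted values and residuals:", corr)
```

A visible pattern in a plot of `residuals` against `fitted` (curvature, or a funnel shape) would suggest that the model is missing structure or that the error variance is not constant.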

In essence, understanding errors and residuals helps the researcher gauge how well the regression model captures the underlying relationship between variables, despite the inherent randomness or “noise” in real-world data.

FAQs about Errors and Residuals

What is an Error?
What are residuals in regression?
What is the purpose of residual plots?
What is a mean squared error (MSE)?
Differentiate between error and residual.
Discuss the sum of residuals and the sum of errors.