Inverse Regression Analysis or Calibration

In most regression problems, we have to determine the value of $Y$ corresponding to a given value of $X$. The inverse problem, determining the value of $X$ that corresponds to an observed value of $Y$, is called inverse regression analysis or calibration.

Inverse Regression Analysis

For inverse regression analysis, let the known values of the regressor be represented by the vector $X$ and the corresponding responses by the vector $Y$, which together follow a simple linear regression model. Suppose there is an unknown value of $X$, say $X_0$, which cannot be measured, but we observe the corresponding value of $Y$, say $Y_0$. Then $X_0$ can be estimated, and a confidence interval for $X_0$ can be obtained.

In regression analysis, we want to investigate the relationship between variables. Regression has applications in many fields: engineering, economics, the physical and chemical sciences, management, biological sciences, and social sciences. Here we consider only the simple linear regression model, that is, a model with one regressor $X$ that has a linear relationship with a response $Y$. It is not always easy to measure the regressor $X$ or the response $Y$.

Let us consider a typical example of this problem. Let $X$ be the concentration of glucose in a substance, and suppose a spectrophotometric method is used to measure the absorbance $Y$, which depends on the concentration $X$. The response $Y$ is easy to measure with the spectrophotometric method, whereas the concentration is not. If we have $n$ samples with known concentrations, then the absorbance of each sample can be measured.

If there is a linear relationship between $Y$ and $X$, then a simple linear regression model can be fitted to these data. Suppose we have an unknown concentration that is difficult to measure, but we can measure the absorbance at this concentration. Is it possible to estimate the concentration from the measured absorbance? This is called the calibration problem or inverse regression analysis.

Suppose we have a linear model $Y=\beta_0+\beta_1X+e$ and an observed value of the response $Y$, but we do not have the corresponding value of $X$. How can we estimate this value of $X$? The two most important methods to estimate $X$ are the classical method and the inverse method.

The classical method of inverse regression analysis is based on the simple linear regression model

$Y=\beta_0+\beta_1X+\varepsilon,$   where $\varepsilon \sim N(0, \, \sigma^2)$

The parameters $\beta_0$ and $\beta_1$ are estimated by least squares as $\hat{\beta}_0$ and $\hat{\beta}_1$. At least two of the $n$ values of $X$ have to be distinct; otherwise, the regression line cannot be fitted. At the unknown value $X_0$, a value $Y_0$ (or a random sample of $k$ values of $Y$) is observed. The problem in inverse regression analysis is to estimate $X_0$. The classical method inverts the fitted line at $Y_0$ (or at the mean of the $k$ observed values), which requires $\hat{\beta}_1 \ne 0$ and gives the estimate $\hat{X}_0=\frac{Y_0-\hat{\beta}_0}{\hat{\beta}_1}$.
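The classical estimate described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming NumPy is available; the function name `classical_calibration` and the calibration data are hypothetical and not taken from any real experiment.

```python
import numpy as np

def classical_calibration(x, y, y0):
    """Classical estimate of X0: regress Y on X, then invert the fitted line.

    Hypothetical helper for illustration; y0 may be a single reading
    or k replicate readings taken at the unknown X0.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.unique(x).size < 2:
        raise ValueError("at least two distinct X values are required")

    # Least-squares slope and intercept of Y on X
    x_bar, y_bar = x.mean(), y.mean()
    b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    b0 = y_bar - b1 * x_bar
    if b1 == 0:
        raise ValueError("fitted slope is zero; the line cannot be inverted")

    # Use the mean of the replicate readings (a single value is its own mean)
    y0_mean = float(np.mean(y0))
    return (y0_mean - b0) / b1

# Made-up calibration data: known concentrations and measured absorbances
conc = [1.0, 2.0, 3.0, 4.0, 5.0]
absorb = [0.11, 0.19, 0.32, 0.41, 0.52]

# Two replicate absorbance readings at the unknown concentration
x0_hat = classical_calibration(conc, absorb, y0=[0.35, 0.37])
print(round(x0_hat, 3))
```

The estimate becomes unstable when the fitted slope is close to zero, so in practice it is worth checking that the slope is clearly different from zero before inverting the line.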

Figure: Scatter plot with fitted regression line.

The inverse estimator is the simple linear regression of $X$ on $Y$. In this case, we have to fit the model

\[X=a_0+a_1Y+e, \quad \text{where } e \sim N(0, \sigma^2)\]

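to obtain the estimator. The inverse estimator of $X_0$ is then

\[\hat{X}_0=\hat{a}_0+\hat{a}_1Y_0\]

where $\hat{a}_0$ and $\hat{a}_1$ are the least-squares estimates of $a_0$ and $a_1$, and $Y_0$ is the observed response (or the mean of the $k$ observed responses).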
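As a rough illustration of the inverse method, the sketch below regresses $X$ on $Y$ and evaluates the fitted line at the observed response. It assumes NumPy is available; the function name `inverse_calibration` and the data are hypothetical, chosen only to show the mechanics.

```python
import numpy as np

def inverse_calibration(x, y, y0):
    """Inverse estimate of X0: regress X on Y, then predict at Y0.

    Hypothetical helper for illustration; y0 may be a single reading
    or k replicate readings taken at the unknown X0.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.unique(y).size < 2:
        raise ValueError("at least two distinct Y values are required")

    # Least-squares slope and intercept of X on Y (roles of X and Y swapped)
    x_bar, y_bar = x.mean(), y.mean()
    a1 = np.sum((y - y_bar) * (x - x_bar)) / np.sum((y - y_bar) ** 2)
    a0 = x_bar - a1 * y_bar

    # Predict X directly at the (mean) observed response
    y0_mean = float(np.mean(y0))
    return a0 + a1 * y0_mean

# Made-up calibration data: known concentrations and measured absorbances
conc = [1.0, 2.0, 3.0, 4.0, 5.0]
absorb = [0.11, 0.19, 0.32, 0.41, 0.52]

# Two replicate absorbance readings at the unknown concentration
x0_inv = inverse_calibration(conc, absorb, y0=[0.35, 0.37])
print(round(x0_inv, 3))
```

Unlike the classical method, this approach needs no division by a fitted slope of $Y$ on $X$, because $X$ is predicted directly from the regression of $X$ on $Y$.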
