Generalized Least Squares (GLS)
The usual OLS method assigns equal weight (or importance) to each observation. When the error variances $\sigma_i^2$ differ across observations (heteroscedasticity), however, generalized least squares (GLS) takes this information into account explicitly and is therefore capable of producing estimators that are BLUE (best linear unbiased estimators).
Consider the following two-variable model:
\begin{align}
Y_i &= \beta_1 + \beta_2 X_i + u_i\nonumber\\
\text{or}\qquad Y_i &= \beta_1 X_{0i} + \beta_2 X_i + u_i, \tag*{(eq1)}
\end{align}
where $X_{0i}=1$ for each $i$.
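As a concrete illustration, the sketch below simulates data from this model with an assumed heteroscedastic pattern ($\sigma_i$ proportional to $X_i$); the pattern, sample size, and parameter values are all hypothetical, chosen only for demonstration.

```python
import numpy as np

rng = np.random.default_rng(42)

n = 200
beta1, beta2 = 2.0, 0.5            # hypothetical true parameters
X = rng.uniform(1, 10, size=n)     # regressor X_i

# Assumed heteroscedastic pattern: sigma_i grows with X_i
sigma = 0.3 * X                    # the "known" sigma_i
u = rng.normal(0, sigma)           # Var(u_i) = sigma_i^2 varies by observation
Y = beta1 + beta2 * X + u          # Y_i = beta_1 + beta_2 X_i + u_i
```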
Assume that the heteroscedastic variance $\sigma_i^2$ is known. Dividing both sides of (eq1) by the known $\sigma_i$ gives:
\begin{align}
\frac{Y_i}{\sigma_i} &= \beta_1 \left(\frac{X_{0i}}{\sigma_i} \right)+\beta_2 \left(\frac{X_i}{\sigma_i}\right) +\left(\frac{u_i}{\sigma_i}\right)\nonumber\\
Y_i^* &= \beta_1^* X_{0i}^* + \beta_2^* X_i^* + u_i^*, \tag*{(eq2)}
\end{align}
where the starred variables are the original variables divided by the known $\sigma_i$. The starred coefficients are the parameters of the transformed model, written with stars to distinguish them from the OLS parameters $\beta_1$ and $\beta_2$.
\begin{align*}
Var(u_i^*) &= E(u_i^{*2}) \tag*{$\because E(u_i)=0$, so $E(u_i^*)=0$}\\
&= E\left(\frac{u_i}{\sigma_i}\right)^2\\
&= \frac{1}{\sigma_i^2}E(u_i^2) \tag*{$\because \sigma_i$ is known (nonstochastic)}\\
&= \frac{1}{\sigma_i^2}\sigma_i^2 \tag*{$\because E(u_i^2)=\sigma_i^2$}\\
&= 1, \text{ which is a constant.}
\end{align*}
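A quick numerical check of this result: scaling each disturbance by its own $\sigma_i$ should leave a series with variance close to 1. The $\sigma_i$ values below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = rng.uniform(0.5, 3.0, size=100_000)  # arbitrary known sigma_i
u = rng.normal(0, sigma)                     # heteroscedastic: Var(u_i) = sigma_i^2

u_star = u / sigma                           # transformed disturbance u_i*
print(np.var(u_star))                        # ~ 1.0: homoscedastic, as derived
```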
The variance of the transformed disturbance $u_i^*$ is now homoscedastic. Applying OLS to the transformed model (eq2) will therefore produce estimators that are BLUE; that is, $\hat{\beta}_1^*$ and $\hat{\beta}_2^*$ are now BLUE, while the OLS estimators $\hat{\beta}_1$ and $\hat{\beta}_2$ are not.
The procedure of transforming the original variables in such a way that the transformed variables satisfy the assumptions of the classical model, and then applying OLS to them, is known as the method of Generalized Least Squares (GLS).
GLS is OLS on the transformed variables that satisfy the standard LS assumptions. The estimators obtained are known as GLS estimators and are BLUE.
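The sketch below makes this concrete: it divides $Y_i$, $X_{0i}$, and $X_i$ by $\sigma_i$ and runs plain OLS on the result, then compares against statsmodels' GLS (assuming the statsmodels convention that a one-dimensional `sigma` argument is the diagonal of the error covariance matrix). The simulated data and $\sigma_i$ pattern are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500
X = rng.uniform(1, 10, size=n)
sigma = 0.3 * X                              # assumed known sigma_i
Y = 2.0 + 0.5 * X + rng.normal(0, sigma)

X_mat = sm.add_constant(X)                   # columns: X_{0i} (= 1) and X_i

# GLS "by hand": OLS on every variable divided by sigma_i, as in (eq2)
by_hand = sm.OLS(Y / sigma, X_mat / sigma[:, None]).fit()

# statsmodels GLS with a diagonal error covariance of sigma_i^2
gls = sm.GLS(Y, X_mat, sigma=sigma**2).fit()

print(by_hand.params)  # estimates of beta_1* and beta_2*
print(gls.params)      # same estimates
```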
To obtain the GLS estimators, we minimize
\begin{align}
\sum \hat{u}_i^{*2} &= \sum \left(Y_i^* -\hat{\beta}_1^* X_{0i}^* - \hat{\beta}_2^* X_i^* \right)^2\nonumber\\
\intertext{that is,}
\sum \left(\frac{\hat{u}_i}{\sigma_i}\right)^2 &=\sum \left[\frac{Y_i}{\sigma_i} - \hat{\beta}_1^* \left(\frac{X_{0i}}{\sigma_i}\right) -\hat{\beta}_2^*\left(\frac{X_i}{\sigma_i}\right) \right]^2 \tag*{(eq3)}\\
\sum w_i \hat{u}_i^2 &=\sum w_i\left(Y_i-\hat{\beta}_1^* X_{0i} -\hat{\beta}_2^*X_i\right)^2, \tag*{(eq4)}
\end{align}
where $w_i=\frac{1}{\sigma_i^2}$.
The GLS estimator $\hat{\beta}_2^*$ of the slope and its variance are
\begin{align*}
\hat{\beta}_2^* &= \frac{(\sum w_i)(\sum w_i X_iY_i)-(\sum w_i X_i)(\sum w_iY_i) }{(\sum w_i)(\sum w_iX_i^2)-(\sum w_iX_i)^2} \\
Var(\hat{\beta}_2^*) &=\frac{\sum w_i}{(\sum w_i)(\sum w_iX_i^2)-(\sum w_iX_i)^2}.
\end{align*}
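A direct implementation of these two formulas, as a sketch (the function name and the data below are mine, not from the text):

```python
import numpy as np

def gls_slope(Y, X, sigma):
    """Closed-form GLS slope and its variance for the two-variable
    model, using weights w_i = 1 / sigma_i^2 (sigma_i assumed known)."""
    w = 1.0 / sigma**2
    den = np.sum(w) * np.sum(w * X**2) - np.sum(w * X) ** 2
    beta2_star = (np.sum(w) * np.sum(w * X * Y)
                  - np.sum(w * X) * np.sum(w * Y)) / den
    var_beta2_star = np.sum(w) / den
    return beta2_star, var_beta2_star

# Hypothetical data to exercise the formula
rng = np.random.default_rng(2)
X = rng.uniform(1, 10, size=300)
sigma = 0.3 * X
Y = 2.0 + 0.5 * X + rng.normal(0, sigma)

b2, v2 = gls_slope(Y, X, sigma)
print(b2, np.sqrt(v2))   # slope estimate and its standard error
```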
Difference between GLS and OLS
In GLS, a weighted sum of squared residuals is minimized, with $w_i=\frac{1}{\sigma_i^2}$ acting as the weights, whereas in OLS an unweighted (or equally weighted) residual sum of squares is minimized. From equation (eq3), in GLS the weight assigned to each observation is inversely proportional to its $\sigma_i$; that is, observations coming from a population with a larger $\sigma_i$ get relatively smaller weight, and those from a population with a smaller $\sigma_i$ get proportionately larger weight in minimizing the RSS (eq4).
Since equation (eq4) minimizes a weighted RSS, the method is also known as weighted least squares (WLS), and the estimators obtained are known as WLS estimators.
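In statsmodels, this weighting is what `WLS` implements: passing `weights=1/sigma**2` minimizes exactly the weighted RSS in (eq4), so its estimates should match the closed-form and transformed-OLS results above (a sketch with hypothetical data):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
X = rng.uniform(1, 10, size=300)
sigma = 0.3 * X                              # assumed known sigma_i
Y = 2.0 + 0.5 * X + rng.normal(0, sigma)

# WLS with weights w_i = 1 / sigma_i^2 minimizes (eq4) directly
wls = sm.WLS(Y, sm.add_constant(X), weights=1.0 / sigma**2).fit()
print(wls.params)   # WLS (= GLS here) estimates of beta_1 and beta_2
print(wls.bse)      # their standard errors
```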
Read about Residual Plot