To understand first-order autocorrelation, consider the multiple regression model
$$Y_t=\beta_1+\beta_2 X_{2t}+\beta_3 X_{3t}+\cdots+\beta_k X_{kt}+u_t.$$
Suppose that, in the model above, the current observation of the error term ($u_t$) is a function of the previous (lagged) observation of the error term ($u_{t-1}$). That is,
\begin{align*}
u_t = \rho u_{t-1} + \varepsilon_t, \tag*{eq 1}
\end{align*}
where $\rho$ is the parameter describing the functional relationship between successive observations of the error term $u_t$, and $\varepsilon_t$ is a stochastic error term that is iid (independently and identically distributed). It satisfies the standard OLS assumptions:
\begin{align*}
E(\varepsilon_t) &=0\\
Var(\varepsilon_t) &=\sigma_t^2\\
Cov(\varepsilon_t, \varepsilon_{t+s} ) &=0, \quad s\neq 0
\end{align*}
Note that if $\rho=1$, the error term $u_t$ follows a random walk: it is no longer stationary, and its variance is undefined.
The scheme (eq 1) is known as a Markov first-order autoregressive scheme, usually denoted by AR(1). It is interpreted as the regression of $u_t$ on itself lagged one period. It is first-order because only $u_t$ and its immediate past value are involved. Note that $u_t$ is still homoscedastic under the AR(1) scheme, that is, $Var(u_t)$ is constant over time.
The coefficient $\rho$ is called the first-order autocorrelation coefficient (also called the coefficient of autocovariance) and lies strictly between $-1$ and $1$ ($|\rho|<1$). The size of $\rho$ determines the strength of autocorrelation (serial correlation). There are three different cases:
- If $\rho$ is zero, then there is no autocorrelation because $u_t=\varepsilon_t$.
- If $\rho$ approaches 1, the value of the previous observation of the error ($u_{t-1}$) becomes more important in determining the value of the current error term ($u_t$), and therefore, greater positive autocorrelation exists. A negative error term tends to be followed by a negative error term, and a positive error term by a positive one.
- If $\rho$ approaches -1, there is a very high degree of negative autocorrelation. The error term tends to switch sign from negative to positive, and vice versa, in consecutive observations.
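
The three cases above can be illustrated with a small simulation. This is only a minimal sketch: the function names `simulate_ar1` and `lag1_autocorr`, the normal distribution for $\varepsilon_t$, the sample size, and the seed are all assumptions made for illustration and are not part of the model itself.

```python
import numpy as np

def simulate_ar1(rho, n=5000, sigma_eps=1.0, seed=0):
    """Simulate u_t = rho * u_{t-1} + eps_t with iid normal eps_t (an assumption)."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma_eps, size=n)
    u = np.empty(n)
    # Draw the first value with the stationary variance sigma_eps^2 / (1 - rho^2)
    u[0] = eps[0] / np.sqrt(1.0 - rho**2)
    for t in range(1, n):
        u[t] = rho * u[t - 1] + eps[t]
    return u

def lag1_autocorr(u):
    """Sample correlation between u_t and u_{t-1}."""
    return np.corrcoef(u[1:], u[:-1])[0, 1]

# One case each: no, positive, and negative autocorrelation
for rho in (0.0, 0.8, -0.8):
    u = simulate_ar1(rho)
    print(f"rho = {rho:+.1f}  sample lag-1 autocorrelation = {lag1_autocorr(u):+.3f}")
```

With a long simulated series, the sample lag-1 autocorrelation should land close to the value of $\rho$ used to generate it, with the sign matching the case described in the list.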

First Order Autocorrelation AR(1)
\begin{align*}
u_t &= \rho u_{t-1}+\varepsilon_t\\
E(u_t) &= \rho E(u_{t-1})+ E(\varepsilon_t)=0\\
Var(u_t)&=\rho^2 Var(u_{t-1})+Var(\varepsilon_t)\\
&\text{because $u_{t-1}$ and $\varepsilon_t$ are uncorrelated. With a constant variance,}\\
Var(u_t)&=\sigma^2\\
Var(u_{t-1}) &=\sigma^2\\
Var(\varepsilon_t)&=\sigma_t^2\\
\Rightarrow Var(u_t) &=\rho^2 \sigma^2+\sigma_t^2\\
\Rightarrow \sigma^2-\rho^2\sigma^2 &=\sigma_t^2\\
\Rightarrow \sigma^2(1-\rho^2)&=\sigma_t^2\\
\Rightarrow Var(u_t)&=\sigma^2=\frac{\sigma_t^2}{1-\rho^2}
\end{align*}
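As a quick numerical check of the variance result, take the illustrative (assumed) values $\sigma_t^2=1$ with $\rho=0.5$ and then $\rho=0.9$:
\begin{align*}
\rho=0.5: \quad Var(u_t) &= \frac{1}{1-0.5^2}=\frac{1}{0.75}\approx 1.333\\
\rho=0.9: \quad Var(u_t) &= \frac{1}{1-0.9^2}=\frac{1}{0.19}\approx 5.263
\end{align*}
The variance of $u_t$ exceeds the variance of $\varepsilon_t$ whenever $\rho\neq 0$, and it grows without bound as $|\rho|$ approaches 1.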
For the covariance, multiply equation (eq 1) by $u_{t-1}$ and take expectations on both sides:
\begin{align*}
u_t\cdot u_{t-1} &= \rho u_{t-1} \cdot u_{t-1} + \varepsilon_t \cdot u_{t-1}\\
E(u_t u_{t-1}) &= E[\rho u_{t-1}^2 + u_{t-1}\varepsilon_t ]\\
cov(u_t, u_{t-1}) &= E(u_t u_{t-1}) = \rho E(u_{t-1}^2) + E(u_{t-1}\varepsilon_t)\\
&=\rho Var(u_{t-1}) + 0\\
&=\rho \frac{\sigma_t^2}{1-\rho^2}\tag*{$\because Var(u_t) = \frac{\sigma_t^2}{1-\rho^2}$}
\end{align*}
Similarly,
\begin{align*}
cov(u_t,u_{t-2}) &=\rho^2 \frac{\sigma_t^2}{(1-\rho^2)}\\
cov(u_t,u_{t-3}) &= \rho^3 \frac{\sigma_t^2}{(1-\rho^2)}\\
cov(u_t, u_{t+s}) &= \rho^{|s|} \frac{\sigma_t^2}{(1-\rho^2)}
\end{align*}
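Under the AR(1) scheme, the autocovariance at lag $s$ works out to $\rho^{|s|}\sigma_t^2/(1-\rho^2)$, so dividing by $Var(u_t)$ gives an autocorrelation of $\rho^{|s|}$. The sketch below tabulates both for a few lags; the function name `ar1_autocov` and the values $\rho=0.5$, $\sigma_t^2=1$ are assumptions chosen only for illustration.

```python
def ar1_autocov(rho, sigma_eps2, s):
    """Theoretical cov(u_t, u_{t+s}) for a stationary AR(1) error term."""
    if abs(rho) >= 1:
        raise ValueError("the formula requires |rho| < 1")
    return rho ** abs(s) * sigma_eps2 / (1.0 - rho ** 2)

rho, sigma_eps2 = 0.5, 1.0                # illustrative values (assumed)
var_u = ar1_autocov(rho, sigma_eps2, 0)   # lag 0 gives Var(u_t)

for s in range(4):
    gamma_s = ar1_autocov(rho, sigma_eps2, s)
    # The autocorrelation is the autocovariance scaled by the variance
    print(f"lag {s}: autocovariance = {gamma_s:.4f}, autocorrelation = {gamma_s / var_u:.4f}")
```

The autocorrelations decay geometrically with the lag, which is why errors far apart in time are only weakly related when $|\rho|$ is well below 1.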
The sign of the first-order autocorrelation gives its direction (positive or negative), and its distance from zero gives its strength. Values close to $+1$ or $-1$ indicate strong positive or negative autocorrelation, respectively. A value close to zero suggests little to no autocorrelation.
Software such as R, Python, and MS Excel can be used to calculate autocorrelation. A plot of the autocorrelation function (ACF) is often the preferred way to assess autocorrelation across different lags, not just the first-order autocorrelation.
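As a rough sketch of how such a calculation works, the snippet below computes the sample autocorrelation at several lags using only NumPy. The function name `sample_acf` and the `residuals` values are hypothetical and made up for illustration; they do not come from any fitted model.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of a series x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()          # deviations from the sample mean
    denom = np.sum(d * d)     # n times the sample variance
    return np.array([np.sum(d[k:] * d[:n - k]) / denom
                     for k in range(max_lag + 1)])

# Hypothetical residuals from a regression (illustrative numbers only)
residuals = np.array([0.5, 0.8, 0.6, 0.1, -0.3, -0.7, -0.4, 0.0, 0.4, 0.6])

acf_values = sample_acf(residuals, max_lag=3)
for lag, r in enumerate(acf_values):
    print(f"lag {lag}: r = {r:+.3f}")
```

The value at lag 1 is the sample first-order autocorrelation. In practice a library routine, such as `acf()` in R, would normally be used in place of a hand-written function.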
In summary, first order autocorrelation refers to the correlation between a time series and lagged values of the same time series, specifically at a lag of one time period. It measures how much a variable in a time series is related to its immediate past value.