# Basic Statistics and Data Analysis

## Random Walk Model

The random walk model is widely used in finance, where asset prices such as stock prices and exchange rates are often modeled as random walks. A random walk is a common and serious departure from random behavior: the series is non-stationary, since today’s stock price is equal to yesterday’s stock price plus a random shock.

There are two types of random walks:

1. Random walk without drift (no constant or intercept)
2. Random walk with drift (with a constant term)

Definition

A time series is said to follow a random walk if its first differences (the differences from one observation to the next) are random.

Note that in a random walk model the time series itself is not random; however, its first differences (the changes from one period to the next) are random.

A random walk model for a time series $X_t$ can be written as

$X_t=X_{t-1}+e_t\, \, ,$

where $X_t$ is the value in time period $t$, $X_{t-1}$ is the value in time period $t-1$, and $e_t$ is a random shock (the value of the error term in time period $t$).

Since the random walk is defined in terms of first differences, it is easier to see the model as

$X_t-X_{t-1}=e_t\, \, ,$

where the original time series has been transformed into a first-difference time series.
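The relationship between the levels $X_t$ and the shocks $e_t$ can be illustrated with a small simulation. This is a minimal sketch, not a definitive implementation: the function names, the Gaussian shocks, and the `drift` parameter (the constant term of a random walk with drift, zero by default) are assumptions made for illustration.

```python
import random
from itertools import accumulate

def simulate_random_walk(n_steps, drift=0.0, sigma=1.0, x0=0.0, seed=0):
    """Simulate X_t = drift + X_{t-1} + e_t, starting from X_0 = x0.

    Assumption: shocks e_t are independent N(0, sigma^2) draws.
    Returns (levels, shocks); levels has n_steps + 1 entries including x0.
    """
    rng = random.Random(seed)
    shocks = [rng.gauss(0.0, sigma) for _ in range(n_steps)]
    # Running sum of (drift + shock), seeded with the starting value x0.
    levels = list(accumulate((drift + e for e in shocks), initial=x0))
    return levels, shocks

def first_differences(series):
    """Return X_t - X_{t-1} for consecutive observations."""
    return [b - a for a, b in zip(series, series[1:])]

levels, shocks = simulate_random_walk(500)
diffs = first_differences(levels)
# Without drift, differencing the levels recovers the shocks (up to rounding).
print(all(abs(d - e) < 1e-9 for d, e in zip(diffs, shocks)))
```

With a non-zero `drift`, the first differences equal the drift plus the shock, so they fluctuate randomly around the constant rather than around zero.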

Why the transformed time series matters:

• The goal is to forecast future trends to aid in decision making
• If a time series follows a random walk, the original series offers little or no insight
• The first-differenced time series may need to be analyzed instead
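One way to see why the differenced series is the one worth analyzing is to compare the lag-1 autocorrelation of the levels with that of the first differences. The sketch below uses a simulated walk with Gaussian shocks; the helper `lag1_autocorr`, the seed, and the sample size are illustrative assumptions, not part of the original discussion.

```python
import random

def lag1_autocorr(series):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((x - mean) ** 2 for x in series)
    return num / den

# Simulate a random walk without drift (assumed N(0, 1) shocks).
rng = random.Random(42)
shocks = [rng.gauss(0.0, 1.0) for _ in range(5000)]
levels = []
x = 0.0
for e in shocks:
    x += e
    levels.append(x)

# First differences X_t - X_{t-1}.
diffs = [b - a for a, b in zip(levels, levels[1:])]

print("levels:", round(lag1_autocorr(levels), 3))
print("first differences:", round(lag1_autocorr(diffs), 3))
```

For a random walk the first value should be close to 1 (each level is dominated by the previous one), while the second should be close to 0, which is what "random first differences" means in practice.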

Consider a real-world example: the daily US-dollar-to-Euro exchange rate from January 1, 1999, to December 5, 2014. A plot of the entire history shows quite an interesting pattern, with many peaks and valleys. A plot of the daily changes (first differences) looks very different.

The volatility (variance) has not been constant over time, but the day-to-day changes are almost completely random.

Random walk patterns are also widely found elsewhere in nature, for example, in the phenomenon of Brownian motion, which was first explained by Einstein.

# Symmetric Random Walk Probability

A related probability is that of the event in which the first visit to position $x$ occurs at the $n$th step, given that the walk starts at the origin, i.e. the first passage through $x$. For the case $x=0$ we find the probability of the first return to the origin. Let $A$ be the event that $X_n=0$, and let $B_k$ be the event that the first visit to the origin occurs at the $k$th step. By the law of total probability
$P(A)=\sum_{k=1}^n P(A|B_k)P(B_k) \tag{1}\label{eqn2.341}$
Let $f_k=P(B_k)$. The conditional probability $P(A|B_k)$ is the probability that the walk is back at the origin after the remaining $n-k$ steps, i.e. $P(A|B_k)=p_{n-k}$. Here $p_n$ is the probability that the walk is at the origin at step $n$, with $p_0=1$ and $p_n=0$ if $n$ is odd. Hence (\ref{eqn2.341}) can be written as
$p_n=\sum_{k=1}^n p_{n-k}f_k \tag{2}\label{eqn2.342}$
Multiplying both sides of (\ref{eqn2.342}) by $s^n$ and summing for $n\geq1$, we have
$\sum_{n=1}^\infty p_ns^n =\sum_{n=1}^\infty \sum_{k=1}^n p_{n-k}f_k s^n \tag{3}\label{eqn2.343}$
Writing the generating function of the $p_n$ as
$H(s)=\sum_{n=0}^\infty p_ns^n=1+\sum_{n=1}^\infty p_ns^n$
we have
$H(s)-1=\sum_{n=1}^\infty p_ns^n$
Therefore from (\ref{eqn2.343})
\begin{align*}
H(s)-1&=\sum_{n=1}^\infty p_n s^n\\
&=\sum_{n=1}^\infty \sum_{k=1}^n p_{n-k}f_ks^n\tag{4}\label{eqn2.344}
\end{align*}
Since $p_0=1$ and $f_0=0$, we can replace (\ref{eqn2.344}) by
\begin{align*}
H(s)-1&=\sum_{n=1}^\infty p_ns^n\\
&=\sum_{n=0}^\infty \sum_{k=0}^n p_{n-k}f_ks^n = \sum_{n=0}^\infty p_ns^n \sum_{k=0}^\infty f_k s^k \tag{5}\label{eqn2.345}
\end{align*}
which is the product of two power series; the inner sum over $k$ is the convolution of the sequences $\{p_n\}$ and $\{f_n\}$.

Now if $Q(s)=\sum_{n=0}^\infty f_ns^n$ is the probability generating function of the first return distribution, then
$H(s)-1=H(s)Q(s)$.
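The recurrence $p_n=\sum_{k=1}^n p_{n-k}f_k$ behind this identity can also be solved numerically for the $f_k$, one step at a time. The sketch below is illustrative only: the function names are hypothetical, and it assumes the standard expression $p_n=\binom{n}{n/2}/2^n$ for even $n$ (zero for odd $n$) for the symmetric walk. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from math import comb

def origin_probs(n_max):
    """p_n = P(X_n = 0): C(n, n/2) / 2^n for even n, 0 for odd n (assumed form)."""
    return [Fraction(comb(n, n // 2), 2 ** n) if n % 2 == 0 else Fraction(0)
            for n in range(n_max + 1)]

def first_return_probs(n_max):
    """Solve p_n = sum_{k=1}^{n} p_{n-k} f_k for f_1, ..., f_{n_max} (f_0 = 0)."""
    p = origin_probs(n_max)
    f = [Fraction(0)] * (n_max + 1)
    for n in range(1, n_max + 1):
        # Since p_0 = 1, the k = n term is f_n itself; the other terms are known.
        f[n] = p[n] - sum(p[n - k] * f[k] for k in range(1, n))
    return f

f = first_return_probs(8)
print(f[2], f[4], f[6])  # 1/2 1/8 1/16
```

The odd-indexed values come out as zero, as expected, since the walk can only be at the origin after an even number of steps.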

From equation (8) of the lecture “Probability Distribution after n steps”, which gives $H(s)=(1-s^2)^{-\frac{1}{2}}$, we have
\begin{align*}
Q(s)&=\frac{H(s)-1}{H(s)}=1-\frac{1}{H(s)}\\
&=1-(1-s^2)^{\frac{1}{2}} \tag{6}\label{eqn2.346}
\end{align*}
The probability that the walk will, at some step, return to the origin is $\sum_{n=1}^\infty f_n=Q(1)=1$;
i.e. the return is certain. In this walk the origin (or, by translation, any other point) is persistent. However, the mean number of steps until this return occurs is
\begin{align*}
\sum_{n=1}^\infty nf_n &=\lim_{s \rightarrow 1-} Q'(s)\\
&=\lim_{s \rightarrow 1-} \frac{s}{(1-s^2)^{\frac{1}{2}}}=\infty
\end{align*}
Thus a walk that starts at the origin is certain to return there in the future, but on average it will take an infinite number of steps.
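Both statements can be checked numerically by accumulating partial sums of $f_n$ and of $nf_n$. The following is a rough sketch under one assumption: it uses the closed form $f_{2m}=\binom{2m}{m}/\big((2m-1)4^m\big)$ for the coefficient of $s^{2m}$ in $Q(s)=1-(1-s^2)^{\frac{1}{2}}$; the function names are hypothetical.

```python
from math import comb

def first_return_prob(m):
    """f_{2m}: coefficient of s^(2m) in Q(s) = 1 - (1 - s^2)^(1/2) (assumed closed form)."""
    return comb(2 * m, m) / ((2 * m - 1) * 4 ** m)

def partial_sums(m_max):
    """Return (sum of f_{2m}, sum of 2m * f_{2m}) for m = 1, ..., m_max."""
    total_prob = 0.0
    partial_mean = 0.0
    for m in range(1, m_max + 1):
        f = first_return_prob(m)
        total_prob += f             # probability of returning within 2*m_max steps
        partial_mean += 2 * m * f   # truncated mean number of steps to return
    return total_prob, partial_mean

for m_max in (10, 100, 1000):
    print(m_max, partial_sums(m_max))
```

The first component creeps up towards 1 while the second keeps growing as more terms are included, in line with a certain return whose mean waiting time is infinite.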

Example:  Find the probability that a symmetric random walk starting from the origin returns there for the first time after 6 steps.

Solution:  We require the coefficient of $s^6$ in the power series expansion of the pgf $Q(s)$, which is
\begin{align*}
Q(s)&=1-(1-s^2)^{\frac{1}{2}}\\
&=1-\left[\frac{1^{\frac{1}{2}}}{0!}+\frac{\frac{1}{2}{(-s^2)}^1}{1!}+\frac{\frac{1}{2}(\frac{1}{2}-1){(-s^2)}^2}{2!}+\frac{\frac{1}{2}(\frac{1}{2}-1)(\frac{1}{2}-2){(-s^2)}^3}{3!}+\cdots\right]\\
&=1-1+\frac{1}{2}s^2+\frac{1}{8}s^4+\frac{1}{16}s^6+\cdots
\end{align*}
Hence the probability of a first return at step 6 is $\frac{1}{16}$.
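As a sanity check on this value, the probability can also be obtained by brute force: enumerate all $2^6$ equally likely paths and count those whose first visit to the origin happens exactly at step 6. This is only an illustrative sketch with a hypothetical function name; it is practical for small $n$ only.

```python
from fractions import Fraction
from itertools import product

def first_return_probability(n):
    """P(first return to the origin at step n), by enumerating all 2^n paths."""
    count = 0
    for steps in product((-1, 1), repeat=n):
        position = 0
        first_return = None
        for i, step in enumerate(steps, start=1):
            position += step
            if position == 0:
                first_return = i  # record the first visit to the origin and stop
                break
        if first_return == n:
            count += 1
    return Fraction(count, 2 ** n)

print(first_return_probability(6))  # 1/16
```

Only 4 of the 64 paths qualify, which agrees with the coefficient $\frac{1}{16}$ read off from $Q(s)$.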