# Basic Statistics and Data Analysis

## Random Walk Model

The random walk model is widely used in the area of finance. The stock prices or exchange rates (Asset prices) follow a random walk. A common and serious departure from random behavior is called a random walk (non-stationary), since today’s stock price is equal to yesterday stock price plus a random shock.

There are two types of random walks

1. Random walk without drift (no constant or intercept)
2. Random walk with drift (with a constant term)

Definition

A time series said to follow a random walk if the first differences (difference from one observation to the next observation) are random.

Note that in a random walk model, the time series itself is not random, however, the first differences of time series are random (the differences changes from one period to the next).

A random walk model for a time series $X_t$ can be written as

$X_t=X_{t-1}+e_t\, \, ,$

where $X_t$ is the value in time period $t$, $X_{t-1}$ is the value in time period $t-1$ plus a random shock $e_t$ (value of error term in time period $t$).

Since the random walk is defined in terms of first differences, therefore, it is easier to see the model as

$X_t-X_{t-1}=e_t\, \, ,$

where the original time series is changed to a first difference time series, that is the time series is transformed.

The transformed time series:

• Forecast the future trends to aid in decision making
• If time series follows random walk, the original series offers little or no insights
• May need to analyze first differenced time series

Consider a real-world example of daily US-dollar-to-Euro exchange rate. A plot of entire history (of daily US-dollar-to-Euro exchange rate) from January 1, 1999, to December 5, 2014 looks like

The historical pattern from above plot looks quite interesting, with many peaks and valleys. The plot of the daily changes (first difference) would look like

The volatility (variance) has not been constant over time, but the day-to-day changes are almost completely random.

Remember that, random walk patterns are also widely found elsewhere in nature, for example, in the phenomenon of Brownian Motion that was first explained by Einstein.

# Random Walk Probability of Returning to Origin after n Steps

Assume that the walk starts at x=0 with steps to the right or left occurring with probabilities p and q=1-p. We can write the position $X_n$ after n steps as
$X_n=R_n-L_n \tag{1}$
where $R_n$ is the number of right or positive steps (+1) and $L_n$ is the number of left or negative steps (-1).

Therefore the Total steps can be calculated as:  $n=R_n+L_n \tag{2}$
Hence
\begin{align*}
L_n&=n-R_n\\
\Rightarrow X_n&=R_n-n+R_n\\
R_n&=\frac{1}{2}(n+X_n) \tag{3}
\end{align*}
The equation (3) will be an integer only when n and $X_n$ are both even or both odd (eg. To go from x=0 to x=7 we must take an odd number of steps).

Now, let $v_{n,x}$ be the probability that the walk is at state x after n steps assuming that x is a positive integer. Then
\begin{align*}
v_{n,x}&=P(X_n=x)\\
&=P(R_n=\frac{1}{2}(n+x))
\end{align*}
$R_n$ is a binomial random variable with index $n$ having probability p, since the walker either moves to the right or not at every step, and the steps are independent, then
\begin{align*}
v_{n,x}&=\binom{n}{\frac{1}{2}(n+x)}p^{\frac{1}{2}(n+x)}q^{n-\frac{1}{2}(n+x)}\\
&=\binom{n}{\frac{1}{2}(n+x)}p^{\frac{1}{2}(n+x)}q^{\frac{1}{2}(n-x)} \tag{4}
\end{align*}
where (n,x) are both even or both odd and $-n \leq x \leq n$. Note that a similar argument can be constructed if x is a negative integer.

Example: For total number of steps is 2, the net displacement must be one of the three possibilities: (1) two steps to the left, (2) back to the start, (3) or two steps to the right. These correspond to values of x = -2, 0,+2. Clearly it is impossible to get more than two units away from the origin if you take only two steps and it is equally impossible to end up exactly one unit from the origin if you take two steps.

For symmetric case ($p=\tfrac{1}{2}$), starting from the origin, there are $2^n$ different paths of length n since there is a choice of right or left move at each step. Since the number of steps in the right direction must be $\tfrac{1}{2}(n+x)$ and the total number of paths must be the number of ways in which $\frac{1}{2}(n+x)$ can be chosen from n: that is
$N_{n,x}=\binom{n}{\tfrac{1}{2}(n+x)}$
provided that $\tfrac{1}{2}(n+x)$ is an integer.

By counting rule, the probability that the walk ends at x after n steps is given by the ratio of this number and the total number of paths (since all paths are equally likely). Hence
$v_{n,x}=\frac{N_{n,x}}{2^n}=\binom{n}{\tfrac{1}{2}(n+x)}\frac{1}{2^n}$
The probability $v_{n,x}$ is the probability that the walk ends at state x after n steps: the walk could have overshot x before returning there.

## Related Probability/ First Passage through x

A related probability is the probability that the first visit to position x occurs at the nth step. The following is descriptive derivation of the associated probability generating function of the symmetric random walk in which the walk starts at the origin, and we consider the probability that it returns to the origin.

From equation (4), the probability that a walk is at the origin at step n is
\begin{align*}
v_{n,x}&=\binom{n}{\frac{1}{2}(n+x)}p^{\frac{1}{2}(n+x)}q^{n-\frac{1}{2}(n+x)}\\
&=\binom{n}{\tfrac{1}{2}(n+0)} \left(\frac{1}{2}\right)^{\tfrac{1}{2}n} \left(\frac{1}{2}\right)^{\tfrac{1}{2}n}\\
\end{align*}
Here $p_n$ is the probability that after n steps the position of walker is at origin. We also assume that $p_n=0$ if n is odd. From equation (5) we can construct a generating function.
\begin{align*}
H(s)&=\sum_{n=0}^\infty p_n s^n\\
&=\sum_{n=0}^\infty p_{2n}s^{2n}=\sum_{n=0}^\infty \frac{1}{2^{2n}}\binom{2n}{n}s^{2n} \tag{6}
\end{align*}
Note that $p_0=1$, and H(s) is not a probability generating function since $H(1)\neq1$.

The binomial coefficient can be re-arranged as follows:
\begin{align*}
\binom{2n}{n}&=\frac{(2n)!}{n!n!}=\frac{2n(2n-1)(2n-2)\cdots3.2.1}{n!n!}\\
&=\frac{2^nn!(2n-1)(2n-3)\cdots3.2.1}{n!n!}\\
&=\frac{2^{2n}}{n!}\frac{1}{2}\frac{3}{2}\cdots(n-\tfrac{1}{2})\\
&=(-1)^n \binom{-\tfrac{1}{2}}{n}2^{2n} \tag{7}
\end{align*}
Using equation (6) in (7)
$H(s)=\sum_{n=0}^\infty \frac{1}{2^{2n}}(-1)^n \binom{-\tfrac{1}{2}}{n}s^{2n}2^{2n}=(1-s^2)^{-\tfrac{1}{2}} \tag{8}$
by binomial theorem, provided $|s|<1$. Note that this expansion guarantees that $p_n=0$ if n is odd.

Note that the equation (8) does not sum to one. This is called defective distribution which still gives the probability that the walk is at the origin at step n.

We can estimate the behaviour of $p_n$ for large n by using Stirling’s Formula (asymptotic estimate for n! for large n), $n!\approx\sqrt{2\pi} n^{n+\tfrac{1}{2}}e^{-n}$

From equation (5)
\begin{align*}
p_{2n}&=\frac{1}{2^{2n}}\binom{2n}{n}=\frac{1}{2^{2n}}\frac{(2n)!}{n!n!}\\
&\approx\frac{1}{2^{2n}}\frac{\sqrt{2\pi}(2n)^{2n+\tfrac{1}{2}}e^{-2n}}{[\sqrt{2\pi}(n^{n+\tfrac{1}{2}}e^{-n})]^2}\\
&=\frac{1}{\sqrt{\pi n}}; \qquad \text{for large $n$}
\end{align*}
Hence $np_n\rightarrow \infty$ confirming that the series $\sum\limits_{n=0}^\infty p_n$ must diverge.

Random Walk Probability of Returning to Origin after n Steps some EXAMPLES

Example: Consider a random walk starts from x_0=0 find the probability that after 5 steps the position is 3. i.e. $X_5=3$, p=0.6.

Solution: Here number of steps are n=5 and position is x=3. Therefore positive and negative steps are

$R_n= \frac{1}{2}(n+x)=\frac{1}{2}(5+3)=4$ and $X_n=R_n-L_n \Rightarrow 3=4+L_n=1$
The probability that the event $X_5=3$ will occur in a random walk with $p=0.6$ is
$P(X_5=3)=\binom{5}{\frac{1}{2}(5+3)}(0.6)^{\tfrac{1}{2}(3+5)}(0.4)^{\tfrac{1}{2}(5-3)}=0.2592$