Random Walks Model: A Mathematical Formalization of Path

A random walk (first introduced by Karl Pearson in 1905) is a mathematical formalization of a path consisting of a series of random steps.

Random Walks Example

The following are some examples of random walks:

  1. The path traced by a molecule as it travels in a liquid or gas,
  2. The search path of a foraging animal,
  3. The price of a fluctuating stock, and
  4. The financial status of a gambler.

All of these can be modeled as random walks, although they may not be truly random in reality.

Suppose there are $a+1$ positions marked out on a straight line and numbered $0, 1, 2, \ldots, a$. A person starts at $k$ where $0<k<a$. The walk proceeds in such a way that, at each step, there is a probability $p$ that the walker goes forward one step to $k+1$ and a probability $q=1-p$ that the walker goes back one step to $k-1$. The walk continues until either $0$ or $a$ is reached and then ends.

In a random walk, the position of a walker after having moved $n$ times is known as the state of the walk after $n$ steps. Thus the walk described above starts at state $k$ at step $0$ and moves to either state $k-1$ or state $k+1$ after 1 step, and so on.

If the walk is bounded, then the ends of the walk are known as barriers, and they may have various properties. In this case, the barriers are said to be absorbing, implying that the walk must end once a barrier is reached since there is no escape.
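The walk with absorbing barriers described above can be simulated directly. The following is a minimal Python sketch, not a definitive implementation: the function names `simulate_walk` and `estimate_absorption_at_a`, the number of trials, and the seed are all illustrative assumptions rather than part of the original description.

```python
import random


def simulate_walk(k, a, p, rng):
    """Run one walk from state k until it is absorbed at 0 or a.

    Returns the absorbing barrier reached and the number of steps taken.
    """
    if not 0 < k < a:
        raise ValueError("the start k must satisfy 0 < k < a")
    position, steps = k, 0
    while 0 < position < a:
        # Forward one step with probability p, back one step with q = 1 - p.
        position += 1 if rng.random() < p else -1
        steps += 1
    return position, steps


def estimate_absorption_at_a(k, a, p, trials=20000, seed=0):
    """Estimate the probability that the walk ends at the barrier a."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = sum(simulate_walk(k, a, p, rng)[0] == a for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    print(estimate_absorption_at_a(k=3, a=10, p=0.5))
```

For a fair walk ($p = q = 0.5$), the estimate should come out close to $k/a$, which gives a quick sanity check on the simulation.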

A useful diagrammatic way of representing a random walk is a transition or process diagram. In a transition diagram, the possible states of the walker are represented by points on a line. If a transition between two points can occur in one step, those points are joined by a curve or edge, with an arrow indicating the direction of the walk and a weighting denoting the probability of the step occurring. A transition diagram is also known as a directed graph.

For small Markov processes, the simplest way to represent the process is often its state transition diagram. In a state transition diagram, each state (outcome) of the process is represented as a node in a graph. The arcs in the graph represent possible transitions between states of the process and are labeled by the transition probabilities (or rates) between the states.

Example: Suppose a meteorologist notices that the weather on a given day seems to depend on the weather conditions of the previous day. The meteorologist observes that if it is raining one day, then the next day is sunny 60% of the time and rainy 40% of the time; on the other hand, if it is sunny, the next day is sunny with probability 30% and rainy with probability 70%. Note that there are two outcomes in this Markov process: (i) sunny and (ii) rainy.

The transition probability from sunny to rainy is 70%, from sunny to sunny is 30%, from rainy to sunny is 60%, and from rainy to rainy is 40%. The simple weather forecasting Markov process is represented by the following transition diagram.

[Figure: Random Walks (https://itfeature.com)]
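The weather example can also be written as a table of transition probabilities and stepped forward one day at a time. The sketch below uses only the four probabilities given in the example; the dictionary layout and the helper name `step` are assumptions made for illustration.

```python
# Transition probabilities from the weather example: P[today][tomorrow].
P = {
    "sunny": {"sunny": 0.3, "rainy": 0.7},
    "rainy": {"sunny": 0.6, "rainy": 0.4},
}


def step(dist, P):
    """Advance a probability distribution over states by one day."""
    states = list(P)
    return {j: sum(dist[i] * P[i][j] for i in states) for j in states}


# The probabilities leaving each state must add up to 1.
for state, row in P.items():
    assert abs(sum(row.values()) - 1.0) < 1e-12

today = {"sunny": 1.0, "rainy": 0.0}  # assume it is sunny today
tomorrow = step(today, P)
day_after = step(tomorrow, P)
print(tomorrow)
print(day_after)
```

Starting from a sunny day, the chance of sun two days later is $0.3 \times 0.3 + 0.7 \times 0.6 = 0.51$, which is what the second call to `step` computes.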

Random walk models are widely used in many fields such as Ecology, Economics, Psychology, Computer Science, Physics, Chemistry, Biology, etc. Random walks explain the observed behavior of processes in all these fields, serving as a fundamental model for the recorded stochastic activity.

Overall, the random walk model is a versatile tool within stochastic processes. It provides a framework for studying systems influenced by randomness and helps understand the evolution of such systems over time.


Stochastic Processes Introduction (2012)

Before starting the introduction of Stochastic Processes, let us start with some important definitions related to statistics and stochastic processes.

Important Terms and Definitions

Experiment: Any activity or situation having an uncertain outcome.

Sample Space: The set of all possible outcomes is called the sample space, denoted by $\Omega$, and every element $\omega$ of $\Omega$ is called a sample point. In the context of stochastic processes, we will call it the state space.

Event and Event Space:  An event is a subset of the sample space. The class of all events associated with a given experiment is defined to be the event space.

An event will always be a subset of the sample space, but for sufficiently large sample spaces, not all subsets will be events. Thus the class of all subsets of the sample space will not necessarily correspond to the event space.

Random Variable: A random variable is a function that assigns a real number to each outcome of a random experiment. The outcomes occur according to a certain probability distribution, so a random variable is completely characterized by its distribution (for a continuous random variable, its probability density function, PDF). More formally:

A random variable is a map $X:\Omega \rightarrow \mathbb{R}$ such that $\{X \le x\} = \{\omega \in \Omega: X(\omega)\le x\} \in \mathfrak{F}$ for all $x \in \mathbb{R}$.

Probability Space: A probability space is a triple $(\Omega, \mathfrak{F}, P)$ consisting of three parts: the sample space, a collection of events, and a probability measure.

Cumulative Distribution Function (CDF): The probability distribution function $F$ of the random variable $X$, defined by $F(a) = P\{X \le a\}$.

Time: A point in time, either discrete or continuous.

State: It describes the attribute of a system at some point in time $S=(s_1, s_2, \cdots, s_t)$.

It is convenient to assign some unique non-negative integer as an index to each possible value of the state vector $S$.

Activity: Something that takes some amount of time (duration) to occur. The activity culminates in an event.

Transition:  Transition is caused by an event and it results in some movement from one state to another state.

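Probability Measure: A probability measure $P$ is a function that assigns a probability to each event. Ideally it would be defined for all subsets of $\Omega$, but for sufficiently large sample spaces it is defined only on the event space $\mathfrak{F}$.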

What is a Stochastic Process?

The word stochastic is derived from the Greek word "stokhazesthai", meaning "to aim at a target". Stochastic processes involve a state that changes randomly.
Given a probability space $(\Omega, \mathfrak{F}, P)$, a stochastic process $\{X(t), t\in T\}$ is a family of random variables, where the index set $T$ may be discrete $(T=\{0,1,2,\cdots\})$ or continuous $(T=[0, \infty))$. The set of possible values that the random variables $X(t)$, $t\in T$, may assume is called the state space of the process and is denoted by $S$.

A continuous-time stochastic process $\{X(t), t \in T\}$ with $T=[0, \infty)$ is said to have independent increments if, for all choices of $t_0 < t_1 < \cdots < t_n$, the $n$ random variables $X(t_1) - X(t_0), X(t_2) - X(t_1), \cdots, X(t_n)-X(t_{n-1})$ are independent. In discrete time, the state of the process at time $n+1$ depends only on its state at time $n$.

A stochastic process is often used to represent the evolution of some random value or system over time.

Examples of Stochastic Processes

Examples of processes modeled as stochastic time series include stock market and exchange rate fluctuations; signals such as speech, audio, and video; medical data such as a patient's EKG, EEG, blood pressure, or temperature; and random movement such as Brownian motion or random walks. Other examples are counting processes, renewal processes, Poisson processes, and Markov processes.

A stochastic process is a collection of random variables that evolve over time (or some other index). Stochastic processes are powerful tools for modeling real-world systems that exhibit randomness. They are used in a wide range of fields, including finance, engineering, physics, and even biology.

Standard Normal Table (2012)

A standard normal table, also called the unit normal table or Z-table, is a table for the values of Φ calculated mathematically, and these are the values from the cumulative normal distribution function. A standard normal distribution table is used to find the probability that a statistic is observed below, above, or between values on the standard normal distribution, and by extension, any normal distribution. Since probability tables cannot be printed for every normal distribution, as there is an infinite variety (families) of normal distributions, it is common practice to convert a normal to a standard normal and then use the standard normal table to find the required probabilities (area under the normal curve).

The standard normal curve is symmetrical, so the table can be used for values in either direction. For example, the area between 0 and a z-score of either -0.45 or +0.45 is 0.1736.

The Standard Normal distribution is used in various hypothesis testing procedures such as tests on single means, the difference between two means, and tests on proportions. The Standard Normal distribution has a mean of 0 and a standard deviation of 1.

The values inside the given table represent the areas under the standard normal curve for values between 0 and the corresponding z-score.

The table value for $Z$ is 1 minus the value of the cumulative normal distribution, that is, $P(Z > z) = 1 - \Phi(z)$.

Standard Normal Table (Area Under the Normal Curve)

[Table image: Standard Normal Table]

For example, the value for 1.96 is $P(Z>1.96) = 0.0250$.
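The areas quoted in this section can be checked numerically without a printed table. The sketch below assumes the standard identity $\Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$ and uses `math.erf` from the Python standard library; the function names `phi`, `upper_tail`, and `area_from_zero` are illustrative choices, not names from the article.

```python
import math


def phi(z):
    """Cumulative distribution function of the standard normal, P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def upper_tail(z):
    """Upper-tail area, P(Z > z) = 1 - phi(z)."""
    return 1.0 - phi(z)


def area_from_zero(z):
    """Area under the curve between 0 and z; by symmetry it depends only on |z|."""
    return phi(abs(z)) - 0.5


if __name__ == "__main__":
    print(round(upper_tail(1.96), 4))
    print(round(area_from_zero(0.45), 4))
    print(round(area_from_zero(-0.45), 4))
```

The three quantities are linked by $\Phi(z) = 0.5 + \text{(area between 0 and } z) = 1 - P(Z > z)$ for $z \ge 0$, so a table printed in any one of these forms can be converted to the others.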

Standard Normal Table (Summary)

  • A table of values for the cumulative distribution function (CDF) of the standard normal distribution.
  • The standard normal distribution has a mean of 0 and a standard deviation of 1.
  • This table shows the probability that a standard normal variable will be less than a certain value (z-score).

FAQs about Standard Normal Table

  1. What is a standard normal distribution table?
  2. What are the values of the mean and variance of the standard normal distribution?
  3. What is the cumulative distribution function of standard normal distribution?
  4. What kind of values are in the standard normal distribution table?
  5. Is the standard normal distribution curve symmetrical?
  6. What is meant by the area under the normal curve?
  7. What is the use of standard normal distribution?
  8. The values of $Z$ inside the standard normal table range from 0 to what value?

Histogram Graph: Useful Graphical Representation of Data

A histogram is very similar to a bar chart, but it displays a frequency distribution based on quantitative data, whereas a bar chart shows the distribution of qualitative data. It is a useful graphical representation of data that helps to visualize the distribution of data.

Important Points to Draw a Histogram Graph

The histogram is constructed from the grouped data by taking the class boundaries (not class limits) along the x-axis and the corresponding frequencies along the y-axis. For ungrouped data, we have to form the grouped frequency distribution before making a histogram. It consists of a set of bars (like a bar chart) but these bars are adjacent to each other and the height of the bars is proportional to the frequency associated with respective classes.

The area of each rectangle represents the respective class frequency. When the class intervals are equal, the rectangles all have the same width and their heights directly represent the class frequencies. When the class intervals are not all equal, the height of the rectangle (bar) over an unequal class interval has to be adjusted, because it is the area and not the height that measures frequency. This means that the height of a rectangle must be proportionally decreased if the length of the corresponding class interval increases.

For example, if the length of a class interval doubles, then the height of the rectangle is halved so that the area, being the fundamental property of the rectangle of the histogram, remains unchanged. This sort of rescaling is necessary to observe the correct pattern of distribution.

Important Features of Histogram

The important feature of the histogram graph is that there is no gap (space) between the vertical bars, because the variable plotted on the horizontal axis is quantitative and is measured on an interval or ratio scale. Thus, it provides an easily interpreted visual representation of a frequency distribution. Note that class midpoints are used as labels for the classes.

It allows us to analyze extremely large datasets by reducing them to a single graphical representation which is used to show primary, secondary, and tertiary peaks in data, and also helps us by giving a visual representation of the statistical significance of those peaks.

Alternative of Histogram

An alternative to the histogram is kernel density estimation, which uses a kernel to smooth samples. This will construct a smooth probability density function, which will, in general, more accurately reflect the underlying variable.
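To make the idea of smoothing samples with a kernel concrete, here is a minimal pure-Python sketch of a kernel density estimate. It assumes a Gaussian kernel and a hand-picked bandwidth; the function name `gaussian_kde`, the bandwidth of 0.5, and the sample values are all made up for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function that averages a Gaussian bump over the samples."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical sample values, used only to exercise the function.
data = [2.1, 2.4, 2.9, 3.0, 3.3, 4.8, 5.1]
f = gaussian_kde(data, bandwidth=0.5)

# A crude Riemann sum over a wide grid; the estimate should integrate to about 1.
grid_step = 0.01
grid = [i * grid_step for i in range(-500, 1500)]
area = sum(f(x) * grid_step for x in grid)
print(round(area, 3))
```

The bandwidth plays a role similar to the class width of a histogram: a small value follows the samples closely, while a large value gives a smoother but flatter curve.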

Histograms for Continuous Grouped Data

To draw a histogram graph from the continuous grouped frequency distribution, the following steps are taken.

  1. Mark the class boundaries of the classes along the x-axis.
  2. Mark frequencies along the y-axis.
  3. Draw a rectangle for each class such that the height of each rectangle is proportional to the frequency corresponding to that class. This is the case when classes are of equal width as they often are.
  4. If the classes are of unequal width, then the area instead of the height of each rectangle is proportional to the frequency corresponding to that class, and the height of each rectangle is obtained by dividing the frequency of the class by the width of that class.
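The steps above, including the adjustment for unequal class widths, can be sketched in a few lines of Python. The function name `histogram_heights` and the class boundaries and frequencies below are hypothetical values chosen for illustration.

```python
def histogram_heights(boundaries, frequencies):
    """Height of each rectangle as frequency divided by class width."""
    if len(boundaries) != len(frequencies) + 1:
        raise ValueError("need one more boundary than there are classes")
    heights = []
    for lower, upper, freq in zip(boundaries, boundaries[1:], frequencies):
        width = upper - lower
        if width <= 0:
            raise ValueError("class boundaries must be increasing")
        heights.append(freq / width)
    return heights


# Hypothetical grouped data with unequal class widths (10, 10, 20, 40).
boundaries = [0, 10, 20, 40, 80]
frequencies = [5, 10, 10, 8]

heights = histogram_heights(boundaries, frequencies)
# Multiplying each height by its class width recovers the class frequency.
areas = [h * (u - l) for h, l, u in zip(heights, boundaries, boundaries[1:])]
print(heights)
print(areas)
```

In this example the classes 10 to 20 and 20 to 40 have the same frequency, but the second is twice as wide, so its rectangle is half as tall and the two areas are equal.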

It may be noted that the area under a histogram graph can be calculated by adding up the areas of all the rectangles that constitute the histogram. When the height of each rectangle equals the class frequency, the area of one rectangle is obtained by multiplying the width of the class by the corresponding frequency, i.e.

Area of a single rectangle = width of the class × frequency of the class

Histogram for Discrete Data

Bar graphs are usually drawn for discrete and categorical data, but in some situations an approximation is needed and a histogram may be constructed instead. To construct a histogram graph for discrete grouped data, the following steps are taken:

  1. Mark possible values on the x-axis.
  2. Mark frequencies along the y-axis.
  3. Draw a rectangle centered on each value, with equal width on each side (possibly 0.5 to either side of the value).
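The construction for discrete data can be sketched as follows: count each distinct value and give its rectangle edges half a unit to either side. The function name `discrete_histogram` and the data values are assumptions made for illustration only.

```python
from collections import Counter


def discrete_histogram(values, half_width=0.5):
    """Return (left edge, right edge, frequency) for each distinct value."""
    counts = Counter(values)
    return [
        (v - half_width, v + half_width, counts[v])
        for v in sorted(counts)
    ]


# Hypothetical discrete observations.
data = [1, 2, 2, 3, 3, 3, 5]
bars = discrete_histogram(data)
for left, right, freq in bars:
    print(left, right, freq)
```

With a half-width of 0.5, rectangles over consecutive integers touch each other, while a value that never occurs (4 in this example) leaves a visible gap.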

The advantages of the histograms as compared to the unprocessed data are:

  1. It gives a range of data.
  2. It gives the location of the data.
  3. It gives a clue about the skewness of the data.
  4. It gives information about the out-of-control situation.
  5. Histograms are density estimates (they give a good impression of the distribution of data).
  6. It can be compared to the normal curve.

The disadvantages are:

  • Exact values cannot be read from a histogram graph because the data are grouped into classes, and the individuality of the data vanishes in grouped data.
  • It is more difficult to compare two data sets.
  • It is primarily suited to continuous data; discrete data require the approximation described above.

FAQs about Histogram

  1. What is a histogram graph?
  2. What is the difference between a bar chart and a histogram?
  3. What are the important features of histograms?
  4. What are the advantages and disadvantages of histogram graphs?
  5. How can one draw a histogram for a discrete data set?
  6. How can one draw a histogram for a continuous data set?