Statistics for Data Science & Analytics - Learn Statistics: MCQs, Software & Data Analysi

Objectives of Time Series Analysis (2014)

Jul 13, 2024 / Apr 9, 2014 by Muhammad Imdad Ullah


There are many objectives of time series analysis. One of the major objectives is to identify the underlying structure of a time series, represented by a sequence of observations, by breaking it down into its components (secular trend, seasonal variation, cyclical variation, and irregular variation).
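
As a rough illustration of what this breakdown can look like, two forms commonly used in textbooks combine the components either additively or multiplicatively. The symbols are assumptions introduced here, not notation from the post: $Y_t$ is the observed value at time $t$, and $T_t$, $S_t$, $C_t$, $I_t$ are the trend, seasonal, cyclical, and irregular components.
\[Y_t = T_t + S_t + C_t + I_t\]
\[Y_t = T_t \times S_t \times C_t \times I_t\]
The additive form is usually considered when the size of the seasonal swings stays roughly constant, and the multiplicative form when the swings grow with the level of the series.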

Objectives of Time Series Analysis

The objectives of Time Series Analysis are classified as follows:

Description
Explanation
Prediction
Control

Each of these objectives is described below.

Description of Time Series Analysis

The first step in the analysis is to plot the data and obtain simple descriptive measures of the main properties of the series, such as trends and seasonal fluctuations. In the figure below, there is a regular seasonal pattern of price change, although the pattern is not perfectly consistent. The graph also makes it possible to look for “wild” observations or outliers (values that do not appear to be consistent with the rest of the data) and to detect turning points, where an upward trend suddenly changes to a downward trend. If there is a turning point, different models may have to be fitted to the two parts of the series.
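
The descriptive step can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the post: it assumes a centered moving average as a crude trend estimate and a simple "more than k standard deviations from the mean" rule for flagging candidate outliers. The function names and the price values are hypothetical.

```python
from statistics import mean, stdev


def centered_moving_average(series, window):
    """Smooth a series with an odd-length centered moving average.

    Positions that lack a full window are returned as None.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = [None] * len(series)
    for t in range(half, len(series) - half):
        smoothed[t] = mean(series[t - half : t + half + 1])
    return smoothed


def flag_outliers(series, threshold=3.0):
    """Return the indices of values lying more than `threshold` sample
    standard deviations away from the mean of the series."""
    centre, spread = mean(series), stdev(series)
    if spread == 0:
        return []
    return [t for t, y in enumerate(series) if abs(y - centre) > threshold * spread]


# Hypothetical prices, invented purely for illustration.
prices = [10, 12, 15, 11, 11, 13, 16, 12, 12, 14, 40, 13]

trend = centered_moving_average(prices, window=3)
suspects = flag_outliers(prices, threshold=2.5)
print("smoothed:", trend)
print("candidate outliers at positions:", suspects)
```

A flagged point is only a candidate: whether it is a recording error or a genuine event still has to be judged from the plot and from knowledge of the data.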

Explanation

When observations are taken on two or more variables, it may be possible to use the variation in one time series to explain the variation in another series. This may lead to a deeper understanding of the series. A multiple regression model may be helpful in this case.
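
As a minimal sketch of the idea, the following Python code fits a least-squares line that explains one series by another. It is deliberately simpler than the multiple regression mentioned above (one explanatory series only), it ignores autocorrelation in the errors, and the data and the names `advertising` and `sales` are hypothetical.

```python
from statistics import mean


def fit_simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x for two
    equal-length series observed at the same time points."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x shows no variation, so the slope is undefined")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical series: advertising spend and sales over six periods.
advertising = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sales = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1]

intercept, slope = fit_simple_regression(advertising, sales)
print(f"sales is approximately {intercept:.2f} + {slope:.2f} * advertising")
```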

Prediction

Given an observed time series, one may want to predict the future values of the series. This is an important task in sales forecasting and in the analysis of economic and industrial time series. The terms prediction and forecasting are used interchangeably.
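
One of the simplest forecasting techniques is simple exponential smoothing, sketched below. The post does not prescribe a method, so this is an assumed illustration only. The smoothing constant `alpha` and the sales figures are hypothetical, and the method suits series without a marked trend or seasonal pattern.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast made after the last observation.

    The smoothed level starts at the first observation and is updated as
    level = alpha * y + (1 - alpha) * level for each later observation y.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in the interval (0, 1]")
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


# Hypothetical monthly sales figures.
monthly_sales = [120, 132, 128, 141, 150, 147]

forecast = simple_exponential_smoothing(monthly_sales, alpha=0.3)
print(f"forecast for the next month: {forecast:.1f}")
```

A larger `alpha` makes the forecast react faster to recent observations, while a smaller value averages over a longer history.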

Control

When a time series is generated to measure the quality of a manufacturing process, the aim may be to control the process. Control procedures are of several different kinds. In quality control, the observations are plotted on a control chart and the controller takes action as a result of studying the chart. Alternatively, a stochastic model is fitted to the series, future values of the series are predicted, and the input process variables are then adjusted to keep the process on target.
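
The control-chart idea can be sketched as follows. This is a simplified illustration that assumes limits at the mean plus or minus three sample standard deviations of an in-control reference sample. Practical control charts often estimate the spread differently (for example from subgroup ranges). All measurements and names below are hypothetical.

```python
from statistics import mean, stdev


def control_limits(reference, k=3.0):
    """Lower limit, centre line, and upper limit estimated from
    reference data assumed to come from an in-control process."""
    centre = mean(reference)
    spread = stdev(reference)
    return centre - k * spread, centre, centre + k * spread


def out_of_control(observations, lower, upper):
    """Indices of observations that fall outside the control limits."""
    return [t for t, y in enumerate(observations) if y < lower or y > upper]


# Hypothetical measurements from a manufacturing process.
reference = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
lower, centre, upper = control_limits(reference)

new_batch = [10.0, 10.3, 9.7, 11.2, 10.1]
signals = out_of_control(new_batch, lower, upper)
print(f"limits: [{lower:.2f}, {upper:.2f}], signals at positions {signals}")
```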

Figure: seasonal effects. Image taken from: http://archive.stats.govt.nz



Primary and Secondary Data (2014)

May 2, 2024 / Mar 22, 2014 by Muhammad Imdad Ullah


Data

Before learning about primary and secondary data, let us first understand the term data in statistics.

Statistics studies facts and figures that can be measured numerically. A numerical measurement of a characteristic is known as an observation, and a collection of observations is termed data. Data are collected by individual research workers or by organizations through sample surveys or experiments, keeping in view the objectives of the study. The data collected may be (i) primary data or (ii) secondary data.

Primary and Secondary Data in Statistics

The difference between primary and secondary data in statistics is that primary data is collected firsthand by a researcher (an organization, person, authority, agency, or other party) through experiments, surveys, questionnaires, focus groups, interviews, and direct measurements, while secondary data has already been collected by someone else and is available to the public through publications, journals, and newspapers.

Primary Data

Primary data means raw data (data that has not been processed or tailored) that has just been collected from the source and has not gone through any kind of statistical treatment such as sorting and tabulation. The term primary data is sometimes also used to refer to first-hand information.

Sources of Primary Data

The sources of primary data are primary units such as basic experimental units, individuals, and households. The following methods are commonly used to collect data from primary units; the choice of method depends on the nature of the primary unit. Published data and data collected in the past are called secondary data.

Personal Investigation
The researcher personally conducts the experiment or survey and collects the data. Data collected in this way is generally accurate and reliable. This method is feasible only for small-scale laboratory experiments, field experiments, or pilot surveys; it is not practicable for large-scale experiments and surveys because it takes too much time.
Through Investigators
Trained (experienced) investigators are employed to collect the required data. In the case of surveys, they contact the individuals and fill in the questionnaires after asking for the required information. A questionnaire is an inquiry form containing questions designed to obtain information from the respondents. This method is employed by most organizations and gives reasonably accurate information, but it is very costly and may also be time-consuming.
Through Questionnaire
The required information is obtained by sending a questionnaire (in printed or electronic form) by mail to the selected individuals (respondents), who fill it in and return it to the investigator. This method is relatively cheap compared with the “through investigators” method, but the non-response rate is very high because most respondents do not bother to fill in the questionnaire and send it back.
Through Local Sources
Local representatives or agents are asked to send the requisite information, which they provide based on their own experience. This method is quick, but it gives only rough estimates.
Through Telephone
The information may be obtained by contacting individuals by telephone. This method is quick and can provide the required information accurately.
Through Internet
With the introduction of information technology, people may be contacted through the Internet and asked to provide pertinent information. Google Survey is widely used as an online method of data collection nowadays, and there are many paid online survey services too.

It is important to go through the primary data and locate any inconsistent observations before it is given a statistical treatment.
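
A first screening pass of this kind might look like the sketch below. It assumes the primary data is held as a list of records and that plausible ranges for each field are known in advance. The field names, ranges, and responses are hypothetical and only illustrate the idea of locating missing or implausible values before analysis.

```python
def find_inconsistent(records, valid_ranges):
    """Return (row_index, field, value) for every value that is
    missing or falls outside the plausible range for its field."""
    problems = []
    for i, record in enumerate(records):
        for field, (low, high) in valid_ranges.items():
            value = record.get(field)
            if value is None or not (low <= value <= high):
                problems.append((i, field, value))
    return problems


# Hypothetical survey responses.
responses = [
    {"age": 34, "household_size": 4},
    {"age": 230, "household_size": 3},     # implausible age
    {"age": 41, "household_size": None},   # missing answer
]
plausible = {"age": (0, 120), "household_size": (1, 30)}

issues = find_inconsistent(responses, plausible)
for row, field, value in issues:
    print(f"row {row}: check {field!r} (recorded value: {value})")
```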

Secondary Data

Secondary data is data that has already been collected by someone else; it may have been sorted, tabulated, and subjected to statistical treatment. It is processed or tailored data.

Sources of Secondary Data

The secondary data may be available from the following sources:

Government Organizations
Federal and Provincial Bureaus of Statistics, Crop Reporting Service (Agriculture Department), Census and Registration Organization, etc.
Semi-Government Organizations
Municipal committees, district councils, and commercial and financial institutions such as banks
Teaching and Research Organizations
Research Journals and Newspapers
Internet


Markov Chain: An Introduction (2014)

May 8, 2024 / Mar 2, 2014 by Muhammad Imdad Ullah


A Markov chain, named after Andrey Markov, is a mathematical system that undergoes transitions from one state to another among a finite or countable number of possible states. A Markov chain is a random process usually characterized as memoryless: the next state depends only on the current state and not on the sequence of events that preceded it. This specific kind of memorylessness is called the Markov property. Markov chains have many applications as statistical models of real-world processes.

If the random variables $X_{n-1}$ and $X_n$ take the values $X_{n-1}=i$ and $X_n=j$, then the system has made a transition $S_i \rightarrow S_j$, that is, a transition from state $S_i$ to state $S_j$ at the $n$th trial. Note that $i$ can equal $j$, so a transition from a state to itself is possible. We need to assign probabilities to the transitions $S_i \rightarrow S_j$. For a general random process, the probability that $X_n=j$ may depend on the whole sequence of random variables starting with the initial value $X_0$.

The Markov chain has the characteristic property that the probability that $X_n=j$ depends only on the immediately previous state of the system. This means that, at each step, the only information we need is, for each $i$ and $j$, \[P\{X_n=j|X_{n-1}=i\}\]
which is the probability that $X_n=j$ given that $X_{n-1}=i$. This probability is independent of the values of $X_{n-2},X_{n-3},\cdots, X_0$.

Consider a set of states $S=\{s_1,s_2,\cdots,s_n\}$. The process starts in one of these states and moves successively from one state to another. Each move is called a step. If the chain is currently in state $s_i$, then it moves to state $s_j$ at the next step with probability $p_{ij}$, and this probability does not depend upon which states the chain was in before the current state. The probabilities $p_{ij}$ are called transition probabilities ($s_i \xrightarrow[]{p_{ij}} s_j$). The process can also remain in its current state, which occurs with probability $p_{ii}$.
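
A minimal Python sketch of these ideas is given below. It assumes the transition probabilities are stored as a list of rows, so that `P[i][j]` plays the role of $p_{ij}$, and it numbers the states from 0 for convenience. The three-state matrix and the function names are hypothetical.

```python
import random


def validate_transition_matrix(P, tol=1e-9):
    """Check that P is square, non-negative, and that every row sums to 1."""
    n = len(P)
    for row in P:
        if len(row) != n:
            raise ValueError("the matrix must be square")
        if any(p < 0 for p in row):
            raise ValueError("probabilities must be non-negative")
        if abs(sum(row) - 1.0) > tol:
            raise ValueError("each row must sum to 1")


def step(P, current, rng=random):
    """Sample the next state using only the current state (row `current`)."""
    return rng.choices(range(len(P)), weights=P[current])[0]


def simulate(P, start, n_steps, seed=None):
    """Generate a path of n_steps moves beginning in state `start`."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        path.append(step(P, path[-1], rng))
    return path


# Hypothetical three-state chain; P[i][i] is the probability of staying put.
P = [
    [0.9, 0.1, 0.0],
    [0.2, 0.5, 0.3],
    [0.0, 0.4, 0.6],
]
validate_transition_matrix(P)
path = simulate(P, start=0, n_steps=10, seed=42)
print(path)
```

Note that `step` never looks at earlier entries of the path, which is exactly the memoryless behaviour described above.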

An initial probability distribution, defined on $S$ specifies the starting state. Usually, this is done by specifying a particular state as the starting state.

A Markov chain is a sequence of random variables $X_1, X_2,\cdots$ with the Markov property that, given the present state, the future and past states are independent. Thus
\[P(X_n=x|X_1=x_1,X_2=x_2,\cdots,X_{n-1}=x_{n-1})=P(X_n=x|X_{n-1}=x_{n-1})\]
Writing $x_{n-1}=i$ and $x=j$, the right-hand side is the transition probability
\[P(X_n=j|X_{n-1}=i)\]
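
One consequence worth spelling out, added here as a worked step rather than taken from the post, is that the joint probability of a whole path factorizes into one-step terms. Applying the chain rule of conditional probability and then the Markov property to each factor gives
\[P(X_1=x_1,X_2=x_2,\cdots,X_n=x_n)=P(X_1=x_1)\prod_{k=2}^{n}P(X_k=x_k|X_{k-1}=x_{k-1})\]
so the initial distribution together with the one-step transition probabilities is enough to compute the probability of any finite path.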

Example: Markov Chain

A Markov chain $X$ on $S=\{0,1\}$ is determined by the initial distribution $p_0=P(X_0=0), \; p_1=P(X_0=1)$ and the one-step transition probabilities $p_{00}=P(X_{n+1}=0|X_n=0)$, $p_{10}=P(X_{n+1}=0|X_n=1)$, $p_{01}=1-p_{00}$ and $p_{11}=1-p_{10}$. With $p_{ij}$ placed in row $i$ and column $j$, the one-step transition probabilities in matrix form are $P=\begin{pmatrix}p_{00}&p_{01}\\p_{10}&p_{11}\end{pmatrix}$, so that each row sums to 1.
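
The two-state example can be explored numerically with the sketch below. It assumes the convention that `P[i][j]` holds $p_{ij}$ (the probability of moving from state $i$ to state $j$), and the values chosen for `p00` and `p10` are hypothetical.

```python
def step_distribution(dist, P):
    """Advance a probability distribution over the states by one step:
    new_dist[j] = sum over i of dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def distribution_after(dist, P, n_steps):
    """Distribution over the states after n_steps transitions."""
    for _ in range(n_steps):
        dist = step_distribution(dist, P)
    return dist


# Hypothetical one-step transition probabilities for states 0 and 1.
p00, p10 = 0.7, 0.4
P = [
    [p00, 1 - p00],   # from state 0: p00, p01
    [p10, 1 - p10],   # from state 1: p10, p11
]

initial = [1.0, 0.0]  # the chain starts in state 0 with certainty
after_one = distribution_after(initial, P, 1)
after_many = distribution_after(initial, P, 50)
print("after 1 step:  ", after_one)
print("after 50 steps:", after_many)
```

For these particular values the distribution settles down after many steps, and the probability of state 0 approaches $p_{10}/(p_{01}+p_{10})$, which the code can be used to check.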

Markov chains are a powerful tool for modeling various random processes. However, it’s important to remember that they assume the Markov property, which may not always hold true in real-world scenarios.

Applications of Markov Chains

Information Theory: Used to model information sources and in data compression, for example in the Lempel-Ziv-Markov chain algorithm (LZMA).
Search Algorithms: Applied in recommender systems and website navigation analysis.
Queueing Theory: Helps model customer arrivals and service times in queues.
Financial Modeling: Financial analysts can use Markov chains to model stock prices or economic trends.
Game Design: Markov chains can be used to create video games with more realistic and interesting behavior for non-player characters.
Predictive Text: Smartphones that suggest the next word you are typing use a kind of Markov chain, where the probability of the next word depends on the current word.
Modeling weather: Markov chains can be used to represent the probabilities of transitioning between different weather states.
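
The predictive-text application can be illustrated with a toy word-level model. This is only a sketch under simple assumptions (whitespace tokenization, lower-casing, and a tiny made-up corpus); real keyboards use far richer models. The function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_bigram_model(text):
    """Count, for each word, how often each word follows it."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def transition_probabilities(counts, word):
    """Estimated probabilities of the next word given the current word."""
    followers = counts.get(word.lower(), Counter())
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()} if total else {}


def suggest_next(counts, word, k=3):
    """Return up to k of the most frequent successors of `word`."""
    followers = counts.get(word.lower(), Counter())
    return [w for w, _ in followers.most_common(k)]


# Tiny made-up corpus, for illustration only.
corpus = "the cat sat on the mat the cat ran"
model = build_bigram_model(corpus)

print(suggest_next(model, "the"))
print(transition_probabilities(model, "the"))
```

Each word acts as a state, and the relative frequencies of the words that follow it act as estimated transition probabilities out of that state.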

