Time series Forecasting - Statistics for Data Science & Analytics

Time series analysis is the analysis of a series of data points ordered in time. It allows one to answer questions such as: what is the causal effect on a variable $Y$ of a change in a variable $X$ over time? An important difference between time series and cross-sectional data is that the ordering of observations matters in time series.

Time Series

A time series $\{Y_t\}$ or $\{y_1,y_2,\cdots,y_T\}$ is a discrete-time, continuous-state process, where the time points $t=1,2,\cdots,T$ are discrete and spaced at uniform intervals.

Usually, observations are taken at more or less equally spaced intervals such as hours, days, months, quarters, or years. More specifically, a time series is a set of data in which observations are arranged in chronological order (a set of repeated observations of the same variable).
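As a minimal sketch of this definition, the snippet below pairs equally spaced monthly time points with repeated observations of one variable, in chronological order. The helper `monthly_index` and the numbers in `values` are made up for illustration and are not taken from any real data set.

```python
from datetime import date


def monthly_index(start_year, start_month, periods):
    """Return `periods` first-of-month dates, one per month, in chronological order."""
    dates = []
    for k in range(periods):
        months = (start_month - 1) + k          # months elapsed since January of start_year
        dates.append(date(start_year + months // 12, months % 12 + 1, 1))
    return dates


# Hypothetical monthly observations of a single variable (made-up numbers).
values = [112.0, 118.0, 132.0, 129.0, 121.0, 135.0]
index = monthly_index(2020, 10, len(values))

# A time series pairs each time point t with its observation y_t.
series = list(zip(index, values))

for t, (when, y) in enumerate(series, start=1):
    print(f"t={t}  {when.isoformat()}  y_t={y}")
```

Storing the time point next to each value keeps the ordering explicit, which is the property that separates a time series from cross-sectional data.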

Use of Time Series

Time series are used in different fields of science, such as statistics, signal processing, pattern recognition, econometrics, mathematical finance, weather forecasting, earthquake prediction, electroencephalography, control engineering, astronomy, and communications engineering, among many other fields.

Definition: A sequence of random variables indexed by time is called a stochastic process (stochastic means random) or time series for mere mortals. A data set is one possible outcome (realization) of the stochastic process. If history had been different, we would observe a different outcome; thus, we can think of time series as the outcome of a random variable.
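The idea that an observed data set is only one possible realization can be illustrated with a small simulation. The sketch below assumes a Gaussian random walk, $y_t = y_{t-1} + e_t$, purely as an example process; the function name `realization` and the use of a seed to stand in for "a different history" are illustrative choices, not part of the definition above.

```python
import random


def realization(T, seed, sigma=1.0):
    """Draw one path y_1..y_T from a Gaussian random walk y_t = y_{t-1} + e_t."""
    rng = random.Random(seed)      # the seed stands in for "one possible history"
    y, path = 0.0, []
    for _ in range(T):
        y += rng.gauss(0.0, sigma)  # e_t ~ N(0, sigma^2)
        path.append(y)
    return path


# Same stochastic process, two different histories.
history_a = realization(T=5, seed=1)
history_b = realization(T=5, seed=2)
print([round(v, 2) for v in history_a])
print([round(v, 2) for v in history_b])
```

Both lists come from the same process, yet they differ value by value. In practice we only ever observe one such path.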

Rather than dealing with individuals as units, the unit of interest is time: the value of $Y$ at time $t$ is $Y_t$. The unit of time can be anything from days to election years. The value of $Y$ in the previous period is called the first lag value: $Y_{t-1}$. The $j$th lag is denoted $Y_{t-j}$. Similarly, $Y_{t+1}$ is the value of $Y$ in the next period. So a simple bivariate regression equation for time series data looks like: \[Y_t = \beta_0 + \beta X_t + u_t\] where $u_t$ is the error term.
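The lag notation and the bivariate regression can be sketched in a few lines of plain Python. The helpers `lag` and `ols_bivariate` are hypothetical names introduced here, and the data are made-up numbers. The slope and intercept use the textbook least-squares formulas.

```python
def lag(y, j):
    """Return the j-th lag of y: position t holds y[t-j], or None where undefined.

    A negative j gives a lead, e.g. lag(y, -1) holds the next period's value.
    """
    n = len(y)
    return [y[t - j] if 0 <= t - j < n else None for t in range(n)]


def ols_bivariate(x, y):
    """Least-squares estimates (beta0, beta) for y_t = beta0 + beta * x_t + u_t."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta = s_xy / s_xx
    beta0 = y_bar - beta * x_bar
    return beta0, beta


# Made-up series for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]

print("Y_{t-1}:", lag(Y, 1))    # first lag
print("Y_{t-2}:", lag(Y, 2))    # second lag
print("Y_{t+1}:", lag(Y, -1))   # next-period value

beta0_hat, beta_hat = ols_bivariate(X, Y)
print("beta0_hat =", round(beta0_hat, 3), " beta_hat =", round(beta_hat, 3))
```

The `None` entries mark periods where the lagged or leading value falls outside the sample, so those periods drop out of any regression that uses the lag.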

Continuous Time Series

A time series is said to be continuous when observations are made continuously in time. The term continuous is used for a series of this type even when the measured variable can only take a discrete set of values.

Discrete Time Series

A time series is said to be discrete when observations are taken at specific times, usually equally spaced. The term discrete is used for a series of this type even when the measured variable is a continuous variable.

Most macroeconomic and financial data come in the form of time series. GNP and stock returns are examples of time series data.

We can write a series as $\{x_1,x_2,x_3,\cdots,x_T\}$ or $\{x_t\}$, where $t=1,2,3,\cdots,T$. $x_t$ is treated as a random variable.

Time Series Analysis

Time series analysis refers to the branch of statistics that deals with observations collected sequentially in time, usually but not necessarily at equally spaced time points. Notationally, the only arcane difference between time series and other variables is the use of time subscripts.

Time series analysis comprises methods for analyzing time series data to extract useful (meaningful) statistics and other characteristics of the data, while time series forecasting is the use of a model to predict future values based on previously observed values.
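To make "predict future values based on previously observed values" concrete, here is a minimal sketch of two very simple baseline forecasts. These baselines, the function names, and the sample data are assumptions chosen for illustration. They are not presented as the method of any particular model discussed in this article.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value for each of the next `horizon` periods."""
    return [history[-1]] * horizon


def moving_average_forecast(history, horizon, window=3):
    """Forecast each future period as the mean of the most recent `window` values.

    Each forecast is appended to the working series, so later forecasts
    are built partly from earlier forecasts.
    """
    extended = list(history)
    forecasts = []
    for _ in range(horizon):
        recent = extended[-window:]
        f = sum(recent) / len(recent)
        forecasts.append(f)
        extended.append(f)
    return forecasts


# Made-up observations y_1..y_4.
observed = [10.0, 12.0, 14.0, 16.0]

print(naive_forecast(observed, horizon=2))
print(moving_average_forecast(observed, horizon=2, window=3))
```

Baselines like these are mainly useful as a yardstick: a more elaborate forecasting model should at least do better than repeating the last value.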

Given an observed time series, the first step in the analysis is to plot the series on a graph, taking time ($t$) along the X-axis (as the independent variable) and the observed value ($Y_t$) along the Y-axis (as the dependent variable). Such a graph will show various types of fluctuations and other points of interest.

Note

$Y_t$ is treated as a random variable. If $Y_t$ is generated by some model (a regression model for time series, i.e., $Y_t=x_t\beta +\varepsilon_t$ with $E(\varepsilon_t|x_t)=0$), then ordinary least squares (OLS) provides a consistent estimate of $\beta$.
The term time series is used interchangeably for a sample $\{x_t\}$ and for a probability model of that sample. A possible probability model for the joint distribution of a time series $\{x_t\}$ is $x_t=\varepsilon_t$, $\varepsilon_t\sim \text{iid } N(0,\sigma_\varepsilon^2)$.
Time series are typically not iid (independent and identically distributed), e.g., if GNP today is unusually high, GNP tomorrow is also likely to be unusually high.
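The contrast between the iid model $x_t=\varepsilon_t$ and a persistent series can be checked numerically. The sketch below simulates Gaussian white noise and, as an assumed example of persistence, a first-order autoregression $x_t = \phi x_{t-1} + \varepsilon_t$ with $\phi = 0.9$. It then compares their lag-1 sample autocorrelations. The autoregressive model, the parameter values, and all function names are illustrative assumptions, not taken from the notes above.

```python
import random


def white_noise(T, sigma, rng):
    """x_t = e_t with e_t ~ iid N(0, sigma^2)."""
    return [rng.gauss(0.0, sigma) for _ in range(T)]


def ar1(T, phi, sigma, rng):
    """x_t = phi * x_{t-1} + e_t, started at zero (assumed persistent model)."""
    x, prev = [], 0.0
    for _ in range(T):
        prev = phi * prev + rng.gauss(0.0, sigma)
        x.append(prev)
    return x


def lag1_autocorr(x):
    """Sample correlation between x_t and x_{t-1}."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


rng = random.Random(0)   # fixed seed so the run is reproducible
iid_series = white_noise(2000, 1.0, rng)
persistent_series = ar1(2000, 0.9, 1.0, rng)

print("white noise lag-1 autocorrelation:", round(lag1_autocorr(iid_series), 3))
print("AR(1) lag-1 autocorrelation:      ", round(lag1_autocorr(persistent_series), 3))
```

For the white noise series the autocorrelation should be close to zero, while for the persistent series it should be close to $\phi$. That is the numerical counterpart of "if GNP today is unusually high, GNP tomorrow is also likely to be unusually high".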
