Currently working as an Assistant Professor of Statistics at Ghazi University, Dera Ghazi Khan.
Completed my Ph.D. in Statistics at the Department of Statistics, Bahauddin Zakariya University, Multan, Pakistan.
I like Applied Statistics, Mathematics, and Statistical Computing.
Statistical and mathematical software used: SAS, STATA, Python, GRETL, EVIEWS, R, SPSS, and VBA in MS-Excel.
I like to use LaTeX typesetting for composing articles, theses, etc.
Here we will discuss the graphical representation of time series data, called historigram.
As discussed in the introduction to Time Series, the first step in analyzing an observed time series is to plot it on a graph, taking time intervals ($t$) along the X-axis (as the independent variable) and the observed values ($Y_t$) on the Y-axis (as the dependent variable, that is, as a function of time). Such a graph will show various types of fluctuations and other points of interest.
A historigram is a graphical representation of a time series that reveals the changes that occurred at different time periods. The first step in the prediction (or forecast) of a time series involves an examination of the set of past observations, and the historigram may be a useful tool for this. Its construction involves the following steps:
Use an appropriate scale and take time $t$ along the $x$-axis as an independent variable.
Use an appropriate scale, and plot the observed values of variable $Y$ as a dependent variable against the given points of time.
Join the plotted points by line segments to get the required graphical representation.
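The three construction steps above can be sketched in Python. This is only an illustration: the function names `historigram_segments` and `draw_historigram` are hypothetical, the numbers are made up (they are not census data), and the plotting helper assumes that matplotlib is installed.

```python
def historigram_segments(times, values):
    """Return the line segments that join successive (t, y) points."""
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    points = sorted(zip(times, values))          # chronological order
    return list(zip(points[:-1], points[1:]))    # consecutive pairs of points


def draw_historigram(times, values, path="historigram.png"):
    """Plot the series with time on the x-axis and save it to a file.

    Assumes matplotlib is available; it is imported here so that the
    rest of the sketch runs without it.
    """
    import matplotlib
    matplotlib.use("Agg")                        # render to a file, no window
    import matplotlib.pyplot as plt

    t, y = zip(*sorted(zip(times, values)))
    fig, ax = plt.subplots()
    ax.plot(t, y, marker="o")                    # points joined by line segments
    ax.set_xlabel("Time (t)")
    ax.set_ylabel("Observed value (Y_t)")
    ax.set_title("Historigram")
    fig.savefig(path)
    plt.close(fig)


# Made-up illustrative values, not real data.
times = [1, 2, 3, 4, 5]
values = [12.0, 15.5, 14.0, 18.2, 21.0]
segments = historigram_segments(times, values)
```

Calling `draw_historigram(times, values)` would save the chart as an image; `segments` holds the four line segments that connect the five plotted points.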
Historigram Example
Draw a graphical representation of the data to show the population of Pakistan in various census years.
The sequence $y_1,y_2,\cdots, y_n$ of $n$ observations of a variable (say $Y$), recorded in accordance with their time of occurrence $t_1, t_2, \cdots, t_n$, is called a time series. Symbolically, the variable $Y$ can be expressed as a function of time $t$ as
$$y = f(t) + e,$$
where $f(t)$ is a completely determined component (or a specified sequence) that follows some systematic pattern of variation, and $e$ is a random error (probabilistic component) that follows an irregular pattern of variation. These two components are called the signal and the noise:
Signal: The signal is a systematic component of variation in a time series.
Noise: The noise is an irregular component of variation in a time series.
Examples of time series include:
The hourly temperature recorded at a weather bureau,
The total annual yield of wheat over a number of years,
The monthly sales of fertilizer at a store,
The enrollment of students in various years in a college,
The daily sales at a departmental store, etc.
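The decomposition $y = f(t) + e$ can be illustrated with a small simulation. The sketch below assumes, purely for illustration, a linear signal $f(t) = a + bt$ and Gaussian noise; the function name `simulate_series` and all parameter values are hypothetical choices, not part of the original discussion.

```python
import random


def simulate_series(n, intercept=10.0, slope=0.5, noise_sd=1.0, seed=0):
    """Simulate y_t = f(t) + e_t with an assumed linear signal f(t) = a + b*t."""
    rng = random.Random(seed)                        # seeded for reproducibility
    signal = [intercept + slope * t for t in range(1, n + 1)]   # systematic part
    noise = [rng.gauss(0.0, noise_sd) for _ in range(n)]        # irregular part
    series = [s + e for s, e in zip(signal, noise)]             # observed values
    return signal, noise, series


signal, noise, series = simulate_series(n=12)
```

Setting `noise_sd=0.0` leaves only the signal, which makes it easy to see how much of the observed variation is systematic.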
Time Series
A time series $\{Y_t\}$ or $\{y_1,y_2,\cdots,y_T\}$ is a discrete-time, continuous-state process where the times $t=1,2,\cdots,T$ are certain discrete time points spaced at uniform time intervals.
A sequence of random variables indexed by time is called a stochastic process (stochastic means random). A data set is one possible outcome (realization) of the stochastic process. If history had been different, we would have observed a different outcome; thus, we can think of a time series as the outcome of a random variable.
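The idea of a realization can be sketched as follows. The process here is assumed to be Gaussian white noise only for simplicity, and the function name `realization` is hypothetical; each seed stands for one possible "history".

```python
import random


def realization(seed, T=8):
    """Draw one realization (y_1, ..., y_T) of an assumed white-noise process."""
    rng = random.Random(seed)                    # the seed plays the role of "history"
    return [rng.gauss(0.0, 1.0) for _ in range(T)]


path_a = realization(seed=1)   # the outcome we happened to observe
path_b = realization(seed=2)   # the outcome under a different history
```

Both lists come from the same stochastic process, yet they differ; an observed data set corresponds to just one such list.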
Usually, time is taken at more or less equally spaced intervals such as minutes, hours, days, months, quarters, years, etc. More specifically, it is a set of data in which observations are arranged in chronological order (A set of repeated observations of the same variable arranged according to time).
Time series analysis is performed in many fields of science, such as signal processing, pattern recognition, econometrics, mathematical finance, weather forecasting, earthquake prediction, electroencephalography, control engineering, astronomy, and communications engineering.
Continuous Time Series
A time series is said to be continuous when observations are made continuously in time. The term continuous is used for a series of this type even when the measured variable can only take a discrete set of values.
Discrete Time Series
A time series is said to be discrete when observations are taken at specific times, usually equally spaced. The term discrete is used for a series of this type even when the measured variable is continuous.
We can write a series as $\{x_1,x_2,x_3,\cdots,x_T\}$ or $\{x_t\}$, where $t=1,2,3,\cdots,T$. $x_t$ is treated as a random variable. The arcane difference between time-series variables and other variables is the use of subscripts.
Time series analysis comprises methods for analyzing time-series data to extract some useful (meaningful) statistics and other characteristics of the data, while time-series forecasting is the use of a model to predict future values based on previously observed values.
The first step in analyzing time-series data is to plot the given series on a graph taking time intervals ($t$) along the $X$-axis (as an independent variable) and the observed value ($Y_t$) on the $Y$-axis (as dependent variable). Such a graph will show various types of fluctuations and other points of interest.
The term “homoscedasticity” refers to the assumption about the random variable $u$ (the error term) that its probability distribution remains the same for all observations of $X$, and in particular that the variance of each $u$ is the same for all values of the explanatory variables, i.e., the variance of the errors is constant across all levels of the independent variables. Symbolically, it can be represented as
$$Var(u_i) = E[u_i - E(u_i)]^2 = E(u_i^2) = \sigma_u^2 = \text{constant}.$$
This assumption is known as the assumption of homoscedasticity, or the assumption of constant variance of the error terms $u_i$. It means that the variation of each $u_i$ around its zero mean does not depend on the values of $X$ (the independent variable), because the error term expresses the influence on the dependent variable due to:
Errors in measurement: The errors of measurement tend to be cumulative over time. It is also difficult to collect the data and check its consistency and reliability. So the variance of $u_i$ increases with increasing values of $X$.
Omitted variables: Variables omitted from the function (regression model) tend to change in the same direction as $X$, causing an increase in the variance of the observations around the regression line.
Under homoscedasticity, the variance of each $u_i$ remains the same irrespective of small or large values of the explanatory variable, i.e., $\sigma_u^2$ is not a function of $X_i$: $\sigma_{u_i}^2 \ne f(X_i)$.
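A small simulation can make the contrast concrete. The sketch below is an illustration under assumed settings: errors are Gaussian, the heteroscedastic case takes the standard deviation of $u_i$ equal to $X_i$, and the helper names `simulate_errors` and `variance_ratio` are hypothetical. It compares the error variance for large values of $X$ with that for small values of $X$.

```python
import random
import statistics


def simulate_errors(xs, heteroscedastic, seed=0):
    """Draw zero-mean Gaussian errors; sd is 1 (constant) or x (grows with X)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, x if heteroscedastic else 1.0) for x in xs]


def variance_ratio(xs, errors):
    """Error variance in the upper half of X divided by that in the lower half."""
    pairs = sorted(zip(xs, errors))              # order observations by X
    half = len(pairs) // 2
    low = [e for _, e in pairs[:half]]
    high = [e for _, e in pairs[half:]]
    return statistics.pvariance(high) / statistics.pvariance(low)


xs = [1 + 9 * i / 1999 for i in range(2000)]     # X evenly spaced on [1, 10]
ratio_homo = variance_ratio(xs, simulate_errors(xs, heteroscedastic=False))
ratio_hetero = variance_ratio(xs, simulate_errors(xs, heteroscedastic=True))
```

With constant variance the ratio stays close to 1, while in the heteroscedastic case it is well above 1 because the spread of the errors grows with $X$.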
Consequences if Homoscedasticity is Not Met
If the assumption of homoscedastic disturbances (constant variance) is not fulfilled, the consequences of heteroscedasticity are as follows:
We cannot apply the usual formulas for the variances of the coefficients to conduct tests of significance and construct confidence intervals. The tests based on $Var(\hat{\beta}_0)=\sigma_u^2 \left\{\frac{\sum X_i^2}{n \sum x_i^2}\right\}$ and $Var(\hat{\beta}_1) = \sigma_u^2 \left\{\frac{1}{\sum x_i^2}\right\}$, where $x_i = X_i - \bar{X}$, are inapplicable.
If $u$ (the error term) is heteroscedastic, the OLS (Ordinary Least Squares) estimates do not have the minimum variance property in the class of unbiased estimators, i.e., they are inefficient in small samples. Furthermore, they are inefficient in large samples (that is, asymptotically inefficient).
The coefficient estimates would still be statistically unbiased even if the $u$’s are heteroscedastic. The $\hat{\beta}$’s will have no statistical bias, i.e., $E(\hat{\beta}_i)=\beta_i$ (the expected values of the coefficient estimates equal the true parameter values).
The prediction would be inefficient because the variance of the prediction includes the variance of $u$ and of the parameter estimates, which are not minimal due to the incidence of heteroscedasticity, i.e., the prediction of $Y$ for a given value of $X$, based on the estimates $\hat{\beta}$’s from the original data, would have a high variance.
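For reference, the classical variance formulas for the simple regression coefficients, which are valid only under homoscedasticity, can be evaluated as in the sketch below. It assumes the textbook forms $Var(\hat{\beta}_0)=\sigma_u^2 \sum X_i^2 / (n \sum x_i^2)$ and $Var(\hat{\beta}_1)=\sigma_u^2 / \sum x_i^2$ with $x_i = X_i - \bar{X}$; the function name and the input values are hypothetical.

```python
def ols_coefficient_variances(X, sigma2):
    """Classical variances of the OLS intercept and slope (constant sigma2 assumed)."""
    n = len(X)
    mean_x = sum(X) / n
    sum_dev_sq = sum((x - mean_x) ** 2 for x in X)   # sum of squared deviations x_i^2
    sum_raw_sq = sum(x ** 2 for x in X)              # sum of raw squares X_i^2
    var_b0 = sigma2 * sum_raw_sq / (n * sum_dev_sq)
    var_b1 = sigma2 / sum_dev_sq
    return var_b0, var_b1


# Made-up inputs: X = 1..5 and an assumed error variance of 2.
var_b0, var_b1 = ols_coefficient_variances([1, 2, 3, 4, 5], sigma2=2.0)
```

Here $\sum X_i^2 = 55$ and $\sum x_i^2 = 10$, so the formulas give $Var(\hat{\beta}_0) = 2.2$ and $Var(\hat{\beta}_1) = 0.2$. When $\sigma_u^2$ is not constant there is no single value to plug in, which is why tests built on these formulas break down.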
Tests for Homoscedasticity
Some tests commonly used for testing the assumption of homoscedasticity are: