Using Mathematica Built-in Functions (2014)

Introduction to Mathematica Built-in Functions

There are several thousand Mathematica built-in functions. Knowing a few dozen of the more important ones is enough to do lots of neat calculations. Memorizing the names of the most common functions is not too hard, as almost all of the built-in functions in Mathematica follow a naming convention (the name of a function describes its purpose); for example, the Abs function gives the absolute value, Cos the cosine, and Sqrt the square root of a number.

More important than memorizing function names is remembering the syntax needed to call built-in functions. Knowing many of the built-in Mathematica functions will not only make it easier to follow programs but will also enhance your programming skills.

Important and Widely Used Mathematica Built-in Functions

The following is a short list of widely used Mathematica built-in functions, with usage examples after the list.

  • Sqrt[ ]: used to find the square root of a number
  • N[ ]: used for the numerical evaluation of any mathematical expression, e.g. N[Sqrt[27]]
  • Log[ ]: used to find the natural (base $e$) logarithm of a number; use Log[10, x] or Log10[x] for the base-10 logarithm
  • Sin[ ]: used to find the trigonometric sine of a number
  • Abs[ ]: used to find the absolute value of a number
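A minimal sketch of these functions at the Mathematica input prompt (the values in the comments are the returned outputs):

N[Sqrt[27]]      (* 5.19615, the numerical value of the square root *)
Log[E^2]         (* 2, since Log gives the natural logarithm *)
Log[10, 1000]    (* 3, the base-10 logarithm *)
Sin[Pi/2]        (* 1 *)
Abs[-12]         (* 12 *)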

Common Mathematica built-in functions include:

  1. Trigonometric functions and their inverses
  2. Hyperbolic functions and their inverses
  3. Logarithmic and exponential functions

Every built-in function in Mathematica has two very important features:

  • All Mathematica built-in function names begin with a capital letter; for example, Sqrt is used for the square root and ArcCos for the inverse cosine.
  • Square brackets are always used to surround the input (argument) of a function.

To compute the absolute value of -12, write Abs[-12] at the input prompt rather than, for example, Abs(-12) or Abs{-12}; that is, Abs[-12] is the valid command for computing the absolute value of -12.


Note that:

In Mathematica, single square brackets [ and ] surround the input (arguments) of a function, double square brackets [[ and ]] extract parts of a list, parentheses ( and ) group terms in an algebraic expression, and curly brackets { and } delimit lists. In short, the delimiters [ ], ( ), and { } are used for functions, grouping, and lists, respectively.
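A short sketch illustrating the four kinds of delimiters:

mylist = {2, 4, 8}       (* curly brackets delimit a list *)
mylist[[2]]              (* 4; double square brackets extract the second element of the list *)
Sqrt[mylist[[3]] + 1]    (* 3; single square brackets enclose the argument of a function *)
(2 + 4)*8                (* 48; parentheses group terms in an algebraic expression *)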


Time Series Analysis and Forecasting (2013)

Time Series Analysis

Time series analysis is the analysis of a series of data points recorded over time, and it allows one to answer questions such as: what is the causal effect on a variable $Y$ of a change in variable $X$ over time? An important difference between time series and cross-sectional data is that in time series the ordering of the observations matters.

A time series $\{Y_t\}$ or $\{y_1,y_2,\cdots,y_T\}$ is a discrete-time, continuous-state process, where the times $t=1,2,\cdots,T$ are discrete time points spaced at uniform intervals.

Usually, time is taken at more or less equally spaced intervals such as an hour, day, month, quarter, or year. More specifically, a time series is a set of data in which observations are arranged in chronological order (a set of repeated observations of the same variable).

Use of Time Series

Time series are used in different fields of science such as statistics, signal processing, pattern recognition, econometrics, mathematical finance, weather forecasting, earthquake prediction, electroencephalography, control engineering, astronomy, and communications engineering among many other fields.

Definition: A sequence of random variables indexed by time is called a stochastic process (stochastic means random) or, for mere mortals, a time series. A data set is one possible outcome (realization) of the stochastic process; if history had been different, we would observe a different outcome, so we can think of a time series as the outcome of a random variable.

Rather than dealing with individuals as units, the unit of interest is time: the value of $Y$ at time $t$ is $Y_t$. The unit of time can be anything from days to election years. The value of $Y_t$ in the previous period is called the first lag value: $Y_{t-1}$. The $j$th lag is denoted $Y_{t-j}$. Similarly, $Y_{t+1}$ is the value of $Y_t$ in the next period. So a simple bivariate regression equation for time series data looks like: \[Y_t = \beta_0 + \beta_1 X_t + u_t\]
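As an illustration (a minimal sketch, not from the original post; the parameter values $\beta_0 = 2$ and $\beta_1 = 0.5$ are made up), such a regression can be simulated and fitted in Mathematica with LinearModelFit:

SeedRandom[2013];
T = 100;
x = Accumulate[RandomVariate[NormalDistribution[0, 1], T]];  (* a simulated regressor X_t *)
u = RandomVariate[NormalDistribution[0, 1], T];              (* the error term u_t *)
y = 2 + 0.5 x + u;                                           (* Y_t = beta0 + beta1 X_t + u_t *)
lm = LinearModelFit[Transpose[{x, y}], s, s];
lm["BestFitParameters"]                                      (* estimates close to {2, 0.5} *)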

Continuous Time Series

A time series is said to be continuous when observations are made continuously in time. The term continuous is used for series of this type even when the measured variable can only take a discrete set of values.

Discrete Time Series

A time series is said to be discrete when observations are taken at specific times, usually equally spaced. The term discrete is used for series of this type even when the measured variable is a continuous variable.

Most macroeconomic and financial data come in the form of time series; GNP and stock returns are examples of time series data.

We can write a series as $\{x_1,x_2,x_3,\cdots,x_T\}$ or $\{x_t\}$, where $t=1,2,3,\cdots,T$. $x_t$ is treated as a random variable.

Time series analysis refers to the branch of statistics where observations are collected sequentially in time, usually but not necessarily at equally spaced time points. The main notational difference between time series and other variables is the use of time subscripts.

Time series analysis comprises methods for analyzing time series data to extract useful (meaningful) statistics and other characteristics of the data, while time series forecasting is the use of a model to predict future values based on previously observed values.

Given an observed time series, the first step in analyzing it is to plot the series on a graph, taking the time points ($t$) along the X-axis (as the independent variable) and the observed values ($Y_t$) on the Y-axis (as the dependent variable). Such a graph will show various types of fluctuations and other points of interest.
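For instance, continuing the simulated series y from the sketch above, a minimal plotting command is:

ListLinePlot[y, AxesLabel -> {"t", "Y_t"}]   (* time t on the X-axis, Y_t on the Y-axis *)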


Note

  • $Y_t$ is treated as a random variable. If $Y_t$ is generated by some model (a regression model for time series, i.e. $Y_t=x_t\beta +\varepsilon_t$ with $E(\varepsilon_t|x_t)=0$), then ordinary least squares (OLS) provides a consistent estimate of $\beta$.
  • The term time series is used interchangeably for the sample $\{x_t\}$ and the probability model that generates it. A possible probability model for the joint distribution of a time series $\{x_t\}$ is $x_t=\varepsilon_t$, $\varepsilon_t\sim iid\; N(0,\sigma_\varepsilon^2)$.
  • Time series are typically not iid (independent and identically distributed); e.g., if GNP today is unusually high, GNP tomorrow is also likely to be unusually high, as the sketch below illustrates.
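A minimal sketch of the last point, assuming an AR(1)-type series with coefficient 0.8 as a stand-in for a persistent series such as GNP:

SeedRandom[1];
eps = RandomVariate[NormalDistribution[0, 1], 500];
x = FoldList[0.8 #1 + #2 &, 0., Rest[eps]];   (* x_t = 0.8 x_{t-1} + eps_t *)
Correlation[Most[x], Rest[x]]                 (* lag-1 autocorrelation near 0.8, so clearly not iid *)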


Quartiles in Statistics: Relative Measure of Observation

Quartiles in Statistics

Like percentiles and deciles, quartiles are a type of quantile, which is a measure of the relative standing of an observation within the data set. The quartiles are three points that divide the data into four equal parts, each group comprising a quarter of the data: the first quartile $Q_1$, the second quartile $Q_2$ (also the median), and the third quartile $Q_3$ in the order statistics.

The first quartile (also known as the lower quartile) $Q_1$ is the value in the order statistic that exceeds 1/4 of the observations and is less than the remaining 3/4 of the observations. The third quartile (known as the upper quartile) $Q_3$ is the value in the order statistic that exceeds 3/4 of the observations and is less than the remaining 1/4, while the second quartile $Q_2$ is the median.

Quartiles in Statistics for Ungrouped Data

For ungrouped data, the quartiles are calculated by splitting the order statistic at the median and then calculating the median of each half. If $n$ is odd, the median is included in both halves.

Example: Find $Q_1$, $Q_2$, and $Q_3$ for the following ungrouped data set: 88.03, 94.50, 94.90, 95.05, 84.60.

Solution: We split the order statistic at the median and calculate the median of each half. Since $n$ is odd, we include the median in both halves. The order statistic is 84.60, 88.03, 94.50, 94.90, 95.05.


\begin{align*}
Q_2&=\text{median}=Y_{\left(\frac{n+1}{2}\right)}=Y_{(3)}\\
&=94.50 \quad (\text{the third observation})\\
Q_1&=\text{median of the first three values}=Y_{\left(\frac{3+1}{2}\right)}\\
&=Y_{(2)}=88.03 \quad (\text{the second observation})\\
Q_3&=\text{median of the last three values}=Y_{\left(\frac{3+5}{2}\right)}\\
&=Y_{(4)}=94.90 \quad (\text{the fourth observation})
\end{align*}
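A minimal Mathematica check of this split-at-median rule (the built-in Quartiles function may use a different interpolation convention, so the halves are computed explicitly here):

data = Sort[{88.03, 94.50, 94.90, 95.05, 84.60}];
q2 = Median[data]            (* 94.5 *)
q1 = Median[Take[data, 3]]   (* 88.03; the median is included in the lower half since n is odd *)
q3 = Median[Take[data, -3]]  (* 94.9; the median is included in the upper half *)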

Quartiles in Statistics for Grouped Data

For the grouped data (in ascending order) the quartiles are calculated as:
\begin{align*}
Q_1&=l+\frac{h}{f}(\frac{n}{4}-c)\\
Q_2&=l+\frac{h}{f}(\frac{2n}{4}-c)\\
Q_3&=l+\frac{h}{f}(\frac{3n}{4}-c)
\end{align*}
where
$l$ is the lower class boundary of the class containing $Q_1$, $Q_2$, or $Q_3$,
$h$ is the width of the class containing $Q_1$, $Q_2$, or $Q_3$,
$f$ is the frequency of the class containing $Q_1$, $Q_2$, or $Q_3$, and
$c$ is the cumulative frequency of the class immediately preceding the class containing $Q_1$, $Q_2$, or $Q_3$. The quantities $\frac{n}{4}$, $\frac{2n}{4}$, and $\frac{3n}{4}$ are used to locate the $Q_1$, $Q_2$, or $Q_3$ group.
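A minimal sketch of this formula as a Mathematica function (groupedQuantile is a made-up name for illustration):

(* l: lower class boundary, h: class width, f: class frequency,
   c: cumulative frequency of the preceding class, target: n/4, 2n/4, or 3n/4 *)
groupedQuantile[l_, h_, f_, c_, target_] := l + (h/f) (target - c)
groupedQuantile[90.5, 5, 4, 6, 30/4]   (* 92.375, the value of Q1 in the example below *)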


Quartiles in Statistics Example: Find the quartiles for the following grouped data (the class boundaries, frequencies, and cumulative frequencies used in the solution come from the frequency table, with $n = 30$).

Solution: To locate the class containing $Q_1$, find the $\frac{n}{4}$th observation, which here is the $\frac{30}{4} = 7.5$th observation. Note that the 7.5th observation falls in the class 90.5–95.5 (the $Q_1$ group).
\begin{align*}
Q_1&=l+\frac{h}{f}\left(\frac{n}{4}-c\right)\\
&=90.5+\frac{5}{4}(7.5-6)=92.375
\end{align*}

For $Q_2$, the $\frac{2n}{4}$th observation $= \frac{2 \times 30}{4}$th observation $=$ 15th observation falls in the class 95.5–100.5.
\begin{align*}
Q_2&=l+\frac{h}{f}(\frac{2n}{4}-c)\\
&=95.5+\frac{5}{10}(15-10)=98
\end{align*}

For $Q_3$, the $\frac{3n}{4}$th observation $= \frac{3\times 30}{4}$th $=$ 22.5th observation, which falls in the class 100.5–105.5. So
\begin{align*}
Q_3&=l+\frac{h}{f}(\frac{3n}{4}-c)\\
&=100.5+\frac{5}{6}(22.5-20)=102.5833
\end{align*}

Application of Quartiles

By analyzing quartiles, one can gain insight into the:

  • Spread of the data: The distance between $Q_1$ and $Q_3$ (called the interquartile range, or IQR) indicates how spread out the data are. A relatively large IQR indicates a wider distribution, while a small IQR shows that the data are more concentrated around the median ($Q_2$); see the sketch after this list.
  • Presence of outliers: If data points lie extremely far from the quartiles, they might be outliers that could skew measures such as the mean.
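A minimal sketch computing the IQR and the common 1.5 × IQR outlier fences (Tukey's rule) for the ungrouped example above:

{q1, q3} = {88.03, 94.90};      (* quartiles of the ungrouped example *)
iqr = q3 - q1;                  (* 6.87, the interquartile range *)
{q1 - 1.5 iqr, q3 + 1.5 iqr}    (* {77.725, 105.205}; points outside these fences are potential outliers *)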


Non Central Chi Square Distribution (2013)

The Non Central Chi Square Distribution is a generalization of the Chi-Square Distribution.
If $y_1, y_2, \cdots, y_n \sim N(0,1)$, i.e. each $y_i \sim N(0,1)$, then $y_i^2 \sim \chi_{(1)}^2$ and $\sum y_i^2 \sim \chi_{(n)}^2$.

If the mean ($\mu_i$) is non-zero, then $y_i \sim N(\mu_i, 1)$, i.e. each $y_i$ has a different mean, and
\begin{align*}
\Rightarrow & \qquad y_i^2 \sim \chi^2_{\left(1,\frac{\mu_i^2}{2}\right)} \\
\Rightarrow & \qquad \sum y_i^2 \sim \chi^2_{\left(n,\frac{\sum \mu_i^2}{2}\right)} =\chi^2_{(n,\lambda)}
\end{align*}

Note that if $\lambda = 0$ then we have the central $\chi^2$ distribution. If $\lambda \ne 0$ then it is a noncentral chi-square distribution, so called because the underlying normal variables are not centered at zero (the distribution is not built from standard normal variables).

Central chi-square distribution: $f(x)=\frac{1}{2^{\frac{n}{2}} \Gamma\left(\frac{n}{2}\right)}\, x^{\frac{n}{2}-1}\, e^{-\frac{x}{2}}; \qquad 0<x<\infty$
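As a quick numeric sanity check (a sketch using the built-in ChiSquareDistribution), this formula matches Mathematica's pdf:

n = 4;
f[x_] := x^(n/2 - 1) Exp[-x/2]/(2^(n/2) Gamma[n/2]);
{f[3.], PDF[ChiSquareDistribution[n], 3.]}   (* both give 0.167348 *)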

Theorem:

If $y_1, y_2, \cdots, y_n$ are independent normal random variables with $E(y_i)=\mu_i$ and $V(y_i)=1$, then $w=\sum y_i^2$ is distributed as noncentral chi-square with $n$ degrees of freedom and noncentrality parameter $\lambda=\frac{\sum \mu_i^2}{2}$, and has pdf

\begin{align*}
f(w)=e^{-\lambda}\sum_{i=0}^{\infty}\left[\frac{\lambda^i\, w^{\frac{n+2i}{2}-1}\, e^{-\frac{w}{2}}}{i!\; 2^{\frac{n+2i}{2}}\, \Gamma\left(\frac{n+2i}{2}\right)}\right] \qquad 0\le w<\infty
\end{align*}
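A numeric check of this pdf (a sketch; note that Mathematica's NoncentralChiSquareDistribution[n, λ'] parameterizes the noncentrality as $\lambda' = \sum \mu_i^2 = 2\lambda$ in the notation used here):

n = 4; lambda = 3;
pdf[w_] := Exp[-lambda] Sum[lambda^i w^((n + 2 i)/2 - 1) Exp[-w/2]/
    (i! 2^((n + 2 i)/2) Gamma[(n + 2 i)/2]), {i, 0, 100}];
{pdf[5.], PDF[NoncentralChiSquareDistribution[n, 2 lambda], 5.]}   (* the two values agree *)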

Proof: Non Central Chi Square Distribution

Consider the moment generating function of $w=\sum y_{i}^{2}  $

\begin{align*}
M_w(t)=E(e^{wt})=E\left(e^{t\sum y_i^2}\right); \qquad \text{where } y_i \sim N(\mu_i, 1)
\end{align*}

By definition (writing $f(y_1,\cdots,y_n)$ for the joint normal density),
\begin{align*}
M_w(t) &= \int \cdots \int e^{t\sum y_i^2}\, f(y_1,\cdots,y_n)\, dy_1\, dy_2 \cdots dy_n \\
&= K_1 \int \cdots \int e^{-\frac{1}{2}(1-2t)\left[\sum y_i^2 -\frac{2\sum y_i \mu_i}{1-2t}\right]}\, dy_1\, dy_2 \cdots dy_n, \qquad \text{where } K_1=\left(\frac{1}{\sqrt{2\pi}}\right)^n e^{-\frac{\sum \mu_i^2}{2}}\\
&\text{By completing the square,}\\
&= K_1 \int \cdots \int e^{-\frac{1}{2}(1-2t)\sum\left[\left(y_i -\frac{\mu_i}{1-2t}\right)^2 -\frac{\mu_i^2}{(1-2t)^2}\right]}\, dy_1\, dy_2 \cdots dy_n\\
&= e^{-\frac{\sum \mu_i^2}{2}\left(1-\frac{1}{1-2t}\right)}\, \frac{1}{\left(\sqrt{1-2t}\right)^n} \int \cdots \int \left(\frac{1}{\sqrt{2\pi}}\right)^n \frac{1}{\left(\sqrt{\frac{1}{1-2t}}\right)^n}\, e^{-\frac{1}{2}(1-2t)\sum\left(y_i -\frac{\mu_i}{1-2t}\right)^2}\, dy_1\, dy_2 \cdots dy_n
\end{align*}

where
\[\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \left(\frac{1}{\sqrt{2\pi}}\right)^n \frac{1}{\left(\sqrt{\frac{1}{1-2t}}\right)^n}\, e^{-\frac{1}{2}(1-2t)\sum\left(y_i -\frac{\mu_i}{1-2t}\right)^2}\, dy_1\, dy_2 \cdots dy_n = 1,\]
since the integrand is the joint density of independent $N\left(\frac{\mu_i}{1-2t},\frac{1}{1-2t}\right)$ variables, i.e. the integral of a complete density.

\begin{align*}
M_w(t)&=e^{-\frac{\sum \mu_i^2}{2}\left(1-\frac{1}{1-2t}\right)} \left(\frac{1}{\sqrt{1-2t}}\right)^n \\
&=\left(\frac{1}{\sqrt{1-2t}}\right)^n e^{-\lambda\left(1-\frac{1}{1-2t}\right)} \\
&=e^{-\lambda}\, e^{\frac{\lambda}{1-2t}}\, \frac{1}{(1-2t)^{\frac{n}{2}}}\\
&\text{Expanding } e^{\frac{\lambda}{1-2t}} \text{ in its Taylor series about zero,}\\
&=e^{-\lambda} \sum_{i=0}^{\infty}\frac{\lambda^i}{i!\,(1-2t)^i (1-2t)^{n/2}}\\
M_{w=\sum y_i^2}(t)&=e^{-\lambda} \sum_{i=0}^{\infty}\frac{\lambda^i}{i!\,(1-2t)^{\frac{n+2i}{2}}}\tag{A}
\end{align*}
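A quick numeric check (a sketch) that the closed form and the series (A) agree for $t < \frac{1}{2}$:

n = 4; lambda = 3; t = 0.2;
closed = (1 - 2 t)^(-n/2) Exp[-lambda (1 - 1/(1 - 2 t))];
series = Exp[-lambda] Sum[lambda^i/(i! (1 - 2 t)^((n + 2 i)/2)), {i, 0, 100}];
{closed, series}   (* both give the same value, about 20.525 *)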

Now the moment generating function (MGF) of the noncentral chi-square distribution, computed directly from the claimed density, is
\begin{align*}
M_{\omega}(t) & = E(e^{\omega t})\\
&=\int_0^{\infty} e^{\omega t}\, e^{-\lambda} \sum_{i=0}^{\infty}\frac{\lambda^i\, \omega^{\frac{n+2i}{2}-1}\, e^{-\frac{\omega}{2}}}{i!\; 2^{\frac{n+2i}{2}}\, \Gamma\left(\frac{n+2i}{2}\right)}\, d\omega\\
&=e^{-\lambda} \sum_{i=0}^{\infty}\frac{\lambda^i}{i!\; 2^{\frac{n+2i}{2}}\, \Gamma\left(\frac{n+2i}{2}\right)} \int_0^{\infty} e^{-\frac{\omega}{2}(1-2t)}\, \omega^{\frac{n+2i}{2}-1}\, d\omega
\end{align*}
Let
\begin{align*}
\frac{\omega }{2} (1-2t)&=P\\
\Rightarrow \omega & =\frac{2P}{1-2t} \\
\Rightarrow d\omega &=\frac{2\,dP}{1-2t}
\end{align*}

\begin{align*}
&=e^{-\lambda} \sum\limits_{i=0}^{\infty}\frac{\lambda^i}{i!\; 2^{\frac{n+2i}{2}}\, \Gamma\left(\frac{n+2i}{2}\right)} \int_0^{\infty} e^{-P} \left(\frac{2P}{1-2t}\right)^{\frac{n+2i}{2}-1} \frac{2\,dP}{1-2t} \\
&=e^{-\lambda} \sum_{i=0}^{\infty}\frac{\lambda^i\, 2^{\frac{n+2i}{2}}}{i!\; 2^{\frac{n+2i}{2}}\, \Gamma\left(\frac{n+2i}{2}\right) (1-2t)^{\frac{n+2i}{2}}} \int_0^{\infty} e^{-P}\, P^{\frac{n+2i}{2}-1}\, dP \\
&=e^{-\lambda} \sum_{i=0}^{\infty}\frac{\lambda^i}{i!\; \Gamma\left(\frac{n+2i}{2}\right) (1-2t)^{\frac{n+2i}{2}}}\, \Gamma\left(\frac{n+2i}{2}\right)
\end{align*}

as \[\int\limits_0^{\infty} e^{-P}\, P^{\frac{n+2i}{2}-1}\, dP=\Gamma\left(\frac{n+2i}{2}\right)\]

\[M_{\omega } (t)=e^{-\lambda } \sum _{i=0}^{\infty }\frac{\lambda ^{i} }{i!(1-2t)^{\frac{n+2i}{2} } }  \tag{B}\]

Comparing ($A$) and ($B$)
\[M_{w=\sum y_{i}^{2} } (t)=M_{\omega } (t)\]


By the uniqueness theorem of moment generating functions,

\[f_{w}(w)=f_{\omega}(\omega)\]
\begin{align*}
\Rightarrow \qquad f_{w}(w)&=e^{-\lambda} \sum_{i=0}^{\infty}\frac{\lambda^i\, w^{\frac{n+2i}{2}-1}\, e^{-\frac{w}{2}}}{i!\; 2^{\frac{n+2i}{2}}\, \Gamma\left(\frac{n+2i}{2}\right)}; \qquad 0\le w<\infty
\end{align*}
is the pdf of the noncentral chi-square distribution with $n$ degrees of freedom and noncentrality parameter $\lambda=\frac{\sum \mu_i^2}{2}$. The noncentral chi-square distribution is also additive with the central chi-square distribution: the sum of a noncentral $\chi^2_{(n_1,\lambda)}$ variable and an independent central $\chi^2_{(n_2)}$ variable is $\chi^2_{(n_1+n_2,\lambda)}$.

Application of Non Central Chi Square Distribution

  • Power analysis: the noncentral chi-square distribution is used in calculating the power of chi-squared tests; see the sketch after this list.
  • Non-normal data: when the underlying data are not normally distributed, the noncentral chi-square distribution can be used in certain tests that rely on chi-squared approximations.
  • Signal processing: in areas such as radar systems, the noncentral chi-square distribution arises when modeling signals in background noise.
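A minimal power-analysis sketch under assumed values (df = 4, noncentrality parameter 6 in Mathematica's parameterization, significance level 0.05):

crit = InverseCDF[ChiSquareDistribution[4], 0.95];              (* critical value of the alpha = 0.05 test *)
power = 1 - CDF[NoncentralChiSquareDistribution[4, 6], crit]    (* probability of rejecting under this alternative *)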