Chi-Square Distribution ($\chi^2$)

The Chi-Square distribution is a continuous probability distribution that is used in many hypothesis tests. A Chi-Square statistic is never negative, because it is a sum of squares.

A Chi-Square variate with $v$ degrees of freedom (df) is the sum of $v$ independent, squared standard normal variates, $\sum\limits_{i=1}^v z_i^2$. It is denoted by $\chi^2_v$. The variance $s^2$ computed from a sample of normally distributed observations, once scaled by $v/\sigma^2$, follows a $\chi^2$ distribution whose parameter $v$ is referred to as the df of the calculated variance. Symbolically,

$$\frac{v\cdot s^2}{\sigma^2} \sim \chi^2_v$$
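The two statements above can be checked with a small simulation. The following is a minimal sketch using only the Python standard library; the function names, the seed, and the values $\mu=50$, $\sigma=4$ are arbitrary choices made for illustration. It relies on the standard facts that a $\chi^2_v$ variate has mean $v$ and variance $2v$.

```python
import random
import statistics

rng = random.Random(12345)  # arbitrary fixed seed, for reproducibility only

def sum_of_squared_normals(v):
    """One chi-square draw: the sum of v squared standard normal variates."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(v))

def scaled_sample_variance(n, mu, sigma):
    """Compute v * s^2 / sigma^2 with v = n - 1 for one normal sample."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    s2 = statistics.variance(sample)  # sample variance, divisor n - 1
    return (n - 1) * s2 / sigma ** 2

v = 5
direct = [sum_of_squared_normals(v) for _ in range(20000)]
scaled = [scaled_sample_variance(v + 1, 50.0, 4.0) for _ in range(20000)]

mean_direct = statistics.fmean(direct)    # expected to be close to v
var_direct = statistics.variance(direct)  # expected to be close to 2v
mean_scaled = statistics.fmean(scaled)    # expected to be close to v as well

print(round(mean_direct, 2), round(var_direct, 2), round(mean_scaled, 2))
```

Both sets of simulated values should have a mean near 5, which is what one would expect if they follow the same $\chi^2_5$ distribution.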

Chi Square Distribution Table

For the variance $s^2$ of $n$ observations from a $N(\mu, \sigma^2)$ population, the df is $v=n-1$. The Chi-Square distribution is also used for contingency (analysis of frequency) tables, as an approximation to the distribution of more complex statistics. Each member of the Chi-Square family of distributions is specified by its degrees of freedom.

Chi-Square Family of Distributions

Chi-Square Distribution as a Special Case of the Gamma Distribution

The Chi-Square distribution is a particular case of the Gamma distribution (shape $v/2$, scale 2). Its pdf is

$$P_{\chi^2}(x) = [2^{v/2}\Gamma(v/2)]^{-1} x^{(v-2)/2}e^{-x/2}, \quad x\ge 0$$

where $\Gamma(\cdot)$ is the Gamma function.
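As a rough illustration, the density can be coded directly with the argument $x$ raised to the power $(v-2)/2$ and the Gamma function taken from `math.gamma`. This is only a sketch: the helper names `chi_square_pdf` and `midpoint_integral` are made up for this example, and the numerical integration is a crude check that the total area is close to 1.

```python
import math

def chi_square_pdf(x, v):
    """Chi-square density with v degrees of freedom, evaluated at x."""
    if x < 0:
        return 0.0
    if x == 0 and v < 2:
        return math.inf  # the density is unbounded at the origin when v < 2
    norm = 2 ** (v / 2) * math.gamma(v / 2)
    return x ** ((v - 2) / 2) * math.exp(-x / 2) / norm

def midpoint_integral(f, a, b, steps=20000):
    """Crude midpoint-rule integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# Area under the density for v = 4; the tail beyond 60 is negligible.
area = midpoint_integral(lambda x: chi_square_pdf(x, 4), 0.0, 60.0)
print(round(area, 6))
```

For $v=2$ the formula reduces to $\frac{1}{2}e^{-x/2}$, which gives a quick sanity check of the function.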

Normal Approximation to $\chi^2$

Method 1: For large $v$, the distribution of $X\sim \chi^2_v$ can be approximated by a normal distribution that matches its first two moments (mean $v$, variance $2v$), so that $z=\frac{X-v}{\sqrt{2v}}$ is approximately $N(0, 1)$.

Method 2: Fisher approximation (compensates for the skewness of $X$)

$$\sqrt{2X} - \sqrt{2v-1} \sim N(0, 1)$$

Method 3: The approximation by Wilson and Hilferty is quite accurate. Defining $A=\frac{2}{9v}$, we have

$$\frac{\sqrt[3]{(X/v)}-1+A}{\sqrt{A}}\sim N(0, 1)$$
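The three approximations can be compared numerically. The sketch below is illustrative only: the function names are invented here, and the exact reference value uses the closed-form finite sum for the chi-square CDF that holds when $v$ is even, which is not part of the text above. The point $x=18.307$ is the tabulated upper 5% point for 10 df, so the exact CDF there is about 0.95.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF

def exact_cdf_even_df(x, v):
    """Exact P(X <= x) for even v (closed-form finite sum)."""
    terms = sum((x / 2) ** k / math.factorial(k) for k in range(v // 2))
    return 1.0 - math.exp(-x / 2) * terms

def method1_moments(x, v):
    """Normal approximation matching mean v and variance 2v."""
    return PHI((x - v) / math.sqrt(2 * v))

def method2_fisher(x, v):
    """Fisher approximation: sqrt(2X) - sqrt(2v - 1) treated as N(0, 1)."""
    return PHI(math.sqrt(2 * x) - math.sqrt(2 * v - 1))

def method3_wilson_hilferty(x, v):
    """Wilson-Hilferty cube-root approximation with A = 2 / (9v)."""
    a = 2 / (9 * v)
    return PHI(((x / v) ** (1 / 3) - 1 + a) / math.sqrt(a))

x, v = 18.307, 10
exact = exact_cdf_even_df(x, v)
for name, f in [("method 1", method1_moments),
                ("method 2", method2_fisher),
                ("method 3", method3_wilson_hilferty)]:
    approx = f(x, v)
    print(f"{name}: {approx:.4f} (error {approx - exact:+.4f})")
```

At this point the Wilson-Hilferty value should be the closest to the exact CDF and the simple moment-matching value the furthest, in line with the remark that Method 3 is quite accurate.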

Inverting this relation gives the percentage points, where $z_P$ is the corresponding percentage point of the standard normal distribution:

$$\chi^2_{v[P]}=v[z_P\sqrt{A}+1-A]^3$$
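A short sketch of the percentage-point formula follows. It assumes that $P$ is read as a lower-tail (cumulative) probability and uses `statistics.NormalDist().inv_cdf` for $z_P$; the function name `wilson_hilferty_quantile` is a label chosen for this example.

```python
from statistics import NormalDist

def wilson_hilferty_quantile(p, v):
    """Approximate chi-square percentage point for cumulative probability p."""
    a = 2 / (9 * v)                  # A = 2 / (9v)
    z_p = NormalDist().inv_cdf(p)    # standard normal percentage point
    return v * (z_p * a ** 0.5 + 1 - a) ** 3

approx_95 = wilson_hilferty_quantile(0.95, 10)
print(round(approx_95, 2))  # compare with the tabulated value 18.307
```

For 10 df the result differs from the tabulated 18.307 only in the second decimal place.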

Generating Pseudo Random Variates

The following scheme generates random variates from the $\chi^2_v$ distribution with $v>2$ df. It requires generating a series of random variates from the standard uniform $U(0,1)$ distribution.

Let $n=v$ be the degrees of freedom, and define the constants

\begin{align*}
C1 &= 1 + \sqrt{2/e} \approx 1.8577638850\\
C2 &= \sqrt{n/2}\\
C3 &= \frac{3n^2-2}{3n(n-2)}\\
C4 &= \frac{4}{n-2}\\
C5 &= n-2\\
\end{align*}
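The constants can be computed once for a given $n$ before any variates are drawn. The sketch below covers only this setup step, exactly as the constants are defined above; the function name `variate_constants` and the dictionary layout are illustrative choices.

```python
import math

def variate_constants(n):
    """Return the constants C1..C5 for n degrees of freedom (n > 2)."""
    if n <= 2:
        raise ValueError("the constants are defined only for n > 2")
    return {
        "C1": 1 + math.sqrt(2 / math.e),
        "C2": math.sqrt(n / 2),
        "C3": (3 * n ** 2 - 2) / (3 * n * (n - 2)),
        "C4": 4 / (n - 2),
        "C5": n - 2,
    }

consts = variate_constants(6)
for name, value in consts.items():
    print(name, round(value, 10))
```

The guard on $n$ reflects the $v>2$ restriction: both $C_3$ and $C_4$ divide by $n-2$.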


The Z-Score Definition, Formula, Real Life Examples

Z-Score Definition: The Z-score, also referred to as the standardized raw score, is a useful statistic because it not only permits computing the probability (chance or likelihood) of a raw score occurring within a normal distribution but also helps to compare two raw scores from different normal distributions. The Z-score is a dimensionless measure, since it is derived by subtracting the population mean from an individual raw score and then dividing this difference by the population standard deviation. This computational procedure is called standardizing the raw score, and it is often used in the Z-test for hypothesis testing.

Any raw score can be converted to a Z-score by the formula

$$Z\text{-score}=\frac{\text{raw score} - \text{mean}}{\sigma}$$

Z-Score Real Life Examples

Example 1: If the mean = 100 and the standard deviation = 10, what are the Z-scores of the following raw scores?

| Raw Score | Z-Score |
|---|---|
| 90 | $\frac{90-100}{10}=-1$ |
| 110 | $\frac{110-100}{10}=1$ |
| 70 | $\frac{70-100}{10}=-3$ |
| 100 | $\frac{100-100}{10}=0$ |
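The same conversion can be written as a small function. This is a minimal sketch; the name `z_score` and the check on the standard deviation are choices made for this example.

```python
def z_score(raw, mean, sd):
    """Standardize a raw score: (raw - mean) / sd."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (raw - mean) / sd

raw_scores = [90, 110, 70, 100]
z_values = [z_score(x, 100, 10) for x in raw_scores]
print(z_values)  # [-1.0, 1.0, -3.0, 0.0]
```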

Note that: If Z-Score,

  • has a zero value then it means that the raw score is equal to the population mean.
  • has a positive value then it means that the raw score is above the population mean.
  • has a negative value then it means that the raw score is below the population mean.
The Z-score Normal Bell Shaped Curve

Example 2: Suppose you got 80 marks in one exam of a class and 70 marks in another exam of the same class. You want to find out in which exam you performed better. Suppose also that the mean and standard deviation are 90 and 10 for exam 1, and 60 and 5 for exam 2, respectively. Converting both exam marks (raw scores) into standard scores, we get

$Z_1=\frac{80-90}{10} = -1$

The Z-score results ($Z_1=-1$) show that 80 marks are one standard deviation below the class mean.

$Z_2=\frac{70-60}{5}=2$

The Z-score results ($Z_2=2$) show that 70 marks are two standard deviations above the mean.

Comparing $Z_1$ and $Z_2$ shows that you performed better in the second exam than in the first. Another way to interpret the Z-score of $-1$ is that about 34.13% of the students got marks between your score and the class average. Similarly, the Z-score of 2 implies that about 47.72% of the students got marks between the class average and your score.
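The areas attached to these two Z-scores can be checked with `statistics.NormalDist` from the Python standard library. The sketch assumes the marks in both exams are normally distributed; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

z1 = (80 - 90) / 10  # exam 1
z2 = (70 - 60) / 5   # exam 2

# Area between each Z-score and the mean (Z = 0).
area_z1_to_mean = std_normal.cdf(0) - std_normal.cdf(z1)
area_mean_to_z2 = std_normal.cdf(z2) - std_normal.cdf(0)

# Proportion of students scoring below you in each exam.
below_exam1 = std_normal.cdf(z1)
below_exam2 = std_normal.cdf(z2)

print(round(area_z1_to_mean, 4), round(area_mean_to_z2, 4))
print(round(below_exam1, 4), round(below_exam2, 4))
```

The area between $Z=-1$ and the mean is about 0.3413, and the area between the mean and $Z=2$ is about 0.4772. In terms of students scoring below you, that corresponds to roughly 16% in exam 1 and roughly 98% in exam 2.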


Standard Normal Table

A standard normal table, also called the unit normal table or Z-table, is a table of values of Φ, the cumulative distribution function of the standard normal distribution. It is used to find the probability that a statistic is observed below, above, or between values on the standard normal distribution and, by extension, on any normal distribution. Since probability tables cannot be printed for every normal distribution, as there is an infinite variety (family) of normal distributions, it is common practice to convert a normal variable to a standard normal one and then use the standard normal table to find the required probabilities (areas under the normal curve).

The standard normal curve is symmetrical, so the table can be used for values in either direction; for example, the area between 0 and $-0.45$ and the area between 0 and $+0.45$ are both 0.1736.
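The symmetry can be illustrated with a few lines of Python. This is a minimal sketch; `area_from_zero` is a name made up here for the area between 0 and a given Z value.

```python
from statistics import NormalDist

phi = NormalDist().cdf  # cumulative standard normal distribution function

def area_from_zero(z):
    """Area under the standard normal curve between 0 and z."""
    return abs(phi(z) - 0.5)

left = area_from_zero(-0.45)
right = area_from_zero(0.45)
print(round(left, 4), round(right, 4))
```

Both values round to 0.1736, so a table printed for positive Z values only is sufficient.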

The Standard Normal distribution is used in various hypothesis testing procedures such as tests on single means, the difference between two means, and tests on proportions. The Standard Normal distribution has a mean of 0 and a standard deviation of 1.

The values inside the given table represent the areas under the standard normal curve for values between 0 and the relative z-score.

The table value for $Z$ is 1 minus the value of the cumulative normal distribution function, that is, $P(Z>z) = 1-\Phi(z)$.

Standard Normal Table (Area Under the Normal Curve)


For example, the value for 1.96 is $P(Z>1.96) = 0.0250$.
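An upper-tail table entry of this kind can be reproduced as follows. The sketch uses `statistics.NormalDist`; the function name `upper_tail` is an illustrative choice.

```python
from statistics import NormalDist

def upper_tail(z):
    """P(Z > z) for a standard normal variable: 1 minus the cumulative value."""
    return 1 - NormalDist().cdf(z)

p_196 = upper_tail(1.96)
print(round(p_196, 4))
```

Because of the symmetry of the curve, the same value is also the lower-tail probability $P(Z<-1.96)$.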
