Effect Size Definition, Formula

Effect Size Definition

The Effect Size definition: An effect size is a measure of the strength of a phenomenon; it conveys the estimated magnitude of a relationship without making any claim about whether that relationship reflects the true state of nature. Effect size measures play an important role in meta-analysis and statistical power analysis. Reporting effect size in theses, reports, or research papers is therefore good practice, especially when presenting empirical findings, because it conveys the practical importance of a significant result. Put simply, effect size is a way of quantifying the size of the difference between two groups.

Effect size is usually computed after rejecting the null hypothesis in a statistical hypothesis-testing procedure. If the null hypothesis is not rejected (i.e. accepted), the effect size is of little interest.

There are different formulas for different statistical tests to measure the effect size. In general, the effect size can be computed in two ways.

  1. As the standardized difference between two means
  2. As the effect size correlation (the correlation between the independent variable classification and the individual scores on the dependent variable).

The Effect Size for the Dependent Sample T-test

The effect size for the paired sample t-test (dependent sample t-test), known as Cohen's d, ranges from $-\infty$ to $\infty$. It evaluates the degree, measured in standard deviation units, to which the mean of the difference scores deviates from zero. If d equals 0, the mean of the difference scores is equal to zero; the farther d is from 0, the larger the effect.

Effect Size Formula for Dependent Sample T-test

The effect size for the dependent sample t-test can be computed by using

\[d=\frac{\overline{D}-\mu_D}{SD_D}\]

Here $\mu_D$ is the hypothesized population mean difference (usually zero). Note that both the mean difference ($\overline{D}$) and its standard deviation ($SD_D$) are reported in the SPSS output under Paired Differences.

Suppose the effect size is $d = 2.56$; this means that the sample mean difference and the hypothesized population mean difference are 2.56 standard deviations apart. The sign does not affect the size of an effect, i.e. $-2.56$ and $2.56$ are equivalent effect sizes.

The $d$ statistic can also be computed from the obtained $t$ value and the number of paired observations, following Ray and Shadish (1996):

\[d=\frac{t}{\sqrt{N}}\]

The value of $d$ is usually categorized as small, medium, or large. For Cohen's $d$:

  • d = 0.2 to 0.5: small effect
  • d = 0.5 to 0.8: medium effect
  • d = 0.8 and higher: large effect

Calculating Effect Size from $r^2$

Another method of computing the effect size is with r-squared ($r^2$), i.e.

\[r^2=\frac{t^2}{t^2+df}\]

Effect size is categorized into small, medium, and large effects as follows:

  • $r^2=0.01$, small effect
  • $r^2=0.09$, medium effect
  • $r^2=0.25$, large effect.

A non-significant result of the t-test indicates that we failed to reject the hypothesis that the two conditions have equal means in the population. A larger value of $r^2$ indicates a larger effect (effect size), while a large effect size combined with a non-significant result suggests that the study should be replicated with a larger sample size.

In short, a large effect size computed by either method indicates a very large effect, meaning that the means are likely to be very different.

Choosing the Right Effect Size Measure

The appropriate effect size measure depends on the type of analysis being conducted (for example, correlation or group comparison) and the measurement scale of the data (continuous, binary, nominal, ordinal, interval, or ratio). It is always good practice to report both the effect size and statistical significance (the p-value) to provide a more complete picture of the findings.

In conclusion, effect size is a crucial concept in interpreting statistical results. By understanding and reporting effect size, one can gain a deeper understanding of the practical significance of the research findings and contribute to a more comprehensive understanding of the field of study.

References:

  • Ray, J. W., & Shadish, W. R. (1996). How interchangeable are different estimators of effect size? Journal of Consulting and Clinical Psychology, 64, 1316-1325. (see also “Correction to Ray and Shadish (1996)”, Journal of Consulting and Clinical Psychology, 66, 532, 1998)
  • Kelley, Ken; Preacher, Kristopher J. (2012). “On Effect Size”. Psychological Methods 17 (2): 137–152. doi:10.1037/a0028086.

Learn more about Effect Size Definition and Statistical Significance


FAQS about Effect Size Definition

  • Explain what effect size is.
  • Write down the effect size formula for the dependent sample test.
  • Write down the effect size formula for the independent samples test.
  • Explain how the $r^2$ effect size is calculated.
  • Explain how to choose the right effect size measure.
  • What are small, medium, and large effect sizes?
  • What will the effect size be if the null hypothesis is accepted?

Testing of Hypothesis

Introduction

The objective of testing hypotheses (Testing of Statistical Hypothesis) is to determine if an assumption about some characteristic (parameter) of a population is supported by the information obtained from the sample.

Testing of Hypothesis

The terms hypothesis testing and testing of hypothesis are used interchangeably. A statistical hypothesis (as distinct from an ordinary, non-statistical hypothesis) is a statement about a characteristic of one or more populations, such as the population mean. This statement may or may not be true; its validity is checked based on information obtained by sampling from the population.
Testing of hypothesis refers to the formal procedure used by statisticians to accept or reject statistical hypotheses. The procedure includes the following steps:

i) Formulation of the Null and Alternative Hypotheses

Null hypothesis

A hypothesis formulated for the sole purpose of rejecting or nullifying it is called the null hypothesis, usually denoted by $H_0$. There is usually a "not" or a "no" term in the null hypothesis, meaning that there is "no change".

For example, the null hypothesis might be that the mean age of M.Sc. students is 20 years. Statistically, it can be written as $H_0:\mu = 20$. Generally speaking, the null hypothesis is developed for the purpose of testing.
We should emphasize that if the null hypothesis is not rejected based on the sample data, we cannot say that the null hypothesis is true. In other words, failing to reject the null hypothesis does not prove that $H_0$ is true; it means that we have failed to disprove $H_0$.

For the null hypothesis, we usually state that "there is no significant difference between A and B". For example, "the mean tensile strength of copper wire is not significantly different from some standard".

Alternative Hypothesis

Any hypothesis different from the null hypothesis is called an alternative hypothesis, denoted by $H_1$. The alternative hypothesis is the statement that is accepted if the sample data provide sufficient evidence that the null hypothesis is false. It is also referred to as the research hypothesis.

It is important to remember that, no matter how the problem is stated, the null hypothesis will always contain the equal sign, and the equal sign will never appear in the alternative hypothesis. This is because the null hypothesis is the statement being tested, and we need a specific value to include in our calculations. The alternative hypothesis for the example given above is $H_1:\mu \ne 20$.

Simple and Composite Hypothesis

If a statistical hypothesis completely specifies the form of the distribution as well as the values of all parameters, it is called a simple hypothesis. For example, suppose the age distribution of first-year college students follows $N(16, 25)$ and the null hypothesis is $H_0: \mu = 16$; this null hypothesis is a simple hypothesis. If a statistical hypothesis does not completely specify the form of the distribution, it is called a composite hypothesis, for example, $H_1:\mu < 16$ or $H_1:\mu > 16$.

ii) Level of Significance

The level of significance (significance level) is denoted by the Greek letter alpha ($\alpha$). It is also called the level of risk (as it is the risk you take of rejecting the null hypothesis when it is true). The level of significance is defined as the probability of making a type-I error. It is the maximum probability with which we would be willing to risk a type-I error. It is usually specified before any sample is drawn so that the results obtained will not influence our choice.

In practice, 10% (0.10), 5% (0.05), and 1% (0.01) levels of significance are used in testing a given hypothesis. A 5% level of significance means that there are about 5 chances out of 100 that we would reject a true hypothesis, i.e. we are 95% confident that we have made the right decision. A hypothesis that has been rejected at the 0.05 level of significance means that we could be wrong with a probability of 0.05.

Selection of Level of Significance

In testing of hypothesis, the selection of the level of significance depends on the field of study. Traditionally, the 0.05 level is selected for business and science-related problems, 0.01 for quality assurance, and 0.10 for political polling and the social sciences.

Type-I and Type-II Errors

Whenever we accept or reject a statistical hypothesis based on sample data, there is always some chance of making an incorrect decision. Accepting a true null hypothesis or rejecting a false null hypothesis is a correct decision; accepting a false null hypothesis or rejecting a true one is an incorrect decision. These two types of incorrect decisions are called type-I and type-II errors:
Type-I error: rejecting the null hypothesis ($H_0$) when it is true.
Type-II error: accepting the null hypothesis when it is false, i.e. when $H_1$ is true.

iii) Test Statistics

The third step of testing a hypothesis is a procedure that enables us to decide whether to accept or reject the hypothesis, or to determine whether observed samples differ significantly from expected results. Such procedures are called tests of hypotheses, tests of significance, or rules of decision. We can also say that a test statistic is a value, calculated from sample information, used to determine whether to reject the null hypothesis.

The test statistic for the mean $\mu$ when $\sigma$ is known is $Z= \frac{\overline{X}-\mu}{\frac{\sigma}{\sqrt{n}}}$, where the Z-value is based on the sampling distribution of $\overline{X}$, which follows a normal distribution with mean $\mu_{\overline{X}}$ equal to $\mu$ and standard deviation $\sigma_{\overline{X}}$ equal to $\frac{\sigma}{\sqrt{n}}$. Thus, we determine whether the difference between $\overline{X}$ and $\mu$ is statistically significant by finding the number of standard deviations that $\overline{X}$ lies from $\mu$ using the Z statistic. Other test statistics, such as $t$, $F$, and $\chi^2$, are also available.

iv) Critical Region (Formulating Decision Rule)

It must be decided, before the sample is drawn, under what conditions (circumstances) the null hypothesis will be rejected. A dividing line must be drawn separating "probable" from "improbable" sample values, given that the null hypothesis is true. In other words, a decision rule must be formulated, specifying the conditions under which the null hypothesis should or should not be rejected. This dividing line defines the region (area) of rejection: those values whose probability of occurrence under the null hypothesis is so small that they are considered improbable. The set of possible values of the sample statistic that leads to rejecting the null hypothesis is called the critical region.


One-tailed and two-tailed tests of significance

In testing of hypothesis, if the rejection region lies only in the left or the right tail of the curve, the test is called a one-tailed test. This happens when the null hypothesis is tested against an alternative hypothesis of the "greater than" or "less than" type.

If the rejection region lies in both the left and right tails of the curve, the test is called a two-tailed test. This happens when the null hypothesis is tested against an alternative hypothesis of the "not equal to" type.

v) Making a Decision

In this last step of testing a hypothesis, the computed value of the test statistic is compared with the critical value. If the sample statistic falls within the rejection region, the null hypothesis is rejected; otherwise, it is accepted. Note that only one of two decisions is possible in hypothesis testing: either reject or do not reject the null hypothesis. Instead of "accepting" the null hypothesis ($H_0$), some researchers prefer to phrase the decision as "Do not reject $H_0$", "We fail to reject $H_0$", or "The sample results do not allow us to reject $H_0$".


Hypothesis Testing Frequently Asked Questions

  • What is a statistical hypothesis?
  • What is a null hypothesis?
  • What is an alternative hypothesis?
  • How are the null and alternative hypotheses mathematically represented?
  • What is the level of significance (level of risk)?
  • What are type-I and type-II errors?
  • What is the test statistic for one sample?
  • What is the test statistic for two samples?
  • What is the critical region?
  • How is a decision made in hypothesis testing?
  • What are simple and composite hypotheses?
  • What is the calculated test value?

P-value Interpretation and Misinterpretation of the P-value

The P-value is a probability, with a value ranging from zero to one. It is a measure of how much evidence we have against the null hypothesis; it is a way to express the likelihood that $H_0$ is not true. The smaller the p-value, the more evidence we have against $H_0$. Here we will discuss the P-value and its interpretation and misinterpretation.

P-value Definition

The P-value is the largest significance level at which we would accept the null hypothesis. It enables us to test a hypothesis without first specifying a value for $\alpha$. OR

The probability of observing a sample value as extreme as, or more extreme than, the value observed, given that the null hypothesis is true.


P-value Interpretation

In general, the P-value interpretation is: if the P-value is smaller than the chosen significance level, $H_0$ (the null hypothesis) is rejected (even though it may, in fact, be true); if the P-value is larger than the significance level, $H_0$ is not rejected.


If the P-value is less than

  • 0.10, we have some evidence that $H_0$ is not true
  • 0.05, strong evidence that $H_0$ is not true
  • 0.01, very strong evidence that $H_0$ is not true
  • 0.001, extremely strong evidence that $H_0$ is not true

Misinterpretation of a P-value

Many people misunderstand P-values. For example, if the P-value is 0.03 then it means that there is a 3% chance of observing a difference as large as you observed even if the two population means are the same (i.e. the null hypothesis is true). It is tempting to conclude, therefore, that there is a 97% chance that the difference you observed reflects a real difference between populations and a 3% chance that the difference is due to chance. However, this would be an incorrect conclusion. What you can say is that random sampling from identical populations would lead to a difference smaller than you observed in 97% of experiments and larger than you observed in 3% of experiments.

Note that p-values are a valuable tool in hypothesis testing, but they should be used thoughtfully and in conjunction with other analyses.


Read More about P-value Interpretation

Read More on Wikipedia
