Student t-test Comparison Test

In 1908, William Sealy Gosset, writing under the pseudonym “Student”, published work on inference from samples drawn from a normally distributed population whose standard deviation is unknown. He developed the Student t-test and the t-distribution, which can be used to compare two small sets of quantitative data collected independently of one another. In that case, the test is called the independent samples t-test, or unpaired samples t-test.

The Student t-test is one of the most commonly used statistical techniques for testing hypotheses about the difference between sample means. It can be computed from just the means, standard deviations, and number of data points of the two samples, using the following formula:

\[t=\frac{\overline{X}_1-\overline{X}_2 }{\sqrt{s_p^2 (\frac{1}{n_1}+\frac{1}{n_2})}}\]

where $s_p^2$ is the pooled (combined) variance and can be computed as

\[s_p^2=\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}\]
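The two formulas can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `pooled_t` and the summary statistics passed to it are hypothetical, and the pooled variance weights each sample variance by its own degrees of freedom, $n_1-1$ and $n_2-1$.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df, pooled variance) for the equal-variance two-sample t-test."""
    df = n1 + n2 - 2
    # Each sample variance is weighted by its own degrees of freedom (n - 1)
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))  # standard error of the mean difference
    return (mean1 - mean2) / se, df, sp2

# Hypothetical summary statistics, for illustration only
t, df, sp2 = pooled_t(mean1=20.0, sd1=4.0, n1=10, mean2=17.0, sd2=5.0, n2=12)
print(f"t = {t:.3f}, df = {df}, pooled variance = {sp2:.3f}")
```

Only the six summary numbers are needed, so the raw observations never enter the calculation.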

Using this test statistic, we test the null hypothesis $H_0:\mu_1=\mu_2$, that is, that both samples came from populations with the same mean, at a given “level of significance” or “level of risk”.

For a two-sided test, if the absolute value of the computed t-statistic is greater than the critical value (the value from the t-table with $n_1+n_2-2$ degrees of freedom at the given level of significance, say $\alpha=0.05$), the null hypothesis is rejected; otherwise, we fail to reject the null hypothesis.

Note that the t-distribution is a family of bell-shaped curves indexed by the degrees of freedom (the number of independent observations in the sample minus the number of parameters estimated). As the sample size increases, the t-distribution approaches the standard normal distribution.
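The convergence toward the normal curve can be seen numerically. The sketch below evaluates the t density at a single point for increasing degrees of freedom and compares it with the standard normal density. The helper names `t_pdf` and `normal_pdf` and the evaluation point $x=2$ are choices made for this illustration only.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

x = 2.0  # a point in the tail, where the two curves differ most visibly
for df in (2, 5, 12, 30, 100, 1000):
    print(f"df = {df:>4}: t density at {x} = {t_pdf(x, df):.5f}")
print(f"standard normal density at {x} = {normal_pdf(x):.5f}")
```

With few degrees of freedom the t curve has heavier tails than the normal curve, which is why critical values from the t-table are larger than the corresponding normal values for small samples.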

Student t-test Example

A production manager wants to compare the number of defective products produced on the day shift with the number on the afternoon shift. A sample of the production from 6 day shifts and 8 afternoon shifts revealed the following defects. At the 0.05 significance level, is there a significant difference in the mean number of defects per shift?

Day shift: 5, 8, 7, 6, 9, 7
Afternoon shift: 8, 10, 7, 11, 9, 12, 14, 9

Some required calculations for Student t-test are:

The mean of samples:

$\overline{X}_1=7$, $\overline{X}_2=10$,

Standard Deviation of samples

$s_1=1.4142$, $s_2=2.2678$ and $s_p^2=\frac{(6-1) (1.4142)^2+(8-1)(2.2678)^2}{6+8-2}=3.8333$
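These summary values can be checked from the raw counts with Python's standard `statistics` module. The sketch below reads the defect counts as 5, 8, 7, 6, 9, 7 for the day shift and 8, 10, 7, 11, 9, 12, 14, 9 for the afternoon shift. That split of the digits is an assumption, although it does reproduce the means and standard deviations quoted above.

```python
from statistics import mean, stdev

# Defect counts per shift (assumed reading of the data table)
day = [5, 8, 7, 6, 9, 7]
afternoon = [8, 10, 7, 11, 9, 12, 14, 9]

n1, n2 = len(day), len(afternoon)
x1, x2 = mean(day), mean(afternoon)
s1, s2 = stdev(day), stdev(afternoon)  # sample SD, divisor n - 1

# Pooled variance: sample variances weighted by their degrees of freedom
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

print(f"means: {x1}, {x2}")
print(f"standard deviations: {s1:.4f}, {s2:.4f}")
print(f"pooled variance: {sp2:.4f}")
```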

Step 1: Null and alternative hypotheses are: $H_0:\mu_1=\mu_2$ vs $H_1:\mu_1 \ne \mu_2$

Step 2: Level of significance: $\alpha=0.05$

Step 3: Test statistic

$\begin{aligned}
t&=\frac{\overline{X}_1-\overline{X}_2 }{\sqrt{s_p^2 (\frac{1}{n_1}+\frac{1}{n_2})}}\\
&=\frac{7-10}{\sqrt{3.8333(\frac{1}{6}+\frac{1}{8})}}=-2.837
\end{aligned}$

Step 4: Critical value or rejection region: reject $H_0$ if the absolute value of t-calculated in step 3 is greater than or equal to the table value, i.e. $|t_{calculated}|\ge t_{tabulated}$. In this example, t-tabulated is 2.179 (critical values $\pm 2.179$) with 12 degrees of freedom at a significance level of 5%.

Step 5: Conclusion: As the computed value satisfies $|-2.837| = 2.837 > 2.179$, we reject $H_0$ and conclude that the mean number of defects is not the same on the two shifts.
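The same conclusion can be reached through a p-value instead of a table lookup. The sketch below approximates the two-sided p-value by numerically integrating the t density with Simpson's rule, using only the standard library. The helper names `t_pdf` and `two_sided_p` and the number of integration steps are choices made for this illustration. A statistics package would normally be used for this in practice.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule over [0, |t|]; steps must be even."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3       # P(0 < T < |t|)
    return 1.0 - 2.0 * central    # probability in both tails

t_calc, df, alpha = -2.837, 12, 0.05  # values from the worked example
p = two_sided_p(t_calc, df)
print(f"two-sided p-value = {p:.4f}")
print("reject H0" if p < alpha else "fail to reject H0")
```

Comparing the p-value with $\alpha$ and comparing $|t|$ with the critical value are equivalent decision rules for a two-sided test.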


Independent Samples t test in SPSS

Introduction (Independent Samples t test using SPSS)

Independent Samples t test is a test for independent groups and is useful when the same variable has been measured in two independent groups and the researcher wants to know whether the difference between group means is statistically significant. “Independent groups” means that the groups have different people in them and that the people in the different groups have not been matched or paired in any way.

Objectives of Independent Samples t test

The independent t-test compares the means of two unrelated/independent groups measured on an interval or ratio scale. The SPSS t-test procedure tests the hypothesis both when the variances are assumed to be equal and when they are not, and provides the t-value under each assumption. It also provides the relevant descriptive statistics for both groups.
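For the unequal-variance case, the statistic usually reported is Welch's t with Welch-Satterthwaite degrees of freedom. The sketch below assumes that this is the calculation behind the unequal-variance t-value and is not taken from SPSS itself. The function name `welch_t` and the reaction times are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(sample1, sample2):
    """Return (t, df) for the two-sample t-test without assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1  # squared standard error of the first mean
    v2 = variance(sample2) / n2  # squared standard error of the second mean
    t = (mean(sample1) - mean(sample2)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation; generally not a whole number
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Hypothetical reaction times in seconds, for illustration only
group_a = [0.51, 0.48, 0.55, 0.60, 0.47]
group_b = [0.45, 0.50, 0.43, 0.49, 0.46, 0.44]
t, df = welch_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df:.2f}")
```

Unlike the pooled test, the two sample variances are never combined, so a group with a much larger spread does not distort the standard error of the other.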

Assumptions (Independent Samples t test)

  • The observations can be classified into two groups that are independent of each other.
  • The variable is measured on an interval or ratio scale.
  • The measured variable is approximately normally distributed.
  • Both groups have similar variances (homogeneity of variances).

Data Required (Independent Samples t test)

Suppose a researcher wants to discover whether left- and right-handed telephone operators differ in the time it takes them to answer calls. The following reaction-time data were obtained (RTs measured in seconds):

Data Telephone: Independent Samples t test

The mean reaction times suggest that the left-handers were slower, but does a t-test confirm this?

Independent Samples t Test using SPSS

Enter the data set in the SPSS Data View, then perform the following steps to run the Independent Samples t-test in SPSS:

1) Click Analyze > Compare Means > Independent-Samples T Test… on the top menu as shown below.

SPSS Menu option for independent samples t test

2) Select the continuous variable(s) that you want to test from the list.

Dialog box for independent samples t test

3) Click on the arrow to send the variable to the “Test Variable(s)” box. You can also double-click the variable to send it to the “Test Variable(s)” box.

4) Select the categorical/grouping variable so that group comparison can be made and send it to the “Grouping Variable” box.

5) Click on the “Define Groups” button. A small dialog box will appear asking for the name/code used in the Variable View for the groups. We used 1 for males and 2 for females. Click the Continue button when you’re done. Then click OK when you’re ready to get the output. See the pictures for a visual guide.

Define Group for Independent sample t test

Independent Samples t-test SPSS Output

Independent sample t test output

The first table in the output gives descriptive statistics for your variables. The number of observations, mean, standard deviation, and standard error of the mean are available for both groups (male and female).

The second table in the output is the important one for testing the hypothesis. You will see that there are two t-tests, and you have to know which one to use. When comparing groups with approximately similar variances, use the first t-test. Levene’s test checks for this. If the significance for Levene’s test is 0.05 or below, the “Equal Variances Not Assumed” test (the second one) should be used; otherwise, use the “Equal Variances Assumed” test (the first one). Here the significance is 0.287, so we’ll be using the “Equal Variances Assumed” first row in the second table.
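The row-selection rule is simple enough to write as a small function. This is only a sketch of the rule as described here. The function name `row_to_read` is hypothetical, and the returned strings mirror the row labels in the output table.

```python
def row_to_read(levene_sig, alpha=0.05):
    """Pick the output row to interpret, based on Levene's test significance."""
    if levene_sig <= alpha:
        # Variances differ significantly between the groups
        return "Equal variances not assumed"
    return "Equal variances assumed"

print(row_to_read(0.287))  # Levene significance from the example output
```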

In the output table, “t” is the calculated t-value from the test statistic; in this example, the t-value is 1.401.

df stands for degrees of freedom; in this example, we have 18 degrees of freedom.

Sig (two-tailed) is the two-tailed significance value (P-value); in this example, the sig value is 0.178, which is greater than 0.05 (the significance level).

Decision

As the P-value of 0.178 is greater than our 0.05 significance level, we fail to reject the null hypothesis. (two-tailed case)

As the P-value of 0.089 is greater than our 0.05 significance level, we fail to reject the null hypothesis. (one-tailed case with 0.05 significance level)

As the P-value of 0.089 is smaller than our 0.10 significance level, we reject the null hypothesis in favor of the alternative hypothesis (one-tailed case with 0.10 significance level). In this case, it means that left-handed operators have a slower reaction time than right-handed operators on average.
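The one-tailed value of 0.089 is simply half of the two-tailed value of 0.178, which is valid when the observed difference lies in the direction stated by the alternative hypothesis. A minimal sketch of the three decisions follows. The function name `one_tailed_p` and its flag are hypothetical.

```python
def one_tailed_p(two_tailed_p, effect_in_predicted_direction=True):
    """Convert a two-tailed p-value from a t-test into a one-tailed p-value."""
    half = two_tailed_p / 2
    # If the difference points the wrong way, the one-tailed p-value is large
    return half if effect_in_predicted_direction else 1 - half

def decide(p, alpha):
    return "reject H0" if p < alpha else "fail to reject H0"

p_two = 0.178                # Sig (two-tailed) from the output
p_one = one_tailed_p(p_two)  # half of the two-tailed value

print(f"two-tailed, alpha = 0.05: {decide(p_two, 0.05)}")
print(f"one-tailed, alpha = 0.05: {decide(p_one, 0.05)}")
print(f"one-tailed, alpha = 0.10: {decide(p_one, 0.10)}")
```

The direction of the test and the significance level should be chosen before looking at the data, since the same output leads to different decisions under different choices.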
