# Basic Statistics and Data Analysis

### Category: Testing of Hypothesis


# Specifying the Null and Alternative Hypothesis

## 1) The t-test for independent samples, 2) One-way analysis of variance, 3) The t-test for correlation coefficients, 4) The t-test for a regression coefficient.


In each of these, the null hypothesis says there is no relationship and the alternative hypothesis says that there is a relationship.

1. In this case the null hypothesis says that the two population means (i.e., $\mu_1$ and  $\mu_2$) are equal; the alternative hypothesis says that they are not equal.
2. In this case the null hypothesis says that all of the population means are equal; the alternative hypothesis says that at least two of the means are not equal.
3. In this case the null hypothesis says that the population correlation (i.e., $\rho$) is zero; the alternative hypothesis says that it is not equal to zero.
4. In this case the null hypothesis says that the population regression coefficient ($\beta$) is zero, and the alternative says that it is not equal to zero.
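
Written symbolically, the four pairs of hypotheses above can be summarized as follows. The subscript $k$ for the number of groups in the one-way ANOVA and the labels $H_0$ and $H_1$ are notation assumed here for illustration; they are not used elsewhere in this post.

$$
\begin{aligned}
\text{1. } & H_0: \mu_1 = \mu_2 && H_1: \mu_1 \neq \mu_2 \\
\text{2. } & H_0: \mu_1 = \mu_2 = \cdots = \mu_k && H_1: \mu_i \neq \mu_j \text{ for at least one pair } i \neq j \\
\text{3. } & H_0: \rho = 0 && H_1: \rho \neq 0 \\
\text{4. } & H_0: \beta = 0 && H_1: \beta \neq 0
\end{aligned}
$$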

# Type I and Type II Errors

In hypothesis testing there are two possible errors we can make: Type I and Type II errors.

• A Type I error occurs when you reject a true null hypothesis (remember that when the null hypothesis is true you hope to retain it).
α = P(Type I error) = P(rejecting the null hypothesis when it is true)
A Type I error is usually regarded as more serious than a Type II error and is therefore more important to avoid.
• A Type II error occurs when you fail to reject a false null hypothesis (remember that when the null hypothesis is false you hope to reject it).
β = P(Type II error) = P(failing to reject the null hypothesis when the alternative hypothesis is true)
• The best way to allow yourself to set a low alpha level (i.e., to have a small chance of making a Type I error) and to have a good chance of rejecting the null when it is false (i.e., to have a small chance of making a Type II error) is to increase the sample size.
• The key in hypothesis testing is to use a large sample in your research study rather than a small sample!
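
To illustrate the point about sample size, here is a minimal sketch that computes the power (1 − β) of a two-sided one-sample z-test for a fixed α. The z-test with a known standard deviation, the function name `z_test_power`, and the numbers used (a true mean shift of 5 with a standard deviation of 20) are assumptions chosen only for illustration; they do not come from the example later in this post.

```python
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test with known sigma.

    effect: true difference between the population mean and the null value
    sigma:  known population standard deviation
    n:      sample size
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = effect * n ** 0.5 / sigma            # mean of the z statistic under H1
    # Probability that the z statistic lands in either rejection region
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


# Hypothetical illustration: alpha stays at 0.05 while n grows
for n in (10, 40, 160):
    power = z_test_power(effect=5, sigma=20, n=n)
    print(f"n={n:4d}  alpha=0.05  power={power:.3f}  beta={1 - power:.3f}")
```

With α held fixed, the printed β shrinks as n grows, which is the sense in which a larger sample lets you keep a low α and still have a good chance of rejecting a false null hypothesis.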

If you do reject your null hypothesis, then it is also essential that you determine whether the size of the relationship is practically significant.
The hypothesis test procedure is therefore adjusted so that there is a guaranteed “low” probability of rejecting the null hypothesis wrongly; this probability is never zero.

# Type I Error

Setting the significance level at .05 has become part of the statistical hypothesis testing culture.

• It is a longstanding convention.
• It reflects a concern over making type I errors (i.e., wanting to avoid the situation where you reject the null when it is true, that is, wanting to avoid “false positive” errors).
• If you set the significance level at .05, then you will only reject a true null hypothesis 5% of the time (i.e., you will only make a type I error 5% of the time) in the long run.
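
The "long run" interpretation can be checked with a small simulation. The sketch below repeatedly draws samples from a population in which the null hypothesis is true and counts how often a two-sided z-test rejects it. The function name `simulate_type_i_rate`, the standard normal population, the sample size of 30, and the seed are all assumptions made for this illustration.

```python
import random
from statistics import NormalDist


def simulate_type_i_rate(n_trials=10_000, n=30, alpha=0.05, seed=1):
    """Fraction of simulated studies that reject a TRUE null hypothesis (mean = 0)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_trials):
        # The null hypothesis is true: data come from N(0, 1)
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * n ** 0.5   # z statistic with known sigma = 1
        if abs(z) > z_crit:
            rejections += 1                # a Type I error
    return rejections / n_trials


rate = simulate_type_i_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to the chosen significance level, and it moves accordingly if `alpha` is changed.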

## Introduction

A t-test for independent groups is useful when the same variable has been measured in two independent groups and the researcher wants to know whether the difference between group means is statistically significant. “Independent groups” means that the groups have different people in them and that the people in the different groups have not been matched or paired in any way.

## Objectives

The independent t-test compares the means of two unrelated/independent groups measured on an interval or ratio scale. The SPSS t-test procedure allows the hypothesis to be tested when the variances are assumed to be equal and when they are not, and it provides the t-value under both assumptions. The procedure also provides the relevant descriptive statistics for both groups.

## Assumptions

• The observations can be classified into two groups that are independent of each other.
• The variable is measured on an interval or ratio scale.
• The measured variable is approximately normally distributed.
• Both groups have similar variances (homogeneity of variances).

## Data

Suppose a researcher wants to discover whether left- and right-handed telephone operators differ in the time it takes them to answer calls. The following reaction time data were obtained (RTs measured in seconds):

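| Subject no. | RTs (Left) | Subject no. | RTs (Right) |
|---|---|---|---|
| 1 | 500 | 11 | 392 |
| 2 | 513 | 12 | 445 |
| 3 | 300 | 13 | 271 |
| 4 | 561 | 14 | 523 |
| 5 | 483 | 15 | 421 |
| 6 | 502 | 16 | 489 |
| 7 | 539 | 17 | 501 |
| 8 | 467 | 18 | 388 |
| 9 | 420 | 19 | 411 |
| 10 | 480 | 20 | 467 |
| Mean | 476.5 | | 430.8 |
| Variance $\hat{S}^2$ | 5341.167 | | 5298.84 |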

The mean reaction times suggest that the left-handers were slower, but does a t-test confirm this?

## Independent Sample t Test using SPSS

Start SPSS, enter the data set in the SPSS Data View, and then perform the following steps:

1. Click Analyze > Compare Means > Independent-Samples T Test… on the top menu as shown below.

Menu option for independent sample t test

2. Select the continuous variable(s) that you want to test from the list.

Dialog box for independent sample t test

3. Click on the arrow to move the variable into the “Test Variable(s)” box. You can also double-click the variable to move it into the “Test Variable(s)” box.
4. Select the categorical/grouping variable so that the group comparison can be made, and move it to the “Grouping Variable” box.
5. Click on the “Define Groups” button. A small dialog box will appear asking for the names/codes used in Variable View for the two groups (for example, 1 and 2). Click the Continue button when you are done, then click OK to get the output. See the figure below.

Define Group for Independent sample t test

## Output

Independent sample t test output

The first table in the output gives descriptive statistics for the test variable. The number of observations, mean, standard deviation, and standard error are reported for each of the two groups.

The second table in the output is the important one for testing the hypothesis. It reports two t-tests, and you have to know which one to use. When the groups have approximately similar variances, use the first t-test; Levene’s test checks this assumption. If the significance for Levene’s test is 0.05 or below, the “Equal Variances Not Assumed” test (second row) should be used; otherwise use the “Equal Variances Assumed” test (first row). Here the significance is 0.287, so we use the “Equal Variances Assumed” row (the first row) of the second table.
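
For readers curious about what the second row computes, the following is a minimal sketch of the unequal-variances (Welch) t statistic with the Welch-Satterthwaite degrees of freedom, applied to the reaction time data from the Data section. The function name `welch_t_test` is hypothetical, and this is an illustration of the standard formula rather than a reproduction of SPSS internals.

```python
import math
from statistics import mean, variance

# Reaction times from the Data section
left = [500, 513, 300, 561, 483, 502, 539, 467, 420, 480]
right = [392, 445, 271, 523, 421, 489, 501, 388, 411, 467]


def welch_t_test(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = variance(a) / len(a)   # squared standard error of the first mean
    v2 = variance(b) / len(b)   # squared standard error of the second mean
    t = (mean(a) - mean(b)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (len(a) - 1) + v2 ** 2 / (len(b) - 1))
    return t, df


t_welch, df_welch = welch_t_test(left, right)
print(f"Welch t = {t_welch:.3f}, df = {df_welch:.2f}")
```

Because the two groups here have equal sizes and nearly equal variances, the Welch result is almost identical to the equal-variances result, so the choice of row matters little for this data set.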

In the output table, “t” is the calculated value of the test statistic; in this example the t-value is 1.401.

“df” stands for degrees of freedom; in this example we have 18 degrees of freedom.

“Sig. (2-tailed)” is the two-tailed significance value (p-value); in this example it is 0.178, which is greater than the 0.05 significance level.
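
The values in the “Equal Variances Assumed” row can be reproduced by hand from the data. The sketch below computes the pooled-variance t statistic and its degrees of freedom, then approximates the two-tailed p-value by numerically integrating the Student t density with Simpson's rule. The function names `pooled_t_test` and `t_two_tailed_p` are hypothetical, and the numerical integration is a stand-in chosen to keep the example free of third-party libraries.

```python
import math
from statistics import mean, variance

# Reaction times from the Data section
left = [500, 513, 300, 561, 483, 502, 539, 467, 420, 480]
right = [392, 445, 271, 523, 421, 489, 501, 388, 411, 467]


def pooled_t_test(a, b):
    """Independent-samples t statistic assuming equal variances."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # standard error of the difference
    return (mean(a) - mean(b)) / se, df


def t_two_tailed_p(t, df, steps=10_000):
    """Two-tailed p-value from Simpson's rule on the Student t density (steps must be even)."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central_area = total * h / 3        # P(0 < T < |t|)
    return 1 - 2 * central_area


t_value, df = pooled_t_test(left, right)
p_value = t_two_tailed_p(t_value, df)
print(f"t = {t_value:.3f}, df = {df}, two-tailed p = {p_value:.3f}")
```

Run on the data above, this should agree with the SPSS output described here: a t-value of about 1.401 on 18 degrees of freedom, with a two-tailed p-value close to 0.178.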

## Decision

As the p-value 0.178 is greater than our 0.05 significance level, we fail to reject the null hypothesis (two-tailed case).

As the p-value 0.089 is greater than our 0.05 significance level, we fail to reject the null hypothesis (one-tailed case with a 0.05 significance level).

As the p-value 0.089 is smaller than our 0.10 significance level, we reject the null hypothesis in favor of the alternative hypothesis (one-tailed case with a 0.10 significance level). In this case, it means that left-handers have slower reaction times than right-handers on average.
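
The three decisions above follow one rule: reject the null hypothesis when the p-value does not exceed the significance level. A minimal sketch of that rule is given below; the function name `decide` is hypothetical. The one-tailed p-value is taken as half the two-tailed value, which is appropriate only when the observed difference lies in the direction stated by the alternative hypothesis, as it does here (the left-handers had the larger mean).

```python
def decide(p_value, alpha):
    """Return the hypothesis-test decision for a given p-value and significance level."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


two_tailed_p = 0.178               # Sig. (2-tailed) from the SPSS output
one_tailed_p = two_tailed_p / 2    # valid only if the difference is in the hypothesized direction

print("Two-tailed, alpha = 0.05:", decide(two_tailed_p, 0.05))
print("One-tailed, alpha = 0.05:", decide(one_tailed_p, 0.05))
print("One-tailed, alpha = 0.10:", decide(one_tailed_p, 0.10))
```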