Types of Hypothesis Tests in Statistics

Introduction to Types of Hypothesis Tests

In statistics, hypothesis tests are methods used to make inferences or draw conclusions about a population based on sample data. In this post, we will discuss the basic types of hypothesis tests in statistics. There are three basic types of hypothesis tests, namely (i) Left-Tailed Test, (ii) Right-Tailed Test, and (iii) Two-Tailed Test.

Note that I am not talking about statistical tools used under specific conditions related to the data type and distribution. I am talking about the nature of the hypotheses being tested. Therefore, in this post I will focus on the area under the curve in the tails. In hypothesis testing, the distribution of the test's rejection region can be characterized as either one-tailed or two-tailed. The one-tailed tests include both left- and right-tailed tests.

[Figure: hypothesis-testing tails and critical (rejection) regions]

Left-Tailed Test

A left-tailed test is used when the null hypothesis being tested is a claim that the population parameter is at least ($\ge$) a given value. Note that the alternative hypothesis then claims that the parameter is less than (<) the value. For example,

A tire manufacturer claims that their tires last, on average, at least 35000 miles. If one thinks that the claim is false, then one would write the claim as $H_0$, remembering to include the condition of equality. The hypotheses for this test would be:
$$H_0:\mu\ge 35000$$
$$H_1: \mu<35000$$

One would hope that the sample data would allow the rejection of the null hypothesis, refuting the company’s claim.

$H_0$ will be rejected in the case above if the sample mean is statistically significantly less than 35000, that is, if the sample mean falls in the left tail of the distribution of all sample means.
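As a quick illustration, the decision rule for this left-tailed test can be sketched in Python. The sample data below and the critical value $-t_{0.05,\,9}\approx-1.833$ (from a t-table, $n=10$, $\alpha=0.05$) are assumptions for the sketch, not part of the original example.

```python
import math

# Hypothetical sample of n = 10 tire lifetimes (miles)
lifetimes = [34100, 34800, 33900, 34500, 35200,
             33800, 34600, 34200, 34900, 34000]

n = len(lifetimes)
mu0 = 35000                                   # H0: mu >= 35000
xbar = sum(lifetimes) / n                     # sample mean
s = math.sqrt(sum((x - xbar) ** 2 for x in lifetimes) / (n - 1))  # sample SD
t = (xbar - mu0) / (s / math.sqrt(n))         # t statistic

# Left-tailed test at alpha = 0.05 with df = 9: reject H0 if t < -1.833
t_crit = -1.833
decision = "reject H0" if t < t_crit else "fail to reject H0"
print(f"xbar = {xbar:.0f}, t = {t:.3f}, {decision}")
```

Here the sample mean of 34400 yields $t \approx -4.02$, which falls in the left tail, so $H_0$ is rejected and the manufacturer's claim is refuted.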

Right-Tailed Test

The right-tailed test is used when the null hypothesis ($H_0$) being tested is a claim that the population parameter is at most ($\le$) a given value. Note that the alternative hypothesis ($H_1$) then claims that the parameter is greater than (>) the value.

Suppose you worked for the tire company and wanted to gather evidence to support a claim that the tires last more than 35000 miles. You would then make the company's claim $H_1$, remembering that the condition of equality belongs in $H_0$, not in $H_1$. The hypotheses for this test would be

$$H_0:\mu \le 35000$$
$$H_1:\mu > 35000$$

If the sample data support the rejection of $H_0$, this is strong evidence for the claim $H_1$, which is what the company believes to be true.

One should reject $H_0$ in this case if the sample mean is significantly more than 35000, that is, if the sample mean falls in the right tail of the distribution of all sample means.
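For the right-tailed version, a large-sample $z$-test can be sketched the same way; the sample summary below is hypothetical, and the normal approximation stands in for the $t$ distribution since $n$ is large.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical large-sample summary: n = 40 tires, mean 35400, SD 1200
n, xbar, s = 40, 35400, 1200
mu0 = 35000                          # H0: mu <= 35000 vs H1: mu > 35000

z = (xbar - mu0) / (s / math.sqrt(n))
p_value = 1 - norm_cdf(z)            # right-tailed: area to the right of z

alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"z = {z:.3f}, p = {p_value:.4f}, {decision}")
```

Since the p-value (about 0.018) is below $\alpha = 0.05$, $H_0$ is rejected, which supports the company's claim $H_1$.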

Two-Tailed Test

The two-tailed test is used when the null hypothesis ($H_0$) being tested is a claim that the population parameter is equal to (=) a given value. Note that the alternative hypothesis ($H_1$) then claims that the parameter is not equal to ($\ne$) the value. For example, the Census Bureau claims that the percentage of Punjab residents with a bachelor's degree or higher is 24.4%. One may write the null and alternative hypotheses for this claim as:

$$H_0: P = 0.244$$
$$H_1: P \ne 0.244$$

In this case, one may reject $H_0$ if the sample percentage is either significantly more than 24.4% or significantly less than 24.4%; that is, if the sample proportion falls in either tail of the distribution of all sample proportions.
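A two-tailed one-proportion $z$-test can be sketched similarly; the sample size and count below are hypothetical, chosen only to illustrate the decision rule.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

p0 = 0.244                  # H0: P = 0.244 vs H1: P != 0.244
n, x = 1000, 270            # hypothetical sample: 270 of 1000 residents
p_hat = x / n

se = math.sqrt(p0 * (1 - p0) / n)       # standard error under H0
z = (p_hat - p0) / se
p_value = 2 * (1 - norm_cdf(abs(z)))    # two-tailed: both tails count

alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p_hat = {p_hat:.3f}, z = {z:.3f}, p = {p_value:.4f}, {decision}")
```

Note that doubling the tail area is what distinguishes the two-tailed test: the same $z$ of about 1.91 would be significant in a right-tailed test, but the two-tailed p-value (about 0.056) exceeds 0.05, so $H_0$ is not rejected.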

Key Differences

  • Directionality: One-tailed tests look for evidence of an effect in one specific direction, while two-tailed tests consider effects in both directions.
  • Rejection Regions: In a one-tailed test, the entire rejection region lies in one tail of the distribution; in a two-tailed test, the rejection region is split between both tails.

MCQs on Statistical Inference 9

The quiz is about MCQs on Statistical Inference with Answers. The quiz contains 20 questions about hypothesis testing and p-values. It covers the formulation of the null and alternative hypotheses, the level of significance, test statistics, the region of rejection, and the decision to reject or not reject the null hypothesis. Let us start with the Quiz: MCQs on Statistical Inference.

Online MCQs on Statistical Inference with Answers

1. Suppose a research article indicates a $p = 0.001$ value in the results section ($\alpha = 0.05$).

The probability that the results of the given study are replicable is not equal to $1-p$.

 
 
 

2. After finding a single statistically significant p-value we can conclude that ————-, but it would be incorrect to conclude that ————.

 
 
 
 

3. When the null hypothesis is true, the probability of finding a specific p-value is ————-.

 
 
 
 

4. Person A is very skeptical about homeopathy. Person B believes strongly in homeopathy. They both read a study about homeopathy, which reports a positive effect and $p < 0.05$. Person A would be more likely than Person B to conclude that ———-, and Person B would be more likely than Person A to think that ————-.

 
 
 
 

5. Suppose a research article indicates a $p = 0.30$ value in the results section ($\alpha = 0.05$).

The alternative hypothesis has been shown to be false.

 
 
 

6. Suppose a research article indicates a $p = 0.30$ value in the results section ($\alpha = 0.05$).

You have found the probability of the null hypothesis being true ($p = 0.30$).

 
 
 
 

7. Suppose a research article indicates a value of $p = 0.30$ in the results section ($\alpha = 0.05$).

Obtaining a statistically non-significant result implies that the effect detected is unimportant.

 
 
 

8. Suppose that a research article indicates a value of $p = 0.001$ in the results section ($\alpha = 0.05$).

The null hypothesis has been shown to be false.

 
 
 

9. Suppose a research article indicates a $p = 0.001$ value in the results section ($\alpha = 0.05$).

You have absolutely proven your alternative hypothesis (that is, you have proven that there is a difference between the population means).

 
 
 

10. Suppose a research article indicates a $p = 0.30$ value in the results section ($\alpha = 0.05$).

You have proven the null hypothesis (that is, you have proven that there is no difference between the population means).

 
 
 

11. Suppose a research article indicates a value of $p = 0.001$ in the results section ($\alpha = 0.05$).

The p-value of a statistical test is the probability of the observed result or a more extreme result, assuming the null hypothesis is true.

 
 
 

12. You perform two studies to test a potentially life-saving drug. Both studies have 80% power. What is the chance of two type 2 errors (of false negatives) in a row?

 
 
 
 

13. Suppose a research article indicates a value of $p = 0.001$ in the results section ($\alpha = 0.05$).

You have found the probability of the null hypothesis being true ($p = .001$).

 
 
 

14. When the difference between means is 5, and the standard deviation is 4, Cohen’s d is ————— which is ————— according to the benchmarks proposed by Cohen.

 
 
 
 

15. Suppose a research article indicates a $p = 0.30$ value in the results section ($\alpha = 0.05$).

The p-value gives the probability of obtaining a significant result whenever a given experiment is replicated.

 
 
 

16. It is important to have access to all (and not just statistically significant) research findings to be able to ————. A consequence of publication bias is that ———–.

 
 
 
 

17. Study A and B are completely identical, except that all tests reported in Study A were pre-registered at a publicly available location (and the reported tests match the pre-registered tests), but all tests in Study B are not pre-registered. Both contain analyses with covariates. Based on research on flexibility in the data analysis, we can expect that on average study A will have ————; the covariate analyses are ————-.

 
 
 
 

18. Suppose a research article indicates a $p = 0.001$ value in the results section ($\alpha = 0.05$).

Obtaining a statistically significant result implies that the effect detected is important.

 
 
 

19. When $H_0$ is true, the probability that at least 1 out of a $X$ completely independent findings is a Type 1 error is equal to ————, this probability ———— when you look at your data and collect more data if a test is not significant.

 
 
 
 

20. A Type-I error is ————–, and the Type-I error rate is determined by ————–.

 
 
 
 


Partial Correlation Example

In this post, we will learn about Partial Correlation and work through a data set as a Partial Correlation Example. In multiple correlation, there are three or more variables (also called multivariable data). Partial correlation, likewise defined for three or more variables, is the degree of the linear relationship between any two variables in a set of multivariable data while keeping the effect of all other variables constant.

Introduction to Partial Correlation Coefficient

Like Pearson's correlation, partial correlation measures the strength and direction of the relationship between two variables while controlling for (or removing the influence/effect of) one or more additional variables. It helps isolate the direct association between the two variables of interest, independent of other factors.

Suppose you are interested in studying the correlation between exercise frequency and heart health while controlling for age. Partial correlation removes the effect of age to reveal the pure relationship between exercise and heart health. Partial correlation is denoted as $r_{12\cdot 3}$, where 1 and 2 are the variables of interest, and 3 is the controlled variable.

Partial Correlation Formula

For three variables, say $X_1, X_2, X_3$, the partial correlation between $X_1$ and $X_2$ after removing the influence of $X_3$ is denoted by $r_{12\cdot 3}$ and is given as

$$r_{12 \cdot 3}= \frac{ r_{12} - r_{13} r_{23}} {\sqrt{(1-r_{13}^2)(1- r_{23}^2)} }$$

If we want to find the partial correlation between $X_1$ and $X_3$ then

$$r_{13\cdot 2}= \frac{ r_{13} - r_{12} r_{32}}{ \sqrt{(1- r_{12}^2)(1- r_{32}^2)}}$$

If we want to find the partial correlation between $X_2$ and $X_3$ then

$$r_{23\cdot 1}= \frac{r_{23} - r_{21} r_{31}}{\sqrt{(1- r_{21}^2)(1- r_{31}^2)}}$$

Partial Correlation Graphical Representation

Partial correlation is a statistical measure of the relationship between two variables while controlling for (excluding or eliminating) the effects of one or more additional variables. A graphical representation for three variables, say $X, Y$, and $Z$, is given below.

[Figure: graphical representation of partial correlation among $X$, $Y$, and $Z$]

Partial Correlation is used when researchers want to determine the strength and direction of the relationship between two variables without the influence of other variables. This is particularly useful in multivariate analysis where multiple variables may be interrelated. The partial correlation coefficient ranges from $-1$ to $+1$, with $-1$ indicating a perfect negative correlation, $+1$ indicating a perfect positive correlation, and 0 indicating no correlation.

Partial Correlation Example

For the Partial Correlation Example, consider the following data with some basic computation.

| | $X_1$ | $X_2$ | $X_3$ | $X_1X_2$ | $X_1X_3$ | $X_2X_3$ | $X_1^2$ | $X_2^2$ | $X_3^2$ |
|---|---|---|---|---|---|---|---|---|---|
| | 7 | 4 | 1 | 28 | 7 | 4 | 49 | 16 | 1 |
| | 12 | 7 | 2 | 84 | 24 | 14 | 144 | 49 | 4 |
| | 14 | 8 | 4 | 112 | 56 | 32 | 196 | 64 | 16 |
| | 17 | 9 | 5 | 153 | 85 | 45 | 289 | 81 | 25 |
| | 20 | 12 | 8 | 240 | 160 | 96 | 400 | 144 | 64 |
| Total | 70 | 40 | 20 | 617 | 332 | 191 | 1078 | 354 | 110 |

First, compute the pairwise correlations $r_{12}$, $r_{13}$, and $r_{23}$ (note that $r_{21}=r_{12}$, $r_{31}=r_{13}$, and $r_{32}=r_{23}$).

\begin{align}
r_{12} &= \frac{n\Sigma (x_1 x_2 ) - (\Sigma x_1)(\Sigma x_2 )} {\sqrt{\left[n\Sigma x_1 ^2 -(\Sigma x_1)^2\right] \left[n \Sigma x_2^2 - (\Sigma x_2 )^2\right]}}\\
&= \frac{5(617)-(70)(40)} {\sqrt{\left[5 (1078)-(70)^2\right]\left[5(354)-(40)^2\right]} } = 0.987\\
r_{13} &= \frac{n\Sigma(x_1 x_3 ) - (\Sigma x_1)(\Sigma x_3 )}{\sqrt{\left[n\Sigma x_1^2 - (\Sigma x_1 )^2\right]\left[n \Sigma x_3^2 - (\Sigma x_3 )^2\right]}}\\
&= \frac{5(332)-(70)(20)}{\sqrt{\left[5 (1078)-(70)^2\right]\left[5(110)-(20)^2\right]}}= 0.959\\
r_{23} &= \frac{n\Sigma(x_2 x_3 )-(\Sigma x_2 )(\Sigma x_3 )}{\sqrt{\left[n\Sigma x_2^2 -(\Sigma x_2 )^2\right]\left[n\Sigma x_3^2 -(\Sigma x_3 )^2\right]}}\\
& = \frac{5(191)-(40)(20)}{\sqrt{\left[5(354)-40^2\right]\left[5(110)-20^2\right]}}= 0.971\\
r_{12\cdot 3} &= \frac{r_{12} - r_{13} r_{23} } {\sqrt{(1 - r_{13}^2) (1 - r_{23}^2) }}\\
& = \frac{0.987-(0.959)(0.971)} {\sqrt{(1-(0.959)^2)(1-(0.971)^2)}}\\
&=\frac{0.05659}{0.0681} = 0.8305
\end{align}
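The hand computation above can be checked with a short Python sketch that applies the same raw-score correlation formula to the table's columns:

```python
import math

# Data columns from the worked example
x1 = [7, 12, 14, 17, 20]
x2 = [4, 7, 8, 9, 12]
x3 = [1, 2, 4, 5, 8]

def pearson_r(a, b):
    """Pearson correlation via the computational (raw-score) formula."""
    n = len(a)
    num = n * sum(ai * bi for ai, bi in zip(a, b)) - sum(a) * sum(b)
    den = math.sqrt((n * sum(ai**2 for ai in a) - sum(a)**2)
                    * (n * sum(bi**2 for bi in b) - sum(b)**2))
    return num / den

r12, r13, r23 = pearson_r(x1, x2), pearson_r(x1, x3), pearson_r(x2, x3)

# Partial correlation of X1 and X2, controlling for X3
r12_3 = (r12 - r13 * r23) / math.sqrt((1 - r13**2) * (1 - r23**2))
print(f"r12 = {r12:.3f}, r13 = {r13:.3f}, r23 = {r23:.3f}, r12.3 = {r12_3:.3f}")
```

Keeping full precision in the intermediate correlations, this reproduces $r_{12\cdot 3} \approx 0.830$, matching the hand calculation.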

Real-Life Examples of Partial Correlation

The following are some real-life examples of partial correlation to illustrate its application in controlling for confounding variables.

  • Exercise and Health: You may want to analyze the correlation between exercise frequency and heart health while controlling for age. It is because age can affect both exercise habits and heart health, so partial correlation removes its influence to reveal the true relationship between exercise and heart health.
  • Advertising and Sales: Suppose you want to examine the relationship between advertising spending and sales revenue while controlling for seasonality (e.g., holiday sales). Seasonal factors can impact both advertising and sales, so partial correlation helps determine the direct effect of advertising on sales.
  • Education and Income: You may want to study the relationship between education level and income while controlling for work experience. It is because work experience may influence both education and income, so partial correlation helps isolate the direct relationship between education and income, independent of experience.
  • Student Performance: You want to analyze the relationship between hours spent studying and exam scores while controlling for prior academic performance. Because prior academic performance may influence both study habits and exam results, partial correlation reveals the direct effect of studying on exam scores.
  • Smoking and Lung Cancer: You are interested in studying the correlation between smoking and lung cancer risk while controlling for air pollution exposure. It is because air pollution can independently affect lung cancer risk, so partial correlation isolates the impact of smoking alone.
  • Diet and Weight Loss: You want to study the correlation between calorie intake and weight loss while controlling for physical activity levels. Because physical activity affects both calorie intake and weight loss, partial correlation helps isolate the direct effect of diet on weight loss.

Partial correlation is commonly used in statistical analysis, especially in fields like psychology, social sciences, and any area where multivariate relationships are analyzed. In short, partial correlation provides a clearer picture of the relationship between two variables by accounting for confounding influences.
