Testing a Claim about a Mean Using a Large Sample: Secrets

In this post, we will learn about “Testing a claim about a Mean” using a Large sample. Before going to the main topic, we need to understand some related basics.

Hypothesis Testing

When a hypothesis test involves a claim about a population parameter (in our case, the mean/average), we draw a representative sample from the target population and compute the sample mean to test the claim about the population. If the sample drawn is large enough ($n\ge 30$), then the Central Limit Theorem (CLT) applies, and the distribution of the sample mean is approximately normal with $\mu_{\overline{x}} = \mu$ and $\sigma_{\overline{x}} = \frac{\sigma}{\sqrt{n}} \approx \frac{s}{\sqrt{n}}$.


Testing a Claim about a Mean

It is worth noting that $s$ and $n$ are known from the sample data, so we have a good estimate of $\sigma_{\overline{x}}$, but the population mean $\mu$ is not known to us. The mean $\mu$ is the parameter about which the claim is being tested. To have a value for $\mu$, we will always assume that the null hypothesis is true in any hypothesis test.

It is also worth noting that the null hypothesis must be of one of the following types:

  • $H_0:\mu = \mu_0$
  • $H_0:\mu \ge \mu_0$
  • $H_0:\mu \le \mu_0$

where $\mu_0$ is a constant, and we will always assume for the purpose of our test that $\mu=\mu_0$.

Standardized Test Statistic

To determine whether to reject or not reject the null hypothesis, we have two methods namely (i) a standardized value and (ii) a p-value. In both cases, it will be more convenient to convert the sample mean $\overline{x}$ to a Z-score called the standardized test statistic/score.

Since we assumed that $\mu=\mu_0$, we have $\mu_{\overline{x}} =\mu_0$, and the standardized test statistic is:

$$Z = \frac{\overline{x} - \mu _{\overline{x}}} {\sigma_{\overline{x}} } = \frac{\overline{x} - \mu _{\overline{x}}} {\frac{s}{\sqrt{n}} }$$

As long as $\mu=\mu_0$ is assumed, the standardized test statistic $Z$ follows the Standard Normal Distribution.

Example: Testing a Claim about an Average/ Mean

Suppose it is claimed that the average body temperature of a healthy person is less than the commonly accepted temperature of $98.6^{\circ}F$. Assume that a sample of 60 healthy persons is drawn. The average temperature of these 60 persons is $\overline{x}=98.2^{\circ}F$ and the sample standard deviation is $s=1.1^{\circ}F$.

The hypotheses for the above claim would be

$H_0:\mu\ge 98.6$
$H_1:\mu < 98.6$

Note that from the alternative hypothesis, we have a left-tailed test with $\mu_0=98.6$.

Based on our sample data, the standardized test statistic is

\begin{align*}
Z &= \frac{\overline{x} - \mu _{\overline{x} } } {\frac{s}{\sqrt{n} } }\\
&=\frac{98.2 - 98.6}{\frac{1.1}{\sqrt{60}}} \approx -2.82
\end{align*}
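The calculation above can be checked numerically, and the test carried through to a decision, with a short Python sketch. It uses only the standard library. The significance level $\alpha = 0.05$ and the function name `left_tailed_z_test` are assumptions made for illustration and are not part of the example.

```python
from math import sqrt
from statistics import NormalDist


def left_tailed_z_test(x_bar, mu_0, s, n, alpha=0.05):
    """Large-sample left-tailed Z test for a mean (alpha is an assumed level)."""
    # Standardized test statistic, with s / sqrt(n) estimating sigma_x_bar
    z = (x_bar - mu_0) / (s / sqrt(n))
    std_normal = NormalDist()              # standard normal: mean 0, sd 1
    p_value = std_normal.cdf(z)            # left-tail area P(Z <= z)
    critical = std_normal.inv_cdf(alpha)   # reject H0 when z < critical
    return z, p_value, critical, p_value < alpha


z, p_value, critical, reject = left_tailed_z_test(98.2, 98.6, 1.1, 60)
print(f"Z = {z:.2f}, p-value = {p_value:.4f}, critical value = {critical:.3f}")
print("Reject H0" if reject else "Fail to reject H0")
```

Under the assumed 5% level, a $Z$ of about $-2.82$ falls below the critical value of about $-1.645$, so both the critical-value method and the p-value method lead to rejecting $H_0$ in favor of $H_1:\mu < 98.6$.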


Statistical Hypotheses: Made Easy

Overview of Statistical Hypotheses

A statistical hypothesis is a claim about a population parameter. For example,

  • The mean height of males is less than 65 inches tall
  • The percentage of people favoring a bullet train is about 59%
  • The daily average expense for a college student is more than Rs. 250
  • At least 5% of people in Pakistan earn more than Rs 2,500,000 per year

A statistical method is used to determine if there is enough evidence in sample data to support a claim about a population.

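The claimed hypotheses are written in concise statistical form. For example, the above statements about the population can be written as

  • $H_1: \mu < 65$
  • $H_0: \pi = 0.59$
  • $H_1:\mu > 250$
  • $H_0: \pi \ge 0.05$

Note that a claim containing equality ($=$, $\ge$, or $\le$) plays the role of the null hypothesis $H_0$, whereas a claim stated as a strict inequality ($<$ or $>$) becomes the alternative hypothesis $H_1$.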

If someone is interested in knowing whether the above-stated statistical hypotheses are true or false, one needs to conduct a hypothesis test. To test a statistical hypothesis, one follows this basic procedure:

  1. Draw a random sample from the population of interest (for example, the height of males).
  2. Determine if the results from the sample data are consistent or not with the hypothesis under study.
  3. If the collected sample data is (significantly) different from the claimed hypothesis, then reject the hypothesis as being false. However, if the sampled data is not significantly different, one would not reject the hypothesis.

Statistical Hypotheses Example

Example: Suppose a battery manufacturer claims that the average life of their batteries is at least 300 minutes.

To test this hypothesis, we proceed as follows:

  1. Select a sample of, say, $n=100$ batteries. The sampled batteries are tested, and their mean life is found to be $\overline{x} = 294$ minutes with a sample standard deviation of $s=20$ minutes.
  2. We need to decide whether these data are sufficiently different from the manufacturer’s claim to justify rejecting the claim as false.
  3. Since the sample drawn is large enough, the Central Limit Theorem allows us to conclude that the distribution of sample means $\overline{x}$ is approximately normal.
  4. If the manufacturer’s claim is correct, then $\mu_{\overline{x}} = \mu \ge 300$ and we will assume that $\mu_{\overline{x}} = \mu = 300$.
  5. The Z-score will be $$Z = \frac{\overline{x} - \mu_{\overline{x}}}{\frac{s}{\sqrt{n}}}=\frac{294-300}{\frac{20}{\sqrt{100}}} = -3.0$$
  6. Look up the probability in the Standard Normal Table: $P(\overline{x} \le 294) = P(Z \le -3.0) = 0.0013$.

Decision about Hypothesis

Now one of the following must be true:

  1. The assumption that $\mu = 300$ is incorrect.
  2. The assumption is correct, but the sample drawn has such a small mean that only 13 in 10,000 samples would have a mean as low.

If the assumption $\mu = 300$ were correct, a sample mean this low would occur with only a small probability (0.0013). Thus there is strong evidence to believe that the first statement is true, and hence that the manufacturer overstated the mean life of their batteries.
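The probability used in this decision can be reproduced by modelling the sampling distribution of $\overline{x}$ directly. The sketch below is a minimal check using Python's standard library. The variable names are illustrative, and it assumes, as in step 4, that $\mu = 300$.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s = 100, 294, 20   # sample size, sample mean, sample sd (minutes)
mu_0 = 300                   # mean life assumed under the manufacturer's claim

# By the CLT, x_bar is approximately Normal(mu_0, s / sqrt(n)) when mu = mu_0
sampling_dist = NormalDist(mu=mu_0, sigma=s / sqrt(n))

z = (x_bar - mu_0) / sampling_dist.stdev   # standardized test statistic
p_low = sampling_dist.cdf(x_bar)           # P(x_bar <= 294) assuming mu = 300

print(f"Z = {z:.1f}, P(x_bar <= {x_bar}) = {p_low:.4f}")
```

Working with the distribution of $\overline{x}$ itself, rather than with the table of $Z$, gives the same tail area because standardizing only rescales the axis.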

https://gmstat.com

https://rfaqs.com

Statistical Help https://itfeature.com

Important Testing of Hypothesis MCQs 8

The quiz is about Testing of Hypothesis MCQs with Answers. The quiz contains 20 questions about hypothesis testing. It covers the topics of formulation of the null and alternative hypotheses, level of significance, test statistics, region of rejection, and decision about acceptance and rejection of the hypothesis. Let us start with the Testing of Hypothesis MCQs quiz.

Online multiple choice questions about Testing of Hypothesis with Answers

1. If you reject a true null hypothesis, what does this mean?

2. If the p-value is less than alpha in a one-tail test, what conclusion can you draw?

3. A hypothesis test in which rejection of the null hypothesis occurs for values of the point estimator in either tail of the sampling distribution is called

4. What is the region of rejection for a one-tail Z test?

5. What test should a researcher use to determine whether there is evidence that the mean family income in the U.S. is greater than $30,000?

6. Which of the following statements is false?

7. In hypothesis testing, $\beta$ is

8. Which of the following does not need to be known to compute the P-value?

9. In hypothesis testing, the level of significance is

10. What determines how close the computed sample statistic has come to the hypothesized population parameter?

11. A Type II error is the error of

12. How do you commit a Type II error?

13. If a one-tail Z test for a proportion is performed and the upper critical value is +2.33 and the test statistic is equal to +1.37, then what conclusion can you draw?

14. For finding the p-value when the population standard deviation is unknown, if it is reasonable to assume that the population is normal, we use

15. When testing the following hypotheses at a level of significance

$H_0: p \le 0.7$

$H_a: p > 0.7$

The null hypothesis will be rejected if the test statistic $Z$ is

16. If the p-value is greater than alpha in a two-tail test, what conclusion should you draw?

17. In a hypothesis test, the probability of obtaining a value of the test statistic equal to or even more extreme than the value observed, given that the null hypothesis is true, is referred to as what?

18. When the null hypothesis has been true, but the sample information has resulted in the rejection of the null, a ________ has been made.

19. The maximum probability of a Type I error that the decision-maker will tolerate is called the

20. In hypothesis testing, the hypothesis which is tentatively assumed to be true is called the



Type I and Type II Errors Examples

The post covers the Type I and Type II Errors examples.

Hypothesis testing helps us to determine whether the results are statistically significant or occurred by chance. Because hypothesis testing is based on probability, there is always a chance of making the wrong decision about the null hypothesis (a hypothesis about the population). Two types of errors (Type I and Type II errors) can be made when drawing a conclusion or decision.

Errors in Statistical Decision-Making

To understand the errors in statistical decision-making, we first need to see the step-by-step process of hypothesis testing:

  1. State the null hypothesis and the alternative hypothesis.
  2. Choose a level of significance (the probability of a Type-I error that you are willing to accept).
  3. Compute the required test statistic.
  4. Find the critical value or p-value.
  5. Reject or fail to reject the null hypothesis.

When you decide to reject or fail to reject the null hypothesis, there are four possible outcomes: two represent correct choices, and two represent errors. You can:

  • Reject the null hypothesis when it is actually true (Type-I error)
  • Reject the null hypothesis when it is actually false (Correct)
  • Fail to reject the null hypothesis when it is actually true (Correct)
  • Fail to reject the null hypothesis when it is actually false (Type-II error)

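These four possibilities can be presented in a truth table:

Decision | $H_0$ is actually true | $H_0$ is actually false
Reject $H_0$ | Type-I error | Correct decision
Fail to reject $H_0$ | Correct decision | Type-II error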

Type I and Type II Errors Examples: Clinical Trial

To understand Type I and Type II errors, consider an example from clinical trials. In clinical trials, hypothesis tests are often used to determine whether a new medicine leads to better outcomes in patients. Imagine you are a data professional working in a pharmaceutical company. The company invents a new medicine to treat the common cold and tests it on a random sample of 200 people with cold symptoms. Without medicine, the typical person experiences cold symptoms for 7.5 days. The average recovery time for people who take the medicine is 6.2 days.

You conduct a hypothesis test to determine if the effect of the medicine on recovery time is statistically significant, or due to chance.

In this case:

  • Your null hypothesis ($H_0$) is that the medicine has no effect.
  • Your alternative hypothesis ($H_a$) is that the medicine is effective.

Type I Error

A Type-I error (also known as a false positive) occurs when a true null hypothesis is rejected. In other words, one concludes that the result is statistically significant when in fact it occurred by chance. To understand this, suppose that in your clinical trial the null hypothesis is actually true, which means that the medicine has no effect. If you make a Type-I error and reject the null hypothesis, you incorrectly conclude that the medicine relieves cold symptoms when it is actually ineffective.

The probability of making a Type I error is represented by $\alpha$ (the level of significance). Typically, a 0.05 (or 5%) significance level is used. A significance level of 5% means you are willing to accept a 5% chance of rejecting the null hypothesis when it is actually true.

Reduce the risk of Type I error

To reduce your chances of making Type I errors, it is advised to choose a lower significance level. For example, one can choose the significance level of 1% instead of the standard 5%. It will reduce the chances of making a Type I error from 5% to 1%.
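The meaning of the significance level can be illustrated with a small simulation: if the null hypothesis is true and we test at level $\alpha$, roughly a fraction $\alpha$ of repeated studies will reject it by chance alone. The sketch below assumes a left-tailed Z test with a known standard deviation. The standard deviation of 2.0 days, the number of simulated trials, and the seed are hypothetical values chosen only for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(alpha, mu_0=7.5, sigma=2.0, n=200, trials=2000, seed=1):
    """Fraction of simulated studies that reject a TRUE H0: mu >= mu_0.

    sigma, trials and seed are hypothetical values for illustration.
    """
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(alpha)   # left-tail critical value
    std_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Data are generated with mu = mu_0, so the medicine truly has no effect
        sample = [rng.gauss(mu_0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu_0) / std_error
        if z < critical:                     # rejecting here is a false positive
            rejections += 1
    return rejections / trials


for alpha in (0.05, 0.01):
    rate = type_i_error_rate(alpha)
    print(f"alpha = {alpha}: simulated Type I error rate = {rate:.3f}")
```

Because every simulated study is generated with no real effect, each rejection counted here is a Type I error, and the simulated rate should sit close to the chosen $\alpha$.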

Type II Error

A Type II error (also known as a false negative) occurs when we fail to reject a null hypothesis that is false. In other words, one concludes that the result occurred by chance when in fact it did not. For example, in a clinical study, if the null hypothesis is false, it means that the medicine is effective. If you make a Type II error and fail to reject the null hypothesis, you incorrectly conclude that the medicine is ineffective when in reality it relieves cold symptoms.

The probability of making a Type II error is represented by $\beta$ and it is related to the power of a hypothesis test (power = $1- \beta$). Power refers to the likelihood that a test can correctly detect a real effect when there is one.

Note that, for a fixed sample size, reducing the risk of a Type I error makes a Type II error (a false negative) more likely.

Reduce your risk of making Type II Error

One can reduce the risk of making a Type II error by ensuring that the test has enough power. In data work, power is usually set at 0.80 or 80%. The higher the statistical power, the lower the probability of making a Type II error. To increase power, you can increase your sample size or your significance level.
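How power responds to the sample size and the significance level can be sketched with the closed-form power of a left-tailed Z test, $\text{power} = \Phi\left(\frac{\mu_0-\mu_1}{\sigma/\sqrt{n}} - z_{1-\alpha}\right)$, where $\mu_1$ is the true mean. The true mean of 7.0 days and the standard deviation of 2.0 days used below are hypothetical values, not figures from the clinical trial example, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def power_left_tailed_z(mu_0, mu_1, sigma, n, alpha):
    """Power of a left-tailed Z test of H0: mu >= mu_0 when the true mean is mu_1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)      # z_{1-alpha}
    shift = (mu_0 - mu_1) / (sigma / sqrt(n))   # true effect in standard-error units
    return std_normal.cdf(shift - z_crit)       # P(reject H0 | mu = mu_1)


# Hypothetical scenario: the true mean recovery time is 7.0 days, not 7.5
for n, alpha in [(50, 0.05), (200, 0.05), (50, 0.01)]:
    power = power_left_tailed_z(7.5, 7.0, 2.0, n, alpha)
    print(f"n = {n:3d}, alpha = {alpha}: power = {power:.3f}, beta = {1 - power:.3f}")
```

In this sketch, both a larger sample and a larger significance level raise the power, but raising the significance level does so at the cost of a higher Type I error risk.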

Potential Risks of Type I and Type II Errors

As a data professional, it is important to be aware of the potential risks involved in making the two types of errors.

  • A Type I error means rejecting a true null hypothesis. In general, making a Type I error often leads to implementing changes that are unnecessary and ineffective, and which waste valuable time and resources.
    For example, if you make a Type I error in your clinical trial, the new medicine will be considered effective even though it is ineffective. Based on this incorrect conclusion, ineffective medication may be prescribed to a large number of people, while other treatment options may be rejected in favor of the new medicine.
  • A Type II error means failing to reject a false null hypothesis. In general, making a Type II error may result in missed opportunities for positive change and innovation. A lack of innovation can be costly for people and organizations.
    For example, if you make a Type II error in your clinical trial, the new medicine will be considered ineffective even though it’s effective. This means that a useful medication may not reach a large number of people who could benefit from it.

In summary, as a data professional, it helps to be aware of the potential errors built into hypothesis testing and how they can affect the final decisions. Depending on the situation, one may choose to minimize the risk of either a Type I or a Type II error. Ultimately, it is the responsibility of a data professional to determine which type of error is riskier based on the goals of the analysis.
