Testing of Hypothesis - Statistics for Data Science & Analytics

Testing a Claim about a Mean Using a Large Sample: Secrets

Sep 2, 2024Aug 22, 2024 by Muhammad Imdad Ullah

In this post, we will learn about “Testing a claim about a Mean” using a Large sample. Before going to the main topic, we need to understand some related basics.

Hypothesis Testing

When a hypothesis test involves a claim about a population parameter (in our case mean/average), we draw a representative sample from the target population and compute the sample mean to test the claim about population. If the sample drawn is large enough ($n\ge 30$), then the Central Limit Theorem (CLT) applies, and the distribution of the sample mean is assumed to be approximately normal, that is we have $\mu_{\overline{x}} = \mu$ and $\sigma_{\overline{x}} = \frac{\sigma}{\sqrt{n}} \approx \frac{s}{\sqrt{c}}$.

Testing a Claim about a Mean

It is worth noting that $s$ and $n$ are known from the sample data, and we have a good estimate of $\sigma_{\overline{x}}$ but the population mean $\mu$ is not known to us. The $\mu$ is the parameter that we are testing a claim about a mean. To have a value for $\mu$, we will always assume that the null hypothesis is true in any hypothesis test.

It is also worth noting that the null hypothesis must be of one of the following types:

$H_0:\mu = \mu_o$
$H_0:\mu \ge \mu_0$
$H_0:\mu \le \mu_0$

where $\mu_0$ is a constant, and we will always assume that the purpose of our test is that $\mu=mu_0$.

Standardized Test Statistic

To determine whether to reject or not reject the null hypothesis, we have two methods namely (i) a standardized value and (ii) a p-value. In both cases, it will be more convenient to convert the sample mean $\overline{x}$ to a Z-score called the standardized test statistic/score.

Since, we assumed that $\mu=\mu_0$, and we have $\mu_{\overline{x}} =\mu_0$, then the standardized statistic is:

$$Z = \frac{\overline{x} – \mu _{\overline{x}}} {\sigma_{\overline{x}} } = \frac{\overline{x} – \mu _{\overline{x}}} {\frac{s}{\sqrt{n}} }$$

As long as $\mu=\mu_0$ is assumed, the distribution standardized test statistics $Z$ is Standard Normal Distribution.

Example: Testing a Claim about an Average/ Mean

Suppose the average body temperature of a healthy person is less than the commonly accepted temperature of $98.6^{o}F$. Assume that a sample of 60 healthy persons is drawn. The average temperature of these 60 persons is $\overline{x}=98.2^oF$ and the sample standard deviation is $s=1.1^oF$.

The hypothesis of the above statement/claim would be

$H_0:\mu\ge 98.6$
$H_1:\mu < 98.6$

Note that from the alternative hypothesis, we have a left-tailed test with $\mu_0=98.6$.

Based on our sample data, the standardized test statistic is

\begin{align*}
Z &= \frac{\overline{x} – \mu _{\overline{x} } } {\frac{s}{\sqrt{n} } }\\
&=\frac{98.2 – 98.6}{\frac{1.1}{\sqrt{60}}} \approx -2.82
\end{align*}

Learn R Programming Language

Online Quiz Website

Statistical Hypotheses: Made Easy

Aug 1, 2024 by Muhammad Imdad Ullah

Overview of Statistical Hypotheses

A statistical hypothesis is a claim about a population parameter. For example,

The mean height of males is less than 65 inches tall
The percentage of people favoring a bullet train is about 59%
The daily average expense for a college student is more than Rs. 250
At least 5% of Pakistan earn more than Rs 2,500,000 per year

A statistical method is used to determine if there is enough evidence in sample data to support a claim about a population.

The claimed hypotheses are written in certain statistical and concise forms. For example, the above statements about population can be written as

$H_0: \mu < 65$
$H_0: \pi = 0.59$
$H_0:\mu > 250$
$H_0: \pi \ge 0.05$

If someone is interested in knowing that above stated statistical hypotheses are either true or false, one needs to conduct a hypothesis test. To test a statistical hypothesis, one needs to follow the following basic procedure, to fulfill the requirements.

Draw a random sample from the population of interest (for example, the height of males)
Determine if the results from the sample data are consistent or not with the hypothesis under study.
If the collected sample data is (significantly) different from the claimed hypothesis, then reject the hypothesis as being false. However, if the sampled data is not significantly different, one would not reject the hypothesis.

Statistical Hypotheses Example

Example: Suppose a battery manufacturer claims that the average life of their batteries is at least 300 minutes.

To test this hypothesis, we follow the procedure as

Select a sample of say $n=100$ batteries. The sample of batteries is tested and the mean life of sampled batteries was found to be $\overline{x} = 294$ minutes with a sample standard deviation of $s=204 minutes.
We need to test whether “is this data sufficiently different from the manufacturer’s claim to justify rejecting the claim as false”?
Since the sample drawn is large enough, the Central Limit Theorem allows us to conclude that the distribution of sample means $\overline{x}$ is approximately normal.
If the manufacturer’s claim is correct, then $\mu_{\overline{x}} = \mu \ge 300$ and we will assume that $\mu_{\overline{x}} = \mu = 300$.
The Z-score will be $$Z = \frac{\overline{x} – \mu_{\overline{x}}}{\frac{s}{\sqrt{n}}}=\frac{294-300}{\frac{20}{\sqrt{100}}} = -3.0$$
Search the Probability value from Standard Normal Table, as $P(\overline{x} \le 294)=0.0013$

Decision about Hypothesis

Now one of the following must be true

The assumption that $\mu = 300$ is incorrect
The sample drawn has a so small mean that only 13 in 10,000 samples have a mean as low.

The probability of the second statement being true is quite small (0.0013). Thus there is strong evidence to believe that the first statement is true, and hence the manufacturer overstated the mean life of their batteries.

https://gmstat.com

https://rfaqs.com

Important Testing of Hypothesis MCQs 8

Sep 7, 2024Jun 8, 2024 by Muhammad Imdad Ullah

The quiz is about Testing of Hypothesis MCQs with Answers. The quiz contains 20 questions about hypothesis testing. It covers the topics of formulation of the null and alternative hypotheses, level of significance, test statistics, region of rejection, and decision about acceptance and rejection of the hypothesis. Let us start with the Testing of Hypothesis MCQs quiz.

Testing of Hypothesis MCQs with Answers

In hypothesis testing, the hypothesis which is tentatively assumed to be true is called the
When the null hypothesis has been true, but the sample information has resulted in the rejection of the null, a ———- has been made.
The maximum probability of a Type I error that the decision-maker will tolerate is called the
A Type II error is the error of
In hypothesis testing, the level of significance is
For finding the p-value when the population standard deviation is unknown, if it is reasonable to assume that the population is normal, we use
In hypothesis testing, $\beta$ is
A hypothesis test in which rejection of the null hypothesis occurs for values of the point estimator in either tail of the sampling distribution is called
When testing the following hypotheses at a level of significance $H_o: p \le 0.7$ $H_a: p > 0.7$ The null hypothesis will be rejected if the test statistic $Z$ is
Which of the following does not need to be known to compute the P-value?
Which of the following statements is false?
If you reject a true null hypothesis, what does this mean?
How do you commit a Type II error?
What test should a researcher use to determine whether there is evidence that the mean family income in the U.S. is greater than $30,000?
In a hypothesis test, the probability of obtaining a value of the test statistic equal to or even more extreme than the value observed, given that the null hypothesis is true, is referred to as what?
If the p-value is greater than alpha in a two-tail test, what conclusion should you draw?
If the p-value is less than alpha in a one-tail test, what conclusion can you draw?
If a one-tail Z test for a proportion is performed and the upper critical value is +2.33 and the test statistic is equal to +1.37, then what conclusion can you draw?
What is the region of rejection for a one-tail Z test?
What determines how close the computed sample statistic has come to the hypothesized population parameter?