Hypothesis Testing MCQs 10

This quiz presents Hypothesis Testing MCQs with Answers. It contains 20 questions about hypothesis testing and p-values, covering the formulation of the null and alternative hypotheses, the level of significance, test statistics, the region of rejection, the decision to reject or not reject the null hypothesis, and effect size. Let us start with the Hypothesis Testing MCQs Quiz now.

Online Hypothesis Testing MCQs with Answers

1. A room in a laboratory is only considered safe if the mean radiation level is 400 or less. When a sample of 10 radiation measurements was taken, the mean value of the radiation was 414 with a standard deviation of 17. There is concern that the mean radiation level exceeds the safe limit. Radiation levels in the lab are known to follow a normal distribution with a standard deviation of 22. We would like to conduct a hypothesis test at the 5% level of significance to determine whether there is evidence that the laboratory is unsafe. What will be the appropriate test?

 
 
 
 

2. Consider a normally distributed data set with mean $\mu = 63.18$ inches and standard deviation $\sigma = 13.27$ inches. What is the z-score when $x = 91.54$ inches?
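
As a worked check of the arithmetic in question 2, the z-score is $z = (x - \mu)/\sigma$. The following is a minimal Python sketch; the function name `z_score` is only illustrative and is not part of the quiz.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Return how many standard deviations x lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


# Values taken from question 2: mu = 63.18, sigma = 13.27, x = 91.54.
z = z_score(91.54, 63.18, 13.27)
print(round(z, 2))  # 2.14
```

A positive z-score means the observation lies above the mean; here it is a little more than two standard deviations above it.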

 
 
 
 

3. Predicting that a measured variable differs in two groups, without random assignment to conditions, is often ______.

 
 
 
 

4. Using the teacher’s rating data, is there an association between native (native English speakers) and the number of credits taught? What test will you use?

 
 
 
 

5. An experiment has been conducted to test the equality of two means, with known variances. The P-value for the Z-test statistic was 0.023. Assume a two-sided alternative hypothesis. The 95% confidence interval on the difference in the two means included the value zero.

 
 

6. What is the purpose of an ANOVA test?

 
 
 
 

7. The most important assumption in using the t-test is that the sample data come from normal populations.

 
 

8. The battery life of smartphones is of great concern to customers. A consumer group tested four brands of smartphones to determine the battery life. Samples of phones of each brand were fully charged and left to run until the battery died. The table above displays the number of hours each of the batteries lasted. What test will be used to test the difference in means?

 
 
 
 

9. You predict that your intervention will increase all participants’ performance on a test; this is an example of ______. After the study, you conclude that the intervention works only for women and not for men; this is an example of ______.

 
 
 
 

10. Which of the following is a possible alternative hypothesis $H_1$ for a two-tailed test?

 
 
 
 

11. Which of the following statements about the ANOVA F-test score are true?

 
 
 
 

12. You perform five tests without correcting for multiple comparisons. The error rate for each test is ______. After using the Bonferroni correction, the individual error rate for each test is ______.
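
The Bonferroni correction in question 12 divides the significance level by the number of tests. The sketch below assumes a conventional level of $\alpha = 0.05$, which the question itself does not state, and the function names are illustrative. It also shows how the familywise error rate grows without a correction, under the additional assumption that the tests are independent.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance level after a Bonferroni correction."""
    return alpha / n_tests


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


ALPHA = 0.05   # assumed significance level (not given in the question)
N_TESTS = 5    # five tests, as in the question

per_test = bonferroni_alpha(ALPHA, N_TESTS)
uncorrected_fwer = familywise_error_rate(ALPHA, N_TESTS)
corrected_fwer = familywise_error_rate(per_test, N_TESTS)

print(f"per-test alpha after correction: {per_test:.3f}")
print(f"familywise error rate, uncorrected: {uncorrected_fwer:.3f}")
print(f"familywise error rate, corrected: {corrected_fwer:.3f}")
```

With the correction, the familywise error rate stays just below the assumed 0.05 instead of rising to roughly 0.23.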

 
 
 
 

13. Going through a dataset and looking at which effects are present can be problematic when ______. It is NOT problematic when you ______.

 
 
 
 

14. The difference between eta-squared and partial eta-squared is ______; the difference between eta-squared and omega-squared is ______.

 
 
 
 

15. You replicate an older study, which reported both credible intervals and confidence intervals. You also calculate both. Which statement is correct?

 
 
 
 

16. You performed a p-curve analysis and found a skewed distribution of p-values that peaks around $p = 0.045$. What does this mean?

 
 
 
 

17. If I wanted to use a chi-square test to check whether there is an association between gender (male or female) and tenure status (tenured or not tenured), what would be my degrees of freedom?
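
For a chi-square test of association on an $r \times c$ contingency table, the degrees of freedom are $(r - 1)(c - 1)$. A minimal sketch of that rule applied to the table in question 17 (the function name is illustrative):

```python
def chi_square_df(n_rows: int, n_cols: int) -> int:
    """Degrees of freedom for a chi-square test of association."""
    if n_rows < 2 or n_cols < 2:
        raise ValueError("each variable needs at least two categories")
    return (n_rows - 1) * (n_cols - 1)


# Gender has 2 categories and tenure status has 2 categories.
df = chi_square_df(2, 2)
print(df)  # 1
```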

 
 
 
 

18. An experiment has been performed with a factor having two levels. There are 10 observations at each level. The following summary statistics result:
$\overline{y_1} = 10.5, S_1=2, \overline{y_2}=12.4, S_2=1.6$
You conduct a test of the hypothesis that the two means are equal. Assume that the alternative hypothesis is two-sided and that the population variances are equal. The P-value is:
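
One way to work through question 18 is the pooled-variance two-sample t-test. The sketch below uses only the Python standard library: it computes the t statistic and degrees of freedom from the summary statistics, then approximates the two-sided p-value by numerically integrating the t density with Simpson's rule. The function names and the choice of numerical integration are assumptions made for illustration; a statistics library would normally supply the t distribution directly.

```python
import math


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic and degrees of freedom, assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """Two-sided p-value: 1 minus the area between -|t| and |t| (Simpson's rule)."""
    a = abs(t_stat)
    h = 2 * a / steps  # steps must be even for Simpson's rule
    total = t_pdf(-a, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(-a + i * h, df)
    return 1 - total * h / 3


# Summary statistics from question 18.
t_stat, df = pooled_t_statistic(10.5, 2.0, 10, 12.4, 1.6, 10)
p_value = two_sided_p_value(t_stat, df)
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.4f}")
```

The statistic comes out at about $-2.35$ on 18 degrees of freedom, so the two-sided p-value falls between 0.02 and 0.05.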

 
 
 
 

19. In studies with fewer observations, parameters like effect sizes vary ______; the power to detect the effect size in the population depends, among other things, on ______.

 

 
 
 
 

20. An example of an unstandardized effect size is ______; unstandardized effect sizes ______.

 
 
 
 


Probability Distribution Quiz 8

This post is about the MCQs Probability Distributions Quiz. There are 20 multiple-choice questions covering discrete and continuous probability distributions, such as the Binomial, Bernoulli, Poisson, Geometric, Hypergeometric, Chi-Square, Normal, and F distributions. Let us start with the MCQs Probability Distributions Quiz.

Online Probability Distribution Quiz

  • You find a z-score of -1.99. Which statement(s) is/are true?
  • Expected values are properties of what?
  • If you got a 75 on a test in a class with a mean score of 85 and a standard deviation of 5, the z-score of your test score would be
  • The spread of the normal curve depends upon the value of:
  • Which of the following can best be described as a normal distribution?
  • In its standardized form, the normal distribution
  • A test is administered annually. The test has a mean score of 150 and a standard deviation of 20. If Chioma’s z-score is 1.50, what was her score on the test?
  • The P-value for a normally distributed right-tailed test is P=0.042. Which of the following is INCORRECT?
  • The time X taken by a cashier in a grocery store express lane to complete a transaction follows a normal distribution with a mean of 90 seconds and a standard deviation of 20 seconds. What is the first quartile of the distribution of X (in seconds)?
  • Green sea turtles have normally distributed weights, measured in kilograms, with a mean of 134.5 and a variance of 49.0. A particular green sea turtle’s weight has a z-score of -2.4. What is the weight of this green sea turtle? Round to the nearest whole number.  
  • We look for a model, as realistic as possible, for a continuous random variable $X$ that represents the lifetime of a machine, and whose mean and variance are equal to 1 and 3, respectively. Which of the following distributions can be acceptable?
    Uniform
    Exponential
    Gamma
    Gaussian
    The square of a Gaussian N(1, 3)
  • The distribution function of the random variable $X$ is given by $F_X(x)=1-\frac{1}{x^2}$ for $x \ge c$, 0 otherwise, where $c$ is a constant. What is the set of possible values of the constant $c$?
  • A random variable $Y$ has the following distribution: $P(Y=-1)=3C$, $P(Y=0)=2C$, $P(Y=1)=0.4$, $P(Y=2)=0.1$. The value of the constant $C$ is
  • If $Z$ has a standard normal distribution, $U$ has a chi-square distribution with $k$ degrees of freedom, and $Z$ and $U$ are independent, then the distribution of $X=\frac{Z}{\sqrt{U/k}}$ is
  • If $X$ is an F-distributed random variable with $m$ and $n$ df, then $W=\frac{mX/n}{1+mX/n}$ has a
  • The number of parameters in a multivariate normal distribution with $p$ variables is
  • The moment generating function of the Gamma distribution with parameters $\lambda$ and $k$ is
  • The moment generating function of the normal distribution is
  • The distribution that arises when an experiment is repeated a variable number of times to obtain a fixed number of successes is
  • If the mean of the Chi-Square distribution is 4 then its variance is

Classification in Data Mining

This post is about Classification in Data Mining. It is written as questions and answers for ease of understanding and learning classification techniques and their real-life applications.

What is Classification in Data Mining? Explain with Examples.

Classification in data mining is a supervised learning technique used to categorize data into predefined classes or labels based on input feature data. The classification technique is widely used in various applications, such as spam detection, image recognition, sentiment analysis, and medical diagnosis.

The following are some real-life examples that make use of classification algorithms:

  • A bank loan officer may need to analyze the data to know which loan applicants are risky and which are safe.
  • A marketing manager may need to predict whether a customer with a given profile will buy a new product.
  • Banks and financial institutions use classification algorithms to identify potentially fraudulent transactions by classifying them as “Fraudulent” or “Legitimate” transactions based on transaction patterns.
  • Mobile apps and digital assistants use classification algorithms to convert handwritten text into digital format by identifying and classifying individual characters or words.
  • News channels and companies use classification algorithms to categorize their articles into different sections (such as Sports, Politics, Business, Technology, etc.) based on the content of the articles.
  • Businesses analyze customer reviews, feedback, and social media posts to classify sentiments as “Positive,” “Negative,” or “Neutral,” helping them gauge public perception about their products or services.

What is the Goal of Classification?

Classification aims to develop a model that can accurately predict the class of unseen instances based on patterns learned from a training dataset.

Write about the Key Components of Classification.

Key components of classification in Data Mining are:

  1. Training Data: A dataset where the class labels are known, which will be used to train the classification model.
  2. Model: An algorithm (such as decision trees, neural networks, support vector machines, etc.) that learns to distinguish between different classes based on the training data.
  3. Features: The input variables or attributes that are used to make predictions about the class labels.
  4. Prediction: Once a model is trained, the model can classify new, unseen instances by assigning them to one of the predefined classes.
  5. Evaluation: The performance of the classification model can be assessed using metrics like accuracy, precision, F1 score, recall, and confusion matrix.
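
The evaluation metrics named in the last item can all be derived from the four counts of a binary confusion matrix. The sketch below is a minimal illustration in plain Python; the spam/ham labels and predictions are made-up data, and the function names are hypothetical rather than taken from any library.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only.
y_true = ["spam", "spam", "ham", "ham", "spam", "ham", "ham", "spam"]
y_pred = ["spam", "ham", "ham", "spam", "spam", "ham", "ham", "spam"]

counts = confusion_counts(y_true, y_pred, positive="spam")
metrics = classification_metrics(*counts)
print(counts)   # (3, 1, 1, 3)
print(metrics)
```

Precision asks how many predicted positives were correct, while recall asks how many actual positives were found; the F1 score is their harmonic mean.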

Why is Classification Needed?

In today’s world of Big Data, large datasets are becoming the norm. For example, imagine a database holding many terabytes of data; Facebook alone crunches 4 petabytes of data every single day. The primary challenge of big data is how to make sense of it, and sheer volume is not the only problem: big data also tends to be diverse, unstructured, and fast-changing.

Similarly, consider audio and video data, social media posts, 3D data, or geospatial data. These kinds of data are not easy to categorize or organize. Classification helps by assigning such data to predefined classes so that it can be analyzed.

Name the Methods of Classification

The following are some popular classification methods.

  • Statistical procedure based approach
  • Machine Learning based approach
  • Neural network
  • Classification algorithms
  • ID3 algorithm
  • C4.5 algorithm
  • Nearest neighbour algorithm
  • Naive Bayes algorithm
  • SVM algorithm
  • ANN algorithm
  • Decision trees
  • Support vector machine
  • Sense Clusters (an adaptation of the K-means clustering algorithm)

Explain ID3 Algorithm

The ID3 (Iterative Dichotomiser 3) algorithm is a decision tree learning algorithm, primarily used for classification tasks in data mining and machine learning.

What are the Key Features of ID3 Classification?

  • Categorical Attributes: The ID3 algorithm is designed to work primarily with categorical attributes. It does not handle continuous attributes directly, but they can be converted into categorical ones through binning.
  • Information Gain: The algorithm uses information gain as the criterion to select the attribute that best separates the data into different classes. Information gain measures the reduction in entropy (uncertainty) after a dataset is split on a specific attribute.
  • Recursive Tree Building: The ID3 algorithm builds the decision tree recursively, splitting the data into subsets based on attribute values.
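
The information gain criterion described above can be sketched in a few lines of Python. This is a minimal illustration of the attribute-selection step only, not a full ID3 implementation; the toy weather-style dataset and the function names are invented for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Reduction in entropy after splitting the data on one categorical attribute."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Pick the attribute with the highest information gain, as ID3 does at each node."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Made-up toy data: "outlook" separates the classes perfectly, "windy" not at all.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

gain_outlook = information_gain(rows, labels, "outlook")
gain_windy = information_gain(rows, labels, "windy")
chosen = best_attribute(rows, labels, ["outlook", "windy"])
print(gain_outlook, gain_windy, chosen)
```

A full ID3 tree would repeat this selection recursively on each subset until the labels in a subset are pure or no attributes remain.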
