Inferential Statistics Terminology

This post is about Inferential Statistics (or statistical inference) and some of its related terminology. It is the field of statistics that allows us to understand and make predictions about the world around us.

Parameter and Statistic

Any measurable characteristic of a population is called a parameter. For example, the mean of a population is a parameter. Put another way, parameters are numerical values that describe the characteristics of a whole population. They are commonly denoted by Greek letters, such as $\mu$ and $\sigma$.

Any measurable characteristic of a sample is called a statistic. For example, the mean of a sample is a statistic. Put another way, statistics are numerical measures that describe the characteristics of a sample. They are commonly denoted by Roman letters, such as $\overline{x}$ and $s$.
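The difference between a parameter and a statistic can be sketched with a small example. The population below (the integers 1 to 1000), the sample size of 50, and the seed are all assumptions made purely for illustration.

```python
import random
from statistics import mean

# Hypothetical finite population: the integers 1..1000 (illustrative data only).
population = list(range(1, 1001))

# Parameter: computed from every member of the population.
mu = mean(population)

# Statistic: computed from a random sample of 50 members.
rng = random.Random(42)              # fixed seed so the run is repeatable
sample = rng.sample(population, 50)
x_bar = mean(sample)

print("parameter (population mean):", mu)
print("statistic (sample mean):    ", x_bar)
```

The parameter is a single fixed number, while the statistic changes whenever a different sample is drawn.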

Population and Sample

Population: The entire group of individuals, objects, or data points that one is interested in studying. A population under study can be finite or infinite. However, it is often too large or impractical to study directly.

Sample: A smaller, representative subset of the population. It is used to gain insights about the population without having to study every member. A sample should accurately reflect the characteristics of the population.  

Inference

Inference is the process of drawing conclusions about a population based on the information contained in a sample taken from that population.

Estimator

An estimator is a rule (method, formula) that tells how to calculate the value of an estimate based on the measurements contained in a sample. The sample mean is one possible estimator of the population mean $\mu$.

An estimator is considered good when its sampling distribution is concentrated near the true value of the parameter.
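A rough simulation can illustrate what "concentrated near the parameter" means. The sketch below assumes a Uniform(0, 1) population (true mean 0.5) and compares how widely the sample mean varies for two sample sizes. The helper name `spread_of_sample_mean`, the sample sizes, and the seed are illustrative choices.

```python
import random
from statistics import mean, pstdev

rng = random.Random(0)  # fixed seed so the run is repeatable

def spread_of_sample_mean(n, reps=1000):
    """Standard deviation of the sample mean over `reps` simulated samples of size n."""
    # Assumed population: Uniform(0, 1), whose true mean is 0.5.
    means = [mean([rng.random() for _ in range(n)]) for _ in range(reps)]
    return pstdev(means)

spread_small = spread_of_sample_mean(5)
spread_large = spread_of_sample_mean(100)

print("spread of the sample mean, n = 5:  ", spread_small)
print("spread of the sample mean, n = 100:", spread_large)
```

The larger sample gives a sample mean whose values cluster more tightly around 0.5, which is the sense in which it is the better-behaved estimator here.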

Estimate

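An estimate is the numerical value obtained by applying an estimator to the observations in a particular sample. There are many ways to estimate a parameter, and an estimate may be biased or crude, so it is only an approximation of the true value. Decisions based on an estimate are more accurate the closer the estimate is to the true value.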

$X_1, X_2, \cdots, X_n$ is a sample and $\overline{X}$ is an estimator. $x_1, x_2, \cdots, x_n$ are the sample observations and $\overline{x}=\frac{\Sigma x_i}{n}$ is an estimate.
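In code, the distinction can be sketched as a function versus its return value: the estimator is the rule, and the estimate is the number that rule produces for one observed sample. The function name `sample_mean` and the five observations are made up for illustration.

```python
from statistics import mean

def sample_mean(observations):
    """Estimator: a rule that maps any sample to a single number."""
    return mean(observations)

# Hypothetical observed sample x_1, ..., x_5 (made-up values).
x = [4, 8, 6, 5, 7]

# Estimate: the value the rule gives for this particular sample.
estimate = sample_mean(x)
print(estimate)  # 6
```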

Estimation

Estimation is the process of finding an estimate or approximation, which is a value that is usable for some purpose even if input data may be incomplete, uncertain, or unstable.

Statistical Inference (or Inferential Statistics)

Any process (art) of drawing inferences (conclusions) about the population based on limited information contained in a sample taken from the same population is called statistical inference (or inferential statistics). It is difficult to draw an inference about the population because studying the entire universe (population) is rarely simple. To get some idea about the characteristics (parameters) of the population, we choose, by some appropriate method, a part of reasonable size, generally referred to as a sample.

Statistical inference is a powerful set of tools used to draw conclusions about a population based on data collected from a sample of that population. It allows us to make informed decisions and predictions about the larger group even when we have not examined every single member.

Why Estimate?

  • Speed: Often, an estimate is faster to get than an exact calculation.
  • Simplicity: It can simplify complex problems.
  • Decision-Making: Estimates help one to make choices when one does not have all the details.
  • Checking: One can use estimates to check if a more precise answer is reasonable.

Why is Statistical Inference Important?

  • Decision-making: It helps us make informed decisions in various fields, such as medicine, business, and social sciences.
  • Research: It is crucial for conducting research and drawing meaningful conclusions from data.
  • Understanding the World: It allows us to understand and make predictions about the world around us.

Solved Probability Questions with Answers

This post is about some solved probability questions. These questions make use of (i) the Addition Law of Probabilities, and (ii) the Multiplication Law of Probabilities.

Solved Probability Questions

Question 1: Box A contains 5 Green and 7 Red balls. Box B contains 3 Green, 3 Red, and 6 Yellow balls. A box is selected at random, and a ball is drawn at random from it. What is the probability that the ball drawn is green?

Solution:

Box A

Total Balls: 5 + 7 = 12
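P(Green | Box A) = $\frac{5}{12}$

Box B

Total Balls: 3 + 3 + 6 = 12
P(Green | Box B) = $\frac{3}{12} = \frac{1}{4}$

Each box is selected with probability $\frac{1}{2}$. Applying the Multiplication Law within each box and the Addition Law across the two boxes gives

$$P(Green) = \frac{1}{2}\cdot\frac{5}{12} + \frac{1}{2}\cdot\frac{3}{12} = \frac{8}{24} = \frac{1}{3}$$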
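As a check on Question 1, here is a minimal sketch that weights each box's chance of yielding a green ball by the probability $\frac{1}{2}$ of selecting that box. The variable names are illustrative; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

p_box = Fraction(1, 2)               # each box is equally likely to be selected
p_green_given_a = Fraction(5, 12)    # Box A: 5 green out of 12 balls
p_green_given_b = Fraction(3, 12)    # Box B: 3 green out of 12 balls

# Multiply within each box, then add across the two mutually exclusive boxes.
p_green = p_box * p_green_given_a + p_box * p_green_given_b
print(p_green)  # 1/3
```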

Question 2: A pair of fair dice is thrown. What is the probability of getting a total of 5 or 11?

Solution:

\begin{align*}
P(X = 11\,\,or\,\,X = 5) &= P(X=11) + P(X=5) - P(X=11\,\,and\,\,X=5)\\
P(X=11) &= \frac{2}{36}\\
P(X=5) &= \frac{4}{36}=\frac{1}{9}\\
P(X=11\,\,and\,\,X=5) &= 0
\end{align*}

Therefore,

\begin{align*}
P(X=11\,\,or\,\,X=5) &= P(X=11) + P(X=5) \\
&=\frac{2}{36} + \frac{4}{36} = \frac{6}{36} = \frac{1}{6}
\end{align*}

Note that $P(X=11\,\,and\,\,X=5) = 0$, because the sum of two dice cannot be both 5 and 11 at the same time.
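The result for Question 2 can also be checked by brute force. This sketch assumes a single throw of two fair dice and lists all 36 equally likely outcomes.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
outcomes = list(product(range(1, 7), repeat=2))

# Outcomes whose total is 5 or 11; the two cases cannot overlap.
favourable = [o for o in outcomes if sum(o) in (5, 11)]

p_five_or_eleven = Fraction(len(favourable), len(outcomes))
print(p_five_or_eleven)  # 1/6
```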

Question 3: A marble is drawn at random from a box containing 10 red, 30 white, 20 blue, and 15 orange marbles. What is the probability that it is (i) orange or red, (ii) not red or blue, (iii) not blue, (iv) white, (v) red, white, or blue?

Solution:

Total number of marbles = 10 + 30 + 20 + 15 = 75
Number of Orange marbles = 15
Number of Blue marbles = 20
Number of White marbles = 30
Number of Red marbles = 10

  1. P(a marble drawn is red or orange) = P(Red marble) + P(Orange marble)
    $$=\frac{10}{75} + \frac{15}{75} = \frac{25}{75} = \frac{1}{3}$$
  2. P(a marble drawn is not red or blue) = P(not Red) + P(Blue) - P(Blue and not Red)
    $$=\frac{65}{75} + \frac{20}{75} - \frac{20}{75} = \frac{65}{75} = \frac{13}{15}$$
  3. P(a marble drawn is not Blue) = $1 - P(Blue) = 1 - \frac{20}{75} = \frac{55}{75} = 0.733$
  4. P(a marble drawn is white) = $\frac{30}{75} = \frac{2}{5}$
  5. P(a marble drawn is Red, White, or Blue) = P(Red) + P(White) + P(Blue)
    $$=\frac{10}{75} + \frac{30}{75} + \frac{20}{75} = \frac{60}{75} = \frac{4}{5}$$
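The five parts of Question 3 can be checked by counting marbles directly. The helper `prob` below is a hypothetical convenience function, and part (ii) is read as "(not red) or blue", matching the solution's use of P(not Red) and P(Blue).

```python
from fractions import Fraction

counts = {"red": 10, "white": 30, "blue": 20, "orange": 15}
total = sum(counts.values())  # 75 marbles

def prob(event):
    """Probability that the drawn marble's colour satisfies `event`."""
    favourable = sum(n for colour, n in counts.items() if event(colour))
    return Fraction(favourable, total)

p_i = prob(lambda c: c in ("orange", "red"))          # 1/3
p_ii = prob(lambda c: c != "red" or c == "blue")      # 13/15
p_iii = prob(lambda c: c != "blue")                   # 11/15
p_iv = prob(lambda c: c == "white")                   # 2/5
p_v = prob(lambda c: c in ("red", "white", "blue"))   # 4/5

print(p_i, p_ii, p_iii, p_iv, p_v)
```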

Question 4: If two dice are thrown, what are the various total numbers of dots that may turn up? What is the probability of each of them? What is the probability that the number of dots will total at least four?

Solution:

When two dice are thrown together, the minimum total number of dots is 2 (1, 1), and the maximum is 12 (6, 6). Each of the 36 ordered outcomes is equally likely. Therefore

  • Probability of 2 dots {(1,1)} = $\frac{1}{36}$
  • Probability of 3 dots {(2,1) (1,2)} = $\frac{2}{36} = \frac{1}{18}$
  • Probability of 4 dots {(2,2) (3,1) (1,3)} = $\frac{3}{36} = \frac{1}{12}$
  • Probability of 5 dots {(4,1) (1,4) (2,3) (3,2)} = $\frac{4}{36} = \frac{1}{9}$
  • Probability of 6 dots {(3,3) (4,2) (2,4) (5,1) (1,5)} = $\frac{5}{36}$
  • Probability of 7 dots {(4,3) (3,4) (5,2) (2,5) (6,1) (1,6)} = $\frac{6}{36} = \frac{1}{6}$
  • Probability of 8 dots {(6,2) (2,6) (5,3) (3,5) (4,4)} = $\frac{5}{36}$
  • Probability of 9 dots {(5,4) (4,5) (6,3) (3,6)} = $\frac{4}{36} = \frac{1}{9}$
  • Probability of 10 dots {(5,5) (6,4) (4,6)} = $\frac{3}{36} = \frac{1}{12}$
  • Probability of 11 dots {(5,6) (6,5)} = $\frac{2}{36} = \frac{1}{18}$
  • Probability of 12 dots {(6,6)} = $\frac{1}{36}$
  • Probability that the number of dots will total at least 4 = $1 - \frac{1}{36} - \frac{2}{36} = \frac{33}{36} = \frac{11}{12}$
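The whole table for Question 4 can be rebuilt by enumeration. This is a small sketch that tallies the totals over the 36 equally likely outcomes; the names `dist` and `p_at_least_4` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 ordered outcomes give each total.
totals = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Probability of each total, as an exact fraction.
dist = {t: Fraction(count, 36) for t, count in sorted(totals.items())}

for t, p in dist.items():
    print(t, p)

p_at_least_4 = sum(p for t, p in dist.items() if t >= 4)
print("at least 4:", p_at_least_4)  # 11/12
```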

Question 5: One card is selected at random from a deck of 52 playing cards. What is the probability that the card is a club or a face card or both?

Solution:

\begin{align*}
P(club\,\, or\,\, face\,\, or\,\, both) &= P(club) + P(face) - P(club\,\, and\,\, face)\\
&=\frac{13}{52} + \frac{12}{52} - \frac{3}{52} = \frac{22}{52} = \frac{11}{26}
\end{align*}
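The count of 22 favourable cards in Question 5 can be confirmed by building the deck. In this sketch the rank and suit labels are arbitrary strings, and face cards are taken to be the jack, queen, and king of each suit.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
face_ranks = {"J", "Q", "K"}

deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

# A card qualifies if it is a club, a face card, or both (counted once).
club_or_face = [(r, s) for r, s in deck if s == "clubs" or r in face_ranks]

p_club_or_face = Fraction(len(club_or_face), len(deck))
print(p_club_or_face)  # 11/26
```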

Question 6: A class contains 10 men and 20 women, of whom half of the men and half of the women have brown eyes. What is the probability that a person chosen at random is a man or has brown eyes?

Solution:

Let $A$ be the event that it is a man (10 out of 30)
Let $B$ be the event that the person has brown eyes (5 men and 10 women: 15 out of 30)

$P(A\cap B)$ is the probability that the person is a man AND has brown eyes: $\frac{5}{30}$

\begin{align*}
P(A \cup B) &= P(A) + P(B) - P(A \cap B)\\
&= \frac{10}{30} + \frac{15}{30} - \frac{5}{30} = \frac{20}{30} = \frac{2}{3}
\end{align*}
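The Addition Law used in Question 6 mirrors the counting identity for sets, $|A \cup B| = |A| + |B| - |A \cap B|$. The sketch below labels the 30 people with hypothetical ID numbers (0 to 9 for men, 10 to 29 for women) just to make the sets concrete.

```python
from fractions import Fraction

# Hypothetical IDs: 0-9 are the men, 10-29 are the women.
men = set(range(0, 10))
everyone = set(range(0, 30))

# Half of the men (IDs 0-4) and half of the women (IDs 10-19) have brown eyes.
brown_eyes = set(range(0, 5)) | set(range(10, 20))

# Counting the union directly agrees with the inclusion-exclusion count.
union_size = len(men | brown_eyes)
by_formula = len(men) + len(brown_eyes) - len(men & brown_eyes)

p_man_or_brown = Fraction(union_size, len(everyone))
print(union_size, by_formula, p_man_or_brown)  # 20 20 2/3
```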

Question 7: A drawer contains 50 bolts and 150 nuts. Half of the bolts and half of the nuts are rusted. If one item is chosen at random, what is the probability that it is rusted or is a bolt?

Solution:

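Number of Bolts = 50
Number of Nuts = 150
Total number of Items = 50 + 150 = 200

Item chosen is rusted: $P(A) = \frac{100}{200} = \frac{1}{2}$
Item chosen is a bolt: $P(B) = \frac{50}{200} = \frac{1}{4}$
Item is rusted and a bolt: $P(A\cap B) = P(A) \cdot P(B) = \frac{1}{2}\cdot \frac{1}{4} = \frac{1}{8}$. The product rule applies because the same fraction of bolts and of nuts is rusted, so the two events are independent. Counting directly gives the same value: 25 of the 200 items are rusted bolts.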

\begin{align*}
P(A \cup B) &= P(A) + P(B) - P(A\cap B) \\
&= \frac{1}{2} + \frac{1}{4} - \frac{1}{8} = \frac{5}{8}
\end{align*}
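Question 7 relies on the Multiplication Law for independent events, which can be verified from the raw counts. In the sketch below the labels "rusted" and "clean" and the dictionary layout are illustrative choices.

```python
from fractions import Fraction

# Item counts by (kind, condition): half of each kind is rusted.
counts = {
    ("bolt", "rusted"): 25,
    ("bolt", "clean"): 25,
    ("nut", "rusted"): 75,
    ("nut", "clean"): 75,
}
total = sum(counts.values())  # 200 items

p_rusted = Fraction(sum(n for (kind, cond), n in counts.items() if cond == "rusted"), total)
p_bolt = Fraction(sum(n for (kind, cond), n in counts.items() if kind == "bolt"), total)
p_both = Fraction(counts[("bolt", "rusted")], total)

# Independence check: the joint probability equals the product of the marginals.
independent = (p_both == p_rusted * p_bolt)

p_rusted_or_bolt = p_rusted + p_bolt - p_both
print(p_rusted, p_bolt, p_both, independent, p_rusted_or_bolt)  # 1/2 1/4 1/8 True 5/8
```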


MCQs Basic Statistics Quiz 18

This post is about the MCQs Basic Statistics Quiz with Answers. There are 20 multiple-choice questions about the Basics of Statistics, covering measures of central tendency (Mean, Median, Mode, Geometric Mean, and Harmonic Mean), Measures of Dispersion, Deviations, Relationships between different measures of central tendency, Coding Methods for computing Mean, etc. Let us start with the MCQs Basic Statistics Quiz.

Online MCQs about Basic Statistics

1. The mean of the first $n$ natural numbers is

2. One type of average is

3. The appropriate average for calculating the average percentage increase in population is

4. If 4 is added to all observations in the data then the mean increases by

5. The sum of absolute deviations of the values is least when deviations are taken from

6. One type of average is

7. The arithmetic mean is affected by

8. If the mean is greater than the mode, the distribution is

9. The coding method is used for calculating

10. The relation between AM, GM, and HM is

11. If $\Sigma (x - 12) = 0$ then $\overline{x}=$

12. The Geometric Mean of $a$ and $b$ is

13. The most central value of an array is called

14. In the case of an Open-end frequency table, the average cannot be computed accurately

15. The coding method is used for calculating

16. The value obtained by dividing the sum of the values by their number is called

17. The arithmetic mean of 5, 9, 12, 15 is

18. The sum of squares of deviations of the values is least when deviations are taken from

19. The arithmetic mean of 112, 120, 135, 150, 157 is

20. The sum of deviations of the values from their means is


MCQs Discrete Probability Distributions 7

This post is about MCQs Discrete Probability Distributions. There are 20 multiple-choice questions about discrete probability distributions covering distributions such as Binomial Probability Distribution, Bernoulli Probability Distribution, Poisson Probability Distribution, Geometric Probability Distribution, and Hypergeometric Probability Distribution. Let us start with the MCQs Discrete Probability Distributions Quiz.

MCQs Discrete Probability Distributions

  • For a binomial distribution which of the following is true
  • The number of possible outcomes in a Bernoulli trial is
  • The mean and mode of the Binomial distribution are equal if
  • The hypergeometric random variable is a
  • The parameters of hypergeometric distribution are
  • The probability of success changes from trial to trial in
  • The probability of success does not change from trial to trial in
  • The successive trials are without replacement in
  • Which of the following could never be described by the Binomial distribution?
  • If $X$ is the number of trials for the negative binomial distribution with parameters $p$ and $k$ then its minimum value is
  • For a given binomial distribution with $n$ fixed if $p=0.5$ then
  • The necessary and sufficient condition of the hypergeometric distribution is
  • Which of the following is the most reasonable condition for the binomial approximation to the hypergeometric distribution?
  • Suppose, we have a Poisson distribution with $\lambda$ equal to 2 then the probability of having exactly 10 occurrences is
  • Which of the following is a characteristic of the probability distribution for any random variable
  • In what case would the Poisson distribution be a good approximation of the binomial distribution
  • The mode of the geometric distribution is
  • The binomial distribution may be approximated by a Poisson distribution if
  • In a Binomial distribution, if $n$ is the number of trials and $p$ is the probability of success, then the mean value is given by
  • In a binomial probability distribution, the sum of the probability of failure and the probability of success is always