How to Perform Paired Samples t test in SPSS

In this post, we will learn how to perform a paired samples t-test in SPSS. The paired samples t-test is a statistical hypothesis testing procedure used to determine whether the mean of the differences between two sets of paired observations is zero. In a paired samples t-test (also known as a dependent samples t-test), each observation in one set is paired with the corresponding observation in the other set, so the means of two related groups are compared. By related, we mean that the observations in the two groups are paired or matched in some way.

Points to Remember

The following are points that need to be remembered:

A paired samples t-test can be used when two measurements are taken from the same individuals/objects/respondents or from related units. The paired measurements can be:

  • Before and After Comparisons: The same units are measured before and after an intervention, such as measuring blood pressure before and after taking medication.
  • Matched Pairs: Each unit in one group is matched with a related unit in the other group, such as comparing the test scores of twins or blood relations.
  • Repeated Measures: The same unit is measured at two different points in time, such as a person’s happiness level.

A paired samples t-test is also known as a dependent samples t-test, matched pairs t-test, or repeated measures t-test.

When a Paired Samples t-test Cannot be Used

Note that a paired samples t-test can only be used to compare the means of two related (paired) units having a continuous outcome that is normally distributed. This test is not appropriate when

  • The data are unpaired
  • There are more than two units/groups
  • The continuous outcome is not normally distributed
  • The outcome is ordinal or ranked

Hypothesis for Paired Samples t test

The hypotheses for a paired/ dependent samples t-test can be stated as

$H_0:\mu_d = 0$ (the mean of the paired differences is zero)
$H_1: \mu_d \ne 0$ (the mean of the paired differences is not zero; two-tailed test)
$H_1: \mu_d < 0$ (lower-tailed test)
$H_1: \mu_d > 0$ (upper-tailed test)

The test statistic for a paired samples t-test is as follows

$$t=\frac{\overline{d} }{\frac{s_d}{\sqrt{n}} }$$

where

  • $\overline{d}$ is the sample mean of the differences
  • $n$ is the sample size (the number of pairs), so the test has $n-1$ degrees of freedom
  • $s_d$ is the sample standard deviation of the differences

Performing Paired Samples t test in SPSS

To run a paired samples t-test in SPSS, click Analyze > Compare Means > Paired-Samples T Test.

Paired Samples t-test in SPSS Analysis Procedure

In the Paired Samples t-test dialog box, the user needs to specify the variables to be used in the analysis. The variables need to be moved from the left side to the Paired Variables box. The blue button between the two boxes may be used to shift the variables from left to right or from right to left. Note that the variables you specify in the Paired Variables pane need to be entered in pairs.

Paired Samples t-test in SPSS Dialog box

In the above dialog box, the following are important points to follow:

  • Pair: The pair row (in the right-side pane) represents the number of paired samples t-tests to run. More than one paired samples t-test may be run simultaneously by selecting multiple sets of matched variables.
  • Variable 1: The first variable represents the first matched group (such as the before situation).
  • Variable 2: The second variable represents the second matched group (such as the after situation).
  • Options: The Options button can be used to specify the confidence interval percentage and how the analysis will deal with missing values.
Paired Samples t-test in SPSS Options

Note that setting the confidence interval percentage does not have any impact on the calculation of the p-value.

Paired Samples t test Data Example

Consider the following example about the academic performance of 10 students who took an examination before and after a particular teaching methodology.

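| Student Number | Marks Before Teaching Methodology | Marks After Teaching Methodology |
|---|---|---|
| 1 | 18 | 22 |
| 2 | 21 | 25 |
| 3 | 16 | 17 |
| 4 | 22 | 24 |
| 5 | 19 | 16 |
| 6 | 24 | 29 |
| 7 | 17 | 20 |
| 8 | 21 | 23 |
| 9 | 23 | 19 |
| 10 | 18 | 20 |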
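
The test statistic defined earlier can be checked by hand on these marks. The following is a minimal Python sketch (not part of the SPSS procedure), assuming the differences are taken as before minus after; the variable names are illustrative only.

```python
import math
import statistics

# Marks of the 10 students from the example table.
before = [18, 21, 16, 22, 19, 24, 17, 21, 23, 18]
after = [22, 25, 17, 24, 16, 29, 20, 23, 19, 20]

# Paired differences, taken as before minus after (an assumed sign convention).
diffs = [b - a for b, a in zip(before, after)]

n = len(diffs)                          # number of pairs
d_bar = statistics.mean(diffs)          # sample mean of the differences
s_d = statistics.stdev(diffs)           # sample standard deviation (divisor n - 1)
t_stat = d_bar / (s_d / math.sqrt(n))   # t = d_bar / (s_d / sqrt(n))
df = n - 1                              # degrees of freedom

print(f"mean difference = {d_bar:.3f}, s_d = {s_d:.3f}")
print(f"t({df}) = {t_stat:.3f}")
```

With the differences taken in this direction, the t-value comes out negative because the average mark after the teaching methodology is higher than the average before it.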

Testing the Assumptions of Paired Samples t-test

Before performing the paired samples t-test, it is better to check its assumptions (or requirements).

  • The dependent variable should be continuous (that is, measured on an interval or ratio level).
  • The dependent observations (related samples) should come from the same subjects/objects, that is, the subjects in the first group are also in the second group.
  • The sampled data should be drawn randomly from the respective population.
  • The differences between the paired values should follow a normal (or approximately normal) distribution.
  • There should be no outliers in the differences between the two related groups.

Note that when testing the assumptions (such as normality, and outliers detection) related to paired samples t-test, one must use a variable that represents the differences between the paired values, not the original variables themselves.
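
As a rough illustration of working with the difference variable, the sketch below builds the differences for the example marks and screens them for outliers. It assumes the common 1.5 × IQR rule of thumb and Python's default quartile method, which may differ slightly from the quartiles SPSS reports.

```python
import statistics

before = [18, 21, 16, 22, 19, 24, 17, 21, 23, 18]
after = [22, 25, 17, 24, 16, 29, 20, 23, 19, 20]

# The assumptions are checked on the differences, not on the original variables.
diffs = [b - a for b, a in zip(before, after)]

# Quartiles of the differences (Python's default "exclusive" method).
q1, _, q3 = statistics.quantiles(diffs, n=4)
iqr = q3 - q1

# 1.5 x IQR fences: an assumed rule of thumb for flagging outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [d for d in diffs if d < lower_fence or d > upper_fence]

print("differences:", diffs)
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```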

Also note that when one or more assumptions for a paired samples t-test are not met, you may run the non-parametric test, Wilcoxon Signed Ranks Test.

Paired Samples t-test in SPSS for analysis

Output: Paired Samples T test

SPSS produces four tables:

  1. Paired Samples Statistics
    The Paired Samples Statistics table gives univariate descriptive statistics (such as mean, sample size, standard deviation, and standard error) for each variable entered as paired variables.
  2. Paired Samples Correlations
    The Paired Samples Correlations table gives the bivariate Pearson correlation coefficient for each pair of variables entered.
  3. Paired Samples Test
    The Paired Samples Test table gives the hypothesis test results with the p-value and the confidence interval of the difference.
  4. Paired Samples Effect Sizes
    The Paired Samples Effect Sizes table gives Cohen’s d and Hedges’ correction values with confidence intervals.
Paired Samples t-test Output in SPSS
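
To give a feel for the effect size in the fourth table, here is a minimal Python sketch of Cohen's d for the example marks. It assumes the standardizer is the sample standard deviation of the paired differences, which is one common convention; it is not claimed to reproduce the exact SPSS output, and Hedges' correction is not computed.

```python
import statistics

before = [18, 21, 16, 22, 19, 24, 17, 21, 23, 18]
after = [22, 25, 17, 24, 16, 29, 20, 23, 19, 20]
diffs = [b - a for b, a in zip(before, after)]   # before minus after (assumed)

# Cohen's d for paired data under the assumed convention:
# mean of the differences divided by their sample standard deviation.
cohens_d = statistics.mean(diffs) / statistics.stdev(diffs)

print(f"Cohen's d = {cohens_d:.3f}")
```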

Interpreting the Paired Samples t test Output

From the “Paired Samples Test” table, the two-tailed p-value (0.121) is greater than 0.05 (level of significance), which means that the null hypothesis cannot be rejected: there is no statistically significant difference between the marks before and after the teaching methodology. The observed improvement in marks may merely be due to chance or random variation. The “Paired Samples Correlations” table shows that the paired variables are correlated/related to each other, as the p-value for Pearson’s correlation is less than 0.05.

The marks before and after the teaching methodology are significantly correlated with each other; however, the average difference between the marks before and after the teaching methodology is not statistically significant. The differences may be due to chance or random variation.

How to report the Paired Samples t-test Results

One might report the statistics in the following format: t(degrees of freedom) = t-value, p = significance level.
From the above example, this would be: t(9) = -1.714, p = 0.121. Given the averages of the two situations and the direction of the t-value, one can conclude that the change in marks from 19.9 ± 2.685 before to 21.50 ± 3.922 after the teaching methodology was not statistically significant (p > 0.05). So, there is no evidence of an improvement due to the teaching methodology.

https://rfaqs.com, https://gmstat.com

Properties of a Good Estimator

Introduction (Properties of a Good Estimator)

This post gives a comprehensive discussion of the properties of a good estimator. In statistics, an estimator is a function of sample data used to estimate an unknown population parameter. A good estimator is both efficient and unbiased. An estimator is considered a good estimator if it satisfies the following properties:

  • Unbiasedness
  • Consistency
  • Efficiency
  • Sufficiency
  • Invariance

Let us discuss these properties of a good estimator one by one.

Unbiasedness

An estimator is said to be an unbiased estimator if its expected value (that is, the mean of its sampling distribution) is equal to the true value of the population parameter. Let $\hat{\theta}$ be an estimator of the population parameter $\theta$. If $E(\hat{\theta}) = \theta$, the estimator $\hat{\theta}$ is unbiased. If $E(\hat{\theta})\ne \theta$, then $\hat{\theta}$ is a biased estimator of $\theta$.

  • If $E(\hat{\theta}) > \theta$, then $\hat{\theta}$ will be positively biased.
  • If $E(\hat{\theta}) < \theta$, then $\hat{\theta}$ will be negatively biased.

Some examples of biased or unbiased estimators are:

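  • $\overline{X}$ is an unbiased estimator of $\mu$, that is, $E(\overline{X}) = \mu$
  • The sample median $\widetilde{X}$ is also an unbiased estimator of $\mu$ when the population is normally distributed, that is, $E(\widetilde{X}) =\mu$
  • The sample variance computed with divisor $n$, $S^2 = \frac{\sum (X_i - \overline{X})^2}{n}$, is a biased estimator of $\sigma^2$, that is, $E(S^2) = \frac{n-1}{n}\sigma^2 \ne \sigma^2$ (with divisor $n-1$ the sample variance is unbiased)
  • $\hat{p} = \frac{x}{n}$ is an unbiased estimator of $p$, that is, $E(\hat{p})=p$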

It means that if the sampling process is repeated many times and calculations about the estimator for each sample are made, the average of these estimates would be very close to the true population parameter.

An unbiased estimator does not systematically overestimate or underestimate the true parameter.
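
Unbiasedness can be verified exactly on a toy case by listing every possible sample. The sketch below is a Python illustration that assumes a hypothetical population {1, 2, 3} and samples of size 2 drawn with replacement; it averages each estimator over all equally likely samples and compares the result with the true parameter.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]          # hypothetical tiny population
N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N   # population variance

n = 2
# All 9 equally likely samples of size 2 drawn with replacement.
samples = list(product(population, repeat=n))

def sample_variance(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    mean = Fraction(sum(sample), len(sample))
    return sum((x - mean) ** 2 for x in sample) / divisor

def expectation(values):
    """Average over all equally likely samples, i.e. the expected value."""
    return sum(values) / len(values)

e_mean = expectation([Fraction(sum(s), n) for s in samples])
e_var_divisor_n = expectation([sample_variance(s, n) for s in samples])
e_var_divisor_n1 = expectation([sample_variance(s, n - 1) for s in samples])

print("mu =", mu, " E(sample mean) =", e_mean)
print("sigma^2 =", sigma2)
print("E(variance with divisor n)   =", e_var_divisor_n)
print("E(variance with divisor n-1) =", e_var_divisor_n1)
```

In this toy case the sample mean and the divisor $n-1$ variance average out exactly to $\mu$ and $\sigma^2$, while the divisor $n$ variance averages to a smaller value, which is what negative bias means.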

Consistency

An estimator is said to be a consistent estimator if the statistic used as an estimator approaches the true population parameter value as the sample size increases. Equivalently, an estimator $\hat{\theta}$ is called a consistent estimator of $\theta$ if the probability that $\hat{\theta}$ lies arbitrarily close to $\theta$ approaches unity as the sample size increases.

Symbolically, $\hat{\theta}$ is a consistent estimator of the parameter $\theta$ if, for any arbitrarily small positive quantity $\varepsilon$,

\begin{align*}
\lim\limits_{n\rightarrow \infty} P\left[|\hat{\theta}-\theta|\le \varepsilon\right] &= 1\\
\lim\limits_{n\rightarrow \infty} P\left[|\hat{\theta}-\theta|> \varepsilon\right] &= 0
\end{align*}

A consistent estimator may or may not be unbiased. The sample mean $\overline{X}=\frac{\Sigma X_i}{n}$ and sample proportion $\hat{p} = \frac{x}{n}$ are unbiased estimators of $\mu$ and $p$, respectively and are also consistent.

It means that as one collects more and more data, the estimator becomes more and more accurate in approximating the true population value.
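
The limit statement above can be illustrated by simulation. The following Python sketch assumes a Uniform(0, 1) population (true mean 0.5), a tolerance of $\varepsilon = 0.05$, and a fixed random seed; all of these are choices made only for the illustration. It estimates the share of sample means that fall within $\varepsilon$ of the true mean for increasing sample sizes.

```python
import random

random.seed(1)        # fixed seed so the illustration is reproducible

TRUE_MEAN = 0.5       # mean of the assumed Uniform(0, 1) population
EPSILON = 0.05        # tolerance around the true mean
REPS = 300            # number of simulated samples per sample size

def share_within_epsilon(n):
    """Estimate P(|sample mean - true mean| <= EPSILON) for samples of size n."""
    hits = 0
    for _ in range(REPS):
        sample = [random.random() for _ in range(n)]
        sample_mean = sum(sample) / n
        if abs(sample_mean - TRUE_MEAN) <= EPSILON:
            hits += 1
    return hits / REPS

shares = {n: share_within_epsilon(n) for n in (10, 100, 1000)}
for n, share in shares.items():
    print(f"n = {n:4d}: share within epsilon = {share:.3f}")
```

The estimated share should move toward 1 as $n$ grows, mirroring the first limit in the definition.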

Efficiency

An unbiased estimator is said to be efficient if the variance of its sampling distribution is smaller than that of the sampling distribution of any other unbiased estimator of the same parameter. Suppose there are two unbiased estimators $T_1$ and $T_2$ of the same parameter $\theta$; then $T_1$ is said to be more efficient than $T_2$ if $Var(T_1) < Var(T_2)$. The relative efficiency of $T_1$ compared to $T_2$ is given by the ratio

$$E = \frac{Var(T_2)}{Var(T_1)} > 1$$

Note that when the two estimators are biased, the mean squared error (MSE), which combines the variance and the squared bias, is used to compare them.

A more efficient estimator has a smaller sampling error, meaning it is less likely to deviate significantly from the true population parameter.

An efficient estimator is less likely to produce extreme values, making it more reliable.
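
Relative efficiency can be approximated by simulation. The sketch below is a Python illustration comparing the sample mean ($T_1$) with the sample median ($T_2$), both unbiased for $\mu$ under normality. It assumes a standard normal population, samples of size 25, and a fixed seed; these settings are illustrative only.

```python
import random
import statistics

random.seed(7)     # fixed seed so the illustration is reproducible

N_OBS = 25         # size of each simulated sample (assumed)
REPS = 4000        # number of simulated samples (assumed)

means, medians = [], []
for _ in range(REPS):
    sample = [random.gauss(0, 1) for _ in range(N_OBS)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

# Variances of the two simulated sampling distributions.
var_mean = statistics.variance(means)
var_median = statistics.variance(medians)

# Relative efficiency of the mean (T1) compared to the median (T2).
relative_efficiency = var_median / var_mean

print(f"Var(mean)   = {var_mean:.4f}")
print(f"Var(median) = {var_median:.4f}")
print(f"E = Var(median) / Var(mean) = {relative_efficiency:.2f}")
```

A ratio above 1 indicates that, for normal data, the sample mean is the more efficient of the two estimators.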

Sufficiency

An estimator is said to be sufficient if the statistic used as an estimator utilizes all the information contained in the sample. Any statistic that is not computed from all values in the sample is not a sufficient estimator. The sample mean $\overline{X}=\frac{\Sigma X}{n}$ and sample proportion $\hat{p} = \frac{x}{n}$ are sufficient estimators of the population mean $\mu$ and population proportion $p$, respectively but the median is not a sufficient estimator because it does not use all the information contained in the sample.

A sufficient estimator extracts the maximum information about the parameter that the sample can provide: once its value is known, the remaining details of the sample carry no further information about the parameter.

A sufficient estimator captures all the useful information from the data without any loss.

Invariance (Property of Love)

If a function is applied to the parameter, the same function applied to the estimator gives the estimator of the transformed parameter. This property is known as invariance.

\begin{align}
E(X-\mu)^2 &= \sigma^2 \\
\text{or } \sqrt{E(X-\mu)^2} &= \sigma\\
\text{or } [E(X-\mu)^2]^2 &= (\sigma^2)^2
\end{align}

The property states that if $\hat{\theta}$ is the MLE of $\theta$, then $\tau(\hat{\theta})$ is the MLE of $\tau(\theta)$ for any function $\tau$. Here tau ($\tau$) denotes a general function. For example, if $\hat{\theta}=\overline{X}$ is the MLE of $\theta$, then $\overline{X}^2$ is the MLE of $\theta^2$ and $\sqrt{\overline{X}}$ is the MLE of $\sqrt{\theta}$.

Properties of a Good Estimator

From the above diagrammatic representations, one can visualize the properties of a good estimator as described below.

  • Unbiasedness: The estimator should be centered around the true value.
  • Efficiency: The estimator should have a smaller spread (variance) around the true value.
  • Consistency: As the sample size increases, the estimator should become more accurate.
  • Sufficiency: The estimator should capture all relevant information from the sample.

In summary, regarding the properties of a good estimator, a good estimator is unbiased, efficient, consistent, and ideally sufficient. It should also be robust to outliers and have a low MSE.

Probability Distributions MCQS 6

This post presents Probability Distributions MCQs with Answers. There are 20 multiple-choice questions covering topics related to the binomial, Poisson, exponential, normal, gamma, standard normal, hypergeometric, and bivariate distributions. Let us start with the quiz.

Online Multiple Choice Questions about Probability Distributions

1. The area under the normal curve with $\mu\pm 2\sigma$ is

2. The continuous Random variable $X$ has a gamma distribution with Parameters $\alpha$ and $\beta$. The special gamma distribution for which $\alpha = 1$ is called.

3. The moment generating function of Binomial distribution is

4. Standard normal probability distribution has a mean equal to 40, whereas the value of random variable x is 80 and the z-statistic is equal to 1.8, the standard deviation of the standard normal probability distribution is

5. Let $X$ be a positive random variable and let a new random variable $Y$ be defined as $Y=log X$. If $Y$ has a normal distribution then $X$ is

6. Consider probability distribution as standard normal, if the value of $\mu$ is 75, the value of $x$ is 120 with an unknown standard deviation of distribution then the value of z-statistic

7. A bank received 2600 applications for a home mortgage. The probability of approval is 0.78 then the standard deviation of the binomial probability distribution is

8. If the value of $p$ is smaller or lesser than 0.5 then binomial distribution is classified as

9. If the value of $x$ is less than $\mu$ of standard normal probability distribution then the

10. For a normal distribution, the measure of kurtosis equals to

11. The normal distribution will be less spread out when

12. If in a Gamma density, $k=1$ the Gamma density becomes

13. The number of products manufactured in a factory in a day is 3500 and the probability that some pieces are defective is 0.55 then the mean of the binomial probability distribution is

14. Which of the following is NOT an assumption of the Binomial distribution?

15. A bivariate normal distribution has a number of parameters in it

16. If $Mean = Variance$ the distribution is called

17. The continuous random variable $X$ has a gamma distribution with parameters $\alpha$ and $\beta$. The special gamma distribution for which $\alpha=\frac{v}{2}$ and $\beta=2$ where $v$ is a +ve integer is called.

18. In binomial distribution, the formula for calculating standard deviation is

19. If $Mean > Variance$ then the distribution is

20. The maximum ordinate of the normal curve is at
Probability Distributions MCQS with Answers

https://itfeature.com probability distributions mcqs with answers

Counting Techniques in Probability Statistics

Counting techniques are essential tools in probability, statistics, mathematics, engineering, and computer science. In probability, they help in determining the number of ways a particular event can occur.

The following are the most common counting techniques in probability theory:

Factorial

For any positive integer $n$, $n$ factorial (denoted by $n!$) is the descending product beginning with $n$ and ending with 1. It can be written as

$$n! = n\times (n-1) \times (n-2) \times \cdots \times 2 \times 1$$

Examples of factorials are:

  • $3! = 3\times 2\times 1 = 6$
  • $5! = 5\times 4\times 3! = 20 \times 6 = 120$
  • $10! = 10\times 9\times 8\times 7\times 6\times 5! = 3628800$

Note that a special definition is made for the case of $0!$, $0!=1$.
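
The definition translates directly into a short loop. The Python sketch below uses a hypothetical helper named `factorial` that follows the descending product and the special case $0!=1$, and compares it with the standard library's `math.factorial`.

```python
import math

def factorial(n):
    """Descending product n x (n-1) x ... x 2 x 1, with 0! defined as 1."""
    if n < 0:
        raise ValueError("factorial is not defined for negative integers")
    result = 1
    for k in range(n, 1, -1):   # n, n-1, ..., 2 (empty loop when n is 0 or 1)
        result *= k
    return result

for n in (0, 3, 5, 10):
    print(f"{n}! = {factorial(n)}")

# The standard library gives the same values.
print(all(factorial(n) == math.factorial(n) for n in range(11)))
```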

Permutations

A permutation of a group of objects is an ordered arrangement of the objects. The number of different permutations of a group of $n$ objects is $n!$. More generally, the number of permutations of $r$ objects chosen from $n$ objects is

$$P(n, r) = {}^nP_r = \frac{n!}{(n-r)!}$$

where $n$ is the total number of objects, and $r$ is the number of objects to be arranged.

Examples of permutations are:

  • The number of ways of dealing the cards of a standard deck in some order is $52! \approx 8.066\times 10^{67}$
  • Suppose we want to place a set of five names in some order. There are 5 choices for which name to place first, then 4 choices for the second, 3 choices for the third, 2 choices for the fourth, and only one choice for the last (fifth) one. Therefore, by the fundamental counting principle, the number of different ways to put 5 names in order (or to arrange 5 people in a row) is $5! = 5\times 4 \times 3\times 2\times 1 = 120$

Often the entire set of objects is not required to be placed in order; usually one wants to compute in how many ways a few chosen objects can be ordered. For example,

Example: A horse race has 14 horses. In how many different possible ways can the top 3 horses finish?
Solution: There are 14 possibilities for which horse finishes first, 13 for second, and 12 for third. So, by the fundamental counting principle, there are $14\times 13\times 12 = 2184$ different possible ways (finishing orders) for the top three horses, that is, ${}^{14}P_3 = \frac{14!}{(14-3)!}=2184$.

Such arrangements are called permutations of $n$ objects taken $r$ at a time.
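
The horse race count can be cross-checked in Python, both by listing every ordered top-three finish and by the factorial formula. The horse numbers 1 to 14 are just labels assumed for the illustration; `math.perm` requires Python 3.8 or later.

```python
import math
from itertools import permutations

horses = range(1, 15)   # 14 horses, labelled 1..14 for illustration

# Enumerate every ordered (first, second, third) finish and count them.
finishing_orders = sum(1 for _ in permutations(horses, 3))

# The same count from the formula n! / (n - r)! and from math.perm.
by_formula = math.factorial(14) // math.factorial(14 - 3)
by_library = math.perm(14, 3)

# Ordering all five names corresponds to r = n, i.e. 5!.
name_orders = sum(1 for _ in permutations("ABCDE"))

print(finishing_orders, by_formula, by_library, name_orders)
```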

Combinations

A combination is a selection of objects from a set without regard to order. Combinations are used when calculating the number of outcomes for experiments involving multiple choices, and often the order of the choices is not required. The formula of combination is

$$C(n, r) = {}^nC_r = \frac{n!}{r!(n-r)!}=\frac{{}^nP_r}{r!}$$

where $n$ is the total number of objects, and $r$ is the number of objects to be selected without any regard to importance or order.

Examples of combinations are:

  • Drawing a 5-card poker hand (${}^{52}C_5$)
  • Selecting a three-person committee from a group of 30 (without any priority or importance) (${}^{30}C_3$)

A choice of $r$ objects from a group of $n$ objects without regard to order is called a combination of $n$ objects taken $r$ at a time.

Example: In how many different ways can a committee of 3 people be chosen from a group of 10 people?

Solution: $C(10, 3) = \frac{10!}{3!(10-3)!} = 120$ ways
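
As a quick check of the combination formula, the following Python sketch counts the committees by enumeration and compares the result with `math.comb` (Python 3.8 or later) and with the relation ${}^nC_r = {}^nP_r / r!$. The people are labelled 1 to 10 only for illustration.

```python
import math
from itertools import combinations

people = range(1, 11)   # 10 people, labelled 1..10 for illustration

# Enumerate every unordered committee of 3 and count them.
committees = sum(1 for _ in combinations(people, 3))

# The same count from the library and from nPr / r!.
by_library = math.comb(10, 3)
from_permutations = math.perm(10, 3) // math.factorial(3)

# The two bullet examples above.
poker_hands = math.comb(52, 5)
committees_from_30 = math.comb(30, 3)

print(committees, by_library, from_permutations)
print(poker_hands, committees_from_30)
```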

Counting Techniques in Probability

Multiplication Principle

If one event can occur in $m$ ways and another event can occur in $n$ ways, then the number of ways the two events can occur together is computed using the multiplication principle, that is, by multiplying $m\times n$. For example, if there are 5 shirts and 3 pants to choose from, the number of different outfits is the number of shirts multiplied by the number of pants, i.e., $5 \times 3=15$, so 15 outfits can be made from 5 shirts and 3 pants.

Addition Principle

If one event can occur in $m$ ways and a second, mutually exclusive event can occur in $n$ ways, then one or the other event can occur in $m+n$ ways. For example, if there are 3 red balls and 4 blue balls, the number of ways a single ball can be chosen is $3+4=7$.
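
Both principles can be seen by listing the outcomes explicitly. In the Python sketch below the item labels (such as `shirt1` or `red1`) are made up for illustration: pairing every shirt with every pair of pants gives the multiplication principle, and pooling the red and blue balls gives the addition principle.

```python
from itertools import product

# Multiplication principle: choose one shirt AND one pair of pants.
shirts = [f"shirt{i}" for i in range(1, 6)]   # 5 shirts
pants = [f"pants{i}" for i in range(1, 4)]    # 3 pants
outfits = list(product(shirts, pants))        # every (shirt, pants) pair

# Addition principle: choose one ball that is red OR blue (never both).
red_balls = [f"red{i}" for i in range(1, 4)]      # 3 red balls
blue_balls = [f"blue{i}" for i in range(1, 5)]    # 4 blue balls
single_ball_choices = red_balls + blue_balls

print(len(outfits), "outfits")
print(len(single_ball_choices), "ways to choose one ball")
```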

Application of Counting Techniques in Probability

  • Probability: Calculating probabilities of events based on the number of favorable outcomes and the total number of possible outcomes.
  • Combinatorics: Studying the arrangement, combination, or selection of objects.
  • Computer Science: Analyzing algorithms and data structures.
  • Statistics: Sampling and hypothesis testing.
  • Cryptography: Designing secure encryption methods

FAQs about Counting Techniques in Probability

  1. What is meant by counting techniques?
  2. What are the applications of counting techniques in probability?
  3. Define permutations and combinations.
  4. What is the difference between the multiplication and addition principles?
  5. Give real-life examples of permutations and combinations.
  6. Write down the formulas of permutations and combinations.
https://itfeature.com counting Techniques in probability
