Statistics for Data Science & Analytics - MCQs, Software & Data Analysis

Homoscedasticity: Constant Variance of a Random Variable (2020)

Sep 11, 2024 | Aug 21, 2020 by Muhammad Imdad Ullah

Homoscedasticity is the assumption that the probability distribution of the random variable $u$ (the error term) remains the same for all observations of $X$ and, in particular, that the variance of each $u_i$ is the same for all values of the explanatory variables. In other words, the variance of the errors is constant across all levels of the independent variables. Symbolically, it can be represented as

$$Var(u_i) = E\{u_i - E(u_i)\}^2 = E(u_i^2) = \sigma_u^2 = \mbox{constant}$$

This assumption is known as the assumption of homoscedasticity, or the assumption of constant variance of the error terms $u_i$. It means that the variation of each $u_i$ around its zero mean does not depend on the values of $X$ (the independent variable). The assumption deserves scrutiny because the error term expresses the influence on the dependent variable of factors that may themselves vary with $X$:

Errors in measurement: Errors of measurement tend to be cumulative over time, and it is difficult to collect data and check their consistency and reliability. As a result, the variance of $u_i$ tends to increase as the values of $X$ increase.
Omitted variables: Variables omitted from the regression model tend to change in the same direction as $X$, which increases the variation of the observations around the regression line.

Under homoscedasticity, the variance of each $u_i$ remains the same irrespective of small or large values of the explanatory variable, i.e., $\sigma_u^2$ is not a function of $X_i$: $\sigma_{u_i}^2 \ne f(X_i)$.

Consequences if Homoscedasticity Is Not Met

If the assumption of homoscedastic disturbances (constant variance) is not fulfilled, the errors are heteroscedastic, and the consequences are as follows:

We cannot apply the usual formulas for the variances of the coefficients to conduct tests of significance and construct confidence intervals, so these tests are inapplicable. The formulas $Var(\hat{\beta}_0)=\sigma_u^2 \left\{\frac{\sum X_i^2}{n \sum x_i^2}\right\}$ and $Var(\hat{\beta}_1) = \sigma_u^2 \left\{\frac{1}{\sum x_i^2}\right\}$, where $x_i = X_i - \bar{X}$, hold only when $\sigma_u^2$ is constant.
If $u$ (the error term) is heteroscedastic, the OLS (Ordinary Least Squares) estimates do not have the minimum variance property in the class of unbiased estimators, i.e., they are inefficient in small samples. They are also inefficient in large samples (that is, asymptotically inefficient).
The coefficient estimates are still statistically unbiased even if the $u$’s are heteroscedastic. The $\hat{\beta}$’s have no statistical bias, i.e., $E(\hat{\beta}_i)=\beta_i$ (the expected value of each coefficient estimate equals the true parameter value).
The prediction is inefficient because the variance of the prediction includes the variance of $u$ and of the parameter estimates, which are not minimal under heteroscedasticity. That is, the prediction of $Y$ for a given value of $X$, based on the estimates $\hat{\beta}$’s from the original data, has a high variance.
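
The unbiasedness and variance consequences listed above can be checked with a small simulation. The following is a minimal sketch and not part of the original post: the true model $Y = 2 + 3X + u$, the error standard deviation $0.5X_i$, the design $X = 1, \dots, 20$, and the helper names `ols_fit` and `simulate` are all illustrative assumptions.

```python
import random
import statistics

def ols_fit(xs, ys):
    """Simple OLS of ys on xs: returns (b0, b1, conventional Var(b1))."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)            # sum of squared deviations of X
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (n - 2)                                 # estimate of sigma_u^2
    return b0, b1, s2 / sxx                            # formula assumes constant variance

def simulate(reps=5000, seed=1):
    """Repeatedly draw heteroscedastic errors and refit the model (assumed design)."""
    rng = random.Random(seed)
    xs = list(range(1, 21))
    slopes, conventional = [], []
    for _ in range(reps):
        # Assumed heteroscedasticity: the error standard deviation grows with X.
        ys = [2.0 + 3.0 * x + rng.gauss(0.0, 0.5 * x) for x in xs]
        _, b1, var_b1 = ols_fit(xs, ys)
        slopes.append(b1)
        conventional.append(var_b1)
    return (statistics.mean(slopes),
            statistics.variance(slopes),
            statistics.mean(conventional))

mean_slope, empirical_var, conventional_var = simulate()
print(f"average slope estimate:        {mean_slope:.3f}")
print(f"variance of slopes (observed): {empirical_var:.4f}")
print(f"average conventional formula:  {conventional_var:.4f}")
```

Under these assumptions, the average slope should come out close to the true value of 3, while the conventional variance formula differs from the variance the slope estimates actually show across replications.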

Tests for Homoscedasticity

Some tests commonly used for testing the assumption of homoscedasticity are:

Spearman Rank-Correlation test
Goldfeld and Quandt test
Park test
Glejser test
Breusch–Pagan test
Bartlett’s test of Homoscedasticity
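
As an illustration of one of the tests listed above, the Goldfeld and Quandt procedure can be sketched in plain Python: order the observations by $X$, omit some central observations, fit separate regressions to the low-$X$ and high-$X$ sub-samples, and take the ratio of their residual sums of squares. The data-generating model, the number of omitted observations, and the function names `rss_of_ols` and `goldfeld_quandt` are assumptions made for this sketch.

```python
import random

def rss_of_ols(xs, ys):
    """Residual sum of squares from a simple OLS fit of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b0 = y_bar - b1 * x_bar
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))

def goldfeld_quandt(xs, ys, drop):
    """Return (F statistic, degrees of freedom of each sub-sample)."""
    pairs = sorted(zip(xs, ys))            # order the observations by X
    m = (len(pairs) - drop) // 2           # size of each sub-sample
    low, high = pairs[:m], pairs[-m:]      # central observations are omitted
    rss_low = rss_of_ols(*zip(*low))
    rss_high = rss_of_ols(*zip(*high))
    df = m - 2                             # two parameters estimated per sub-sample
    return rss_high / rss_low, df

# Hypothetical data in which the error spread grows with X.
rng = random.Random(7)
xs = list(range(1, 41))
ys = [2.0 + 3.0 * x + rng.gauss(0.0, 0.5 * x) for x in xs]

f_stat, df = goldfeld_quandt(xs, ys, drop=8)
print(f"F = {f_stat:.2f} with ({df}, {df}) degrees of freedom")
```

The F ratio is compared with the upper critical value of the F distribution with the stated degrees of freedom; a ratio well above that value suggests that the error variance increases with $X$.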

Reference:
A. Koutsoyiannis (1972). “Theory of Econometrics”. 2nd Ed.

MCQs Statistics Online Test 10

Dec 3, 2024 | Aug 4, 2020 by Muhammad Imdad Ullah

This quiz contains MCQs with answers covering variables and types of variables; measures of central tendency such as the mean, median, mode, and weighted mean; data, types of data, and sources of data; and measures of dispersion (variation) such as the standard deviation, variance, and range. Let us start the MCQs Statistics Online Test for the preparation of the PPSC Statistics Lecturer post.

Introductory statistics deals with measures of central tendency (the arithmetic mean or average, median, mode, weighted mean, geometric mean, and harmonic mean) and measures of dispersion (such as the range, standard deviation, and variance).
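
These measures can be computed with Python's standard `statistics` module. The sketch below uses a small hypothetical data set and hypothetical weights; the variance and standard deviation shown are the population versions.

```python
import statistics

data = [2, 4, 4, 8]        # hypothetical observations
weights = [1, 2, 2, 1]     # hypothetical weights for the weighted mean

measures = {
    "arithmetic mean": statistics.mean(data),
    "median": statistics.median(data),
    "mode": statistics.mode(data),
    "geometric mean": statistics.geometric_mean(data),
    "harmonic mean": statistics.harmonic_mean(data),
    "weighted mean": sum(w * x for w, x in zip(weights, data)) / sum(weights),
    "range": max(data) - min(data),
    "variance": statistics.pvariance(data),          # population variance
    "standard deviation": statistics.pstdev(data),   # population standard deviation
}

for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```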

Introductory statistical methods include planning and designing the study, collecting data, and arranging and summarizing the collected data numerically and graphically. Basic statistics are also used to perform different statistical analyses to draw meaningful inferences.

A basic inspection of the data using graphs and numerical summaries may reveal useful information that is hidden in the data. Graphical representations include the bar chart, pie chart, dot chart, box plot, etc.

Companies in finance, communication, and manufacturing, charity organizations, government institutes, and small to large businesses all have a strong interest in collecting data and measuring different sorts of statistical findings. This helps them to learn from the past, notice trends, and plan for the future.

MCQs Statistics Online Test

Statistics results are:
Which mean is most affected by extreme values?
The sum of absolute deviations about the median is
The sum of the squares of the deviations about the mean is:
If a constant value 5 is subtracted from each observation of a set, the variance is:
Which measure of dispersion ensures the highest degree of reliability?
Which measure of dispersion is the least affected by extreme values?
Statistics are aggregates of
Data Classified by attributes are called:
Measurements usually provide:
The measures of dispersion are changed by the change of:
Cumulative frequency is
The appropriate average for calculating the average percentage increase in population is
When mean, median, and mode are identical, the distribution is:
Commodities subject to considerable price variations could best be measured by:
The extreme values in negatively skewed distribution lie in the:
A set of values is said to be relatively uniform if it has:
If each observation of a set is divided by 10, the standard deviation of the new observation is:
The Harmonic mean gives more weightage to:
The correct relationship between AM, GM, and HM is
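
Several of the properties these questions refer to can be verified numerically. The following sketch uses a hypothetical data set and checks general mathematical facts only; it is not the quiz's own answer key.

```python
import math
import statistics

data = [3, 7, 8, 12, 20]   # hypothetical observations

# Subtracting a constant shifts the data but leaves the variance unchanged.
shifted = [x - 5 for x in data]
same_variance = math.isclose(statistics.pvariance(shifted),
                             statistics.pvariance(data))

# Dividing every observation by 10 divides the standard deviation by 10.
scaled = [x / 10 for x in data]
sd_ratio = statistics.pstdev(data) / statistics.pstdev(scaled)

# For positive values, AM >= GM >= HM.
am = statistics.mean(data)
gm = statistics.geometric_mean(data)
hm = statistics.harmonic_mean(data)

def sum_abs_dev(values, center):
    """Sum of absolute deviations of the values about a given center."""
    return sum(abs(v - center) for v in values)

# The sum of absolute deviations is smallest when taken about the median.
abs_about_median = sum_abs_dev(data, statistics.median(data))
abs_about_mean = sum_abs_dev(data, am)

print(same_variance, round(sd_ratio, 6))
print(am, round(gm, 4), round(hm, 4))
print(abs_about_median, abs_about_mean)
```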

The Z-Score Definition, Formula, Real Life Examples (2020)

May 24, 2024 | Jul 21, 2020 by Muhammad Imdad Ullah

Z-Score Definition: The Z-score, also referred to as the standardized raw score (or simply the standard score), is a useful statistic because it not only permits computing the probability (chance or likelihood) of a raw score occurring within a normal distribution but also helps to compare two raw scores from different normal distributions. The Z-score is a dimensionless measure since it is derived by subtracting the population mean from an individual raw score and then dividing this difference by the population standard deviation. This computational procedure is called standardizing the raw score, and it is often used in the Z-test in hypothesis testing.

Any raw score can be converted to a Z-score by the formula

$$\mbox{Z-Score}=\frac{\mbox{raw score} - \mbox{mean}}{\sigma}$$

Z-Score Real Life Examples

Example 1: If the mean = 100 and the standard deviation = 10, what are the Z-scores of the following raw scores?

Raw Score	Z-Score
90	$ \frac{90-100}{10}=-1$
110	$ \frac{110-100}{10}=1$
70	$ \frac{70-100}{10}=-3$
100	$ \frac{100-100}{10}=0$

Note that if the Z-score

has a zero value then it means that the raw score is equal to the population mean.
has a positive value then it means that the raw score is above the population mean.
has a negative value then it means that the raw score is below the population mean.
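
The calculations in Example 1 can be reproduced with a few lines of Python. This is a minimal sketch; the function name `z_score` is an illustrative choice.

```python
def z_score(raw, mean, sd):
    """Standardize a raw score: (raw score - mean) / standard deviation."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (raw - mean) / sd

raw_scores = [90, 110, 70, 100]
z_scores = [z_score(x, mean=100, sd=10) for x in raw_scores]

for raw, z in zip(raw_scores, z_scores):
    if z > 0:
        position = "above the mean"
    elif z < 0:
        position = "below the mean"
    else:
        position = "equal to the mean"
    print(f"raw score {raw}: Z = {z:+.1f} ({position})")
```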

Example 2: Suppose you got 80 marks in one exam of a class and 70 marks in another exam of that class, and you want to find out in which exam you performed better. Suppose also that the mean and standard deviation are 90 and 10 for exam 1 and 60 and 5 for exam 2, respectively. Converting both exam marks (raw scores) into standard scores, we get

$Z_1=\frac{80-90}{10} = -1$

The Z-score results ($Z_1=-1$) show that 80 marks are one standard deviation below the class mean.

$Z_2=\frac{70-60}{5}=2$

The Z-score results ($Z_2=2$) show that 70 marks are two standard deviations above the mean.

Comparing $Z_1$ and $Z_2$ shows that you performed better in the second exam than in the first. Assuming the marks are normally distributed, a Z-score of $-1$ means that about 34.13% of the students scored between your mark and the class average, so about 15.87% of the students scored below you. Similarly, a Z-score of 2 means that about 47.72% of the students scored between the class average and your mark, so about 97.72% of the students scored below you.
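
The normal-curve areas for the two Z-scores in Example 2 can be computed with `statistics.NormalDist` from the Python standard library. The sketch assumes that the marks in both exams are normally distributed; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()        # standard normal: mean 0, standard deviation 1

z1 = (80 - 90) / 10              # exam 1
z2 = (70 - 60) / 5               # exam 2

below_z1 = std_normal.cdf(z1)            # proportion of students below 80 marks in exam 1
between_z1_and_mean = 0.5 - below_z1     # proportion between 80 marks and the mean
below_z2 = std_normal.cdf(z2)            # proportion of students below 70 marks in exam 2
between_mean_and_z2 = below_z2 - 0.5     # proportion between the mean and 70 marks

print(f"Z1 = {z1:.0f}: {below_z1:.2%} below, {between_z1_and_mean:.2%} between score and mean")
print(f"Z2 = {z2:.0f}: {below_z2:.2%} below, {between_mean_and_z2:.2%} between mean and score")
```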

Applications of the Z-Score

Identifying Outliers: The standard score can help in identifying outliers in a dataset. By looking for data points with large negative or positive Z-scores, one can flag potential outliers that might warrant further investigation.
Comparing Data Points from Different Datasets: Z-scores allow us to compare data points from different datasets because these scores are expressed in standard deviation units.
Standardizing Data for Statistical Tests: Some statistical procedures work with standardized data. The Z-score transforms data to have a mean of 0 and a standard deviation of 1, which puts variables on a common scale. Standardizing does not change the shape of the distribution, so it does not by itself make non-normal data normal.
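
A minimal sketch of the outlier-flagging idea follows. The cut-off of 3 standard deviations is a common convention rather than something stated in this post, and the data set and the function name `flag_outliers` are hypothetical.

```python
import statistics

def flag_outliers(values, threshold=3.0):
    """Return the values whose absolute Z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)       # population standard deviation
    if sd == 0:
        return []                        # all values identical: nothing to flag
    return [v for v in values if abs((v - mean) / sd) > threshold]

# Hypothetical marks: nineteen values near 50 and one unusually large value.
marks = [48, 52, 50, 49, 51, 47, 53, 50, 49, 51,
         50, 48, 52, 50, 49, 51, 50, 50, 50, 95]

outliers = flag_outliers(marks)
print(outliers)
```

Note that the extreme value itself inflates the mean and the standard deviation used in the calculation, so in very small samples no observation may reach a cut-off of 3.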

Limitations of Z-Scores

Assumes Normality: Z-scores are most interpretable when the data are normally distributed (a bell-shaped curve). If the data are significantly skewed, the scores may be less informative.
Sensitive to Outliers: The presence of extreme outliers can significantly affect the mean and standard deviation, which in turn affects the standard scores of all data points.

In conclusion, z-scores are a valuable tool for understanding the relative position of a data point within its dataset. The standard score offers a standardized way to compare data points, identify outliers, and prepare data for statistical analysis. However, it is important to consider the assumptions of normality and the potential influence of outliers when interpreting the z-scores.

