Homoscedasticity: Constant Variance of a Random Variable (2020)

Homoscedasticity is the assumption that the probability distribution of the random variable $u$ (the error term) remains the same for all observations of $X$ and, in particular, that the variance of each $u_i$ is the same for all values of the explanatory variables. In other words, the variance of the errors is constant across all levels of the independent variables. Symbolically, it can be represented as

$$Var(u_i) = E\{u_i - E(u_i)\}^2 = E(u_i^2) = \sigma_u^2 = \text{Constant}$$

This assumption is known as the assumption of homoscedasticity, or the assumption of constant variance of the error terms $u_i$. It means that the variation of each $u_i$ around its zero mean does not depend on the values of the independent variable $X$. In practice the assumption may fail, because the error term expresses the influence on the dependent variable of:

  • Errors in measurement
    Errors of measurement tend to accumulate over time, and it is difficult to collect the data and check its consistency and reliability. As a result, the variance of $u_i$ may increase as the values of $X$ increase.
  • Omitted variables
    Variables omitted from the function (regression model) tend to change in the same direction as $X$, which increases the variance of the observations around the regression line.

Under homoscedasticity, the variance of each $u_i$ remains the same irrespective of small or large values of the explanatory variable, i.e., $\sigma_u^2$ is not a function of $X_i$: $\sigma_{u_i}^2 \ne f(X_i)$.
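The difference between constant and non-constant error variance can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the sample size, the error standard deviations, and the helper name `variance_ratio` are hypothetical and do not come from the text. It compares the variance of the errors over the upper half of the $X$ values with the variance over the lower half.

```python
import numpy as np

rng = np.random.default_rng(0)       # fixed seed so the run is repeatable
n = 2000
x = np.linspace(1.0, 10.0, n)        # hypothetical explanatory variable, sorted

# Homoscedastic errors: the standard deviation is the same for every x.
u_homo = rng.normal(loc=0.0, scale=2.0, size=n)
# Heteroscedastic errors: the standard deviation grows with x (assumed form).
u_hetero = rng.normal(loc=0.0, scale=0.5 * x, size=n)

def variance_ratio(u):
    """Variance of u over the upper half of x divided by that over the lower half."""
    half = len(u) // 2
    return np.var(u[half:], ddof=1) / np.var(u[:half], ddof=1)

ratio_homo = variance_ratio(u_homo)
ratio_hetero = variance_ratio(u_hetero)
print(f"homoscedastic ratio:   {ratio_homo:.2f}")
print(f"heteroscedastic ratio: {ratio_hetero:.2f}")
```

A ratio close to 1 is what constant variance would suggest, while a ratio well above 1 indicates that the spread of the errors grows with $X$.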

Consequences if Homoscedasticity is not met

If the assumption of homoscedastic disturbances (constant variance) is not fulfilled, the consequences of heteroscedasticity are as follows:

  1. We cannot apply the usual formulas for the variances of the coefficients to conduct tests of significance and construct confidence intervals, so the tests are inapplicable. With $x_i = X_i - \bar{X}$, these formulas are $Var(\hat{\beta}_0)=\sigma_u^2 \left\{\frac{\sum X_i^2}{n \sum x_i^2}\right\}$ and $Var(\hat{\beta}_1) = \sigma_u^2 \left\{\frac{1}{\sum x_i^2}\right\}$; both assume a constant $\sigma_u^2$.
  2. If $u$ (the error term) is heteroscedastic, the OLS (Ordinary Least Squares) estimates do not have the minimum variance property in the class of unbiased estimators, i.e., they are inefficient in small samples. Furthermore, they are inefficient in large samples (that is, asymptotically inefficient).
  3. The coefficient estimates are still statistically unbiased even if the $u$’s are heteroscedastic. The $\hat{\beta}$’s have no statistical bias, i.e., $E(\hat{\beta}_i)=\beta_i$ (the expected value of each coefficient equals the true parameter value).
  4. Prediction is inefficient because the variance of the prediction includes the variance of $u$ and of the parameter estimates, which are not minimal under heteroscedasticity. That is, the prediction of $Y$ for a given value of $X$, based on the estimates $\hat{\beta}$ from the original data, has a high variance.
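Two of the consequences listed above (the slope estimate stays unbiased, but the usual variance formula no longer describes its sampling variability) can be checked with a Monte Carlo experiment. The following is a minimal sketch under assumed values: the true parameters `beta0` and `beta1`, the sample size, and the form of the error standard deviation `sigma_i` are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 50, 5000
beta0, beta1 = 2.0, 0.5            # hypothetical true parameters
X = np.linspace(1.0, 10.0, n)
x = X - X.mean()                   # deviations of X from its mean
sigma_i = 0.5 * X                  # assumed: error s.d. proportional to X

slopes = np.empty(reps)
formula_vars = np.empty(reps)
for r in range(reps):
    u = rng.normal(0.0, sigma_i)               # heteroscedastic errors
    Y = beta0 + beta1 * X + u
    b1 = np.sum(x * Y) / np.sum(x**2)          # OLS slope
    b0 = Y.mean() - b1 * X.mean()              # OLS intercept
    resid = Y - b0 - b1 * X
    s2 = np.sum(resid**2) / (n - 2)            # usual estimate of sigma_u^2
    slopes[r] = b1
    formula_vars[r] = s2 / np.sum(x**2)        # usual formula for Var(b1)

mean_slope = slopes.mean()             # should be close to beta1
true_var = slopes.var(ddof=1)          # empirical sampling variance of b1
avg_formula_var = formula_vars.mean()  # what the usual formula reports on average
print(f"mean of slope estimates:    {mean_slope:.3f}")
print(f"empirical variance of b1:   {true_var:.4f}")
print(f"average formula variance:   {avg_formula_var:.4f}")
```

In this particular setup the average of the slope estimates stays near the true slope, while the usual formula tends to understate the empirical variance; with a different pattern of heteroscedasticity the direction of the error could differ.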

Tests for Homoscedasticity

Some tests commonly used for testing the assumption of homoscedasticity are:

Reference:
A. Koutsoyiannis (1972). “Theory of Econometrics”. 2nd Ed.

MCQs Statistics Online Test 10

This online test contains multiple-choice questions (MCQs) on Statistics with answers, covering variables and types of variables; measures of central tendency such as the mean, median, mode, and weighted mean; data and types of data; sources of data; and measures of dispersion (variation) such as the standard deviation, variance, and range. Let us start the MCQs Statistics Online Test for the preparation of the PPSC Statistics Lecturer Post.

1. Statistics are aggregates of
2. The extreme values in negatively skewed distribution lie in the:
3. Which mean is most affected by extreme values?
4. A set of values is said to be relatively uniform if it has:
5. Which measure of dispersion ensures the highest degree of reliability?
6. The measures of dispersion are changed by the change of:
7. Measurements usually provide:
8. The correct relationship between AM, GM, and HM is
9. Statistics results are:
10. The sum of absolute deviations about the median is
11. If each observation of a set is divided by 10, the standard deviation of the new observation is:
12. Data Classified by attributes are called:
13. If a constant value 5 is subtracted from each observation of a set, the variance is:
14. When mean, median, and mode are identical, the distribution is:
15. Cumulative frequency is
16. Which measure of dispersion is the least affected by extreme values?
17. The appropriate average for calculating the average percentage increase in population is
18. Commodities subject to considerable price variations could best be measured by:
19. The sum of the square of the deviations about the mean is:
20. The Harmonic mean gives more weightage to:

Introductory statistics deals with the measure of central tendency (that includes mean (arithmetic mean, or known as average), median, mode, weighted mean, geometric mean, and Harmonic mean) and measure of dispersion (such as range, standard deviation, and variance).
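The measures named above can be computed with Python's standard `statistics` module. The sketch below uses a small hypothetical data set (the values are made up for illustration) and also checks how the variance and standard deviation respond when a constant is subtracted from, or every value is divided by, a fixed number.

```python
import statistics

data = [12, 15, 15, 18, 20, 24, 30]      # hypothetical observations

# Measures of central tendency
am = statistics.fmean(data)              # arithmetic mean
gm = statistics.geometric_mean(data)     # geometric mean
hm = statistics.harmonic_mean(data)      # harmonic mean
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of dispersion
data_range = max(data) - min(data)
var = statistics.pvariance(data)         # population variance
sd = statistics.pstdev(data)             # population standard deviation

# Effect of a change of origin (subtract 5) and of scale (divide by 10)
shifted = [v - 5 for v in data]
scaled = [v / 10 for v in data]
var_shifted = statistics.pvariance(shifted)
sd_scaled = statistics.pstdev(scaled)

print(f"AM={am:.3f}, GM={gm:.3f}, HM={hm:.3f}")
print(f"median={median}, mode={mode}, range={data_range}")
print(f"variance={var:.3f}, after subtracting 5: {var_shifted:.3f}")
print(f"sd={sd:.3f}, after dividing by 10: {sd_scaled:.3f}")
```

For positive data that are not all equal, the three means come out in the order HM < GM < AM; the variance is unaffected by the shift, and the standard deviation is divided by the same factor as the data.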

Introductory statistical methods include planning and designing the study, collecting data, and arranging and summarizing the collected data numerically and graphically. Basic statistics are also used to perform different statistical analyses to draw meaningful inferences.

A basic visual inspection of the data, using graphs together with numerical summaries, may reveal useful information hidden in the data. Graphical representations include the bar chart, pie chart, dot chart, box plot, etc.

Organizations in finance, communication, manufacturing, charity, and government, as well as businesses from small to large, all have a strong interest in collecting data and measuring different sorts of statistical findings. This helps them learn from the past, notice trends, and plan for the future.

The Z-Score Definition, Formula, Real Life Examples (2020)

Z-Score Definition: The Z-score, also referred to as the standardized raw score (or simply the standard score), is a useful statistic because it not only permits computing the probability (chance or likelihood) of a raw score occurring within a normal distribution but also helps to compare two raw scores from different normal distributions. The Z-score is a dimensionless measure, since it is derived by subtracting the population mean from an individual raw score and then dividing this difference by the population standard deviation. This computational procedure is called standardizing the raw score, and it is often used in the Z-test in hypothesis testing.

Any raw score can be converted to a Z-score by the formula

$$\text{Z-Score}=\frac{\text{raw score} - \text{mean}}{\sigma}$$

Z-Score Real Life Examples

Example 1: If the mean = 100 and the standard deviation = 10, what are the Z-scores of the following raw scores?

Raw Score    Z Score
90           $\frac{90-100}{10}=-1$
110          $\frac{110-100}{10}=1$
70           $\frac{70-100}{10}=-3$
100          $\frac{100-100}{10}=0$

Note that: If Z-Score,

  • has a zero value then it means that the raw score is equal to the population mean.
  • has a positive value then it means that the raw score is above the population mean.
  • has a negative value then it means that the raw score is below the population mean.
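The calculations of Example 1 can be reproduced in a few lines. This is a minimal sketch; the function name `z_score` is a hypothetical helper introduced only for illustration.

```python
def z_score(raw, mean, sd):
    """Standardize a raw score: (raw - mean) / sd."""
    return (raw - mean) / sd

mean, sd = 100, 10                    # values given in Example 1
raw_scores = [90, 110, 70, 100]
z_scores = [z_score(r, mean, sd) for r in raw_scores]

for r, z in zip(raw_scores, z_scores):
    # The sign of z tells whether the raw score is below, at, or above the mean.
    position = "below" if z < 0 else "above" if z > 0 else "equal to"
    print(f"raw score {r}: z = {z:+.1f} ({position} the mean)")
```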

Example 2: Suppose you got 80 marks in one exam of a class and 70 marks in another exam of the same class, and you want to find out in which exam you performed better. Suppose also that the mean and standard deviation are 90 and 10 for exam 1 and 60 and 5 for exam 2, respectively. Converting both exam marks (raw scores) into standard scores, we get

$Z_1=\frac{80-90}{10} = -1$

The Z-score results ($Z_1=-1$) show that 80 marks are one standard deviation below the class mean.

$Z_2=\frac{70-60}{5}=2$

The Z-score results ($Z_2=2$) show that 70 marks are two standard deviations above the mean.

Comparing $Z_1$ and $Z_2$ shows that you performed better in the second exam than in the first, relative to the rest of the class. If the marks are normally distributed, a Z-score of $-1$ also means that about 34.13% of the students scored between your mark and the class average (so about 15.87% scored below you). Similarly, a Z-score of 2 means that about 47.72% of the students scored between the class average and your mark (so about 97.72% scored below you).
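The areas under the standard normal curve that correspond to the two Z-scores of Example 2 can be obtained from the cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library and assumes that the exam marks are normally distributed; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()             # standard normal: mean 0, s.d. 1

z1 = (80 - 90) / 10                   # exam 1
z2 = (70 - 60) / 5                    # exam 2

below_z1 = std_normal.cdf(z1)         # proportion of scores below z1
between_z1_mean = 0.5 - below_z1      # proportion between z1 and the mean
below_z2 = std_normal.cdf(z2)         # proportion of scores below z2
between_mean_z2 = below_z2 - 0.5      # proportion between the mean and z2

print(f"z1 = {z1}: below = {below_z1:.4f}, between z1 and mean = {between_z1_mean:.4f}")
print(f"z2 = {z2}: below = {below_z2:.4f}, between mean and z2 = {between_mean_z2:.4f}")
```

Because $Z_2 > Z_1$, the mark of 70 in the second exam sits higher in its class distribution than the mark of 80 in the first exam.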

Applications of Z-Scores

  • Identifying Outliers: The standard score can help in identifying the outliers in a dataset. By looking for data points with very large negative or positive z-scores, one can easily flag potential outliers that might warrant further investigation.
  • Comparing Data Points from Different Datasets: Z-scores allow us to compare data points from different datasets because these scores are expressed in standard deviation units.
  • Standardizing Data for Statistical Tests: Some statistical procedures work with standardized data. The Z-score can be used to standardize data (transforming it to have a mean of 0 and a standard deviation of 1). Note that standardizing only rescales the data; it does not make a non-normal distribution normal.
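Standardizing a dataset and flagging potential outliers can be sketched as follows. The data values and the cut-off of 3 standard deviations are assumptions chosen for illustration; other cut-offs are also used in practice.

```python
import statistics

# Hypothetical sample: nineteen values near 50 and one unusually large value.
data = [52, 48, 50, 51, 49, 47, 53, 50, 49, 51,
        50, 52, 48, 49, 51, 50, 53, 47, 50, 95]

mean = statistics.fmean(data)
sd = statistics.pstdev(data)          # population standard deviation

# Standardize: the z-scores have mean 0 and standard deviation 1.
z = [(v - mean) / sd for v in data]
z_mean = statistics.fmean(z)
z_sd = statistics.pstdev(z)

threshold = 3.0                       # assumed cut-off for flagging outliers
outliers = [v for v, zi in zip(data, z) if abs(zi) > threshold]

print(f"mean of z-scores: {z_mean:.6f}, s.d. of z-scores: {z_sd:.6f}")
print(f"flagged as potential outliers: {outliers}")
```

Note that the extreme value itself inflates both the mean and the standard deviation used in the calculation, which is why z-scores are sensitive to outliers.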

Limitations of Z-Scores

  • Assumes Normality: Z-scores are most interpretable when the data is normally distributed (a bell-shaped curve). If the data is significantly skewed, the scores might be less informative.
  • Sensitive to Outliers: The presence of extreme outliers can significantly impact the calculation of the mean and standard deviation, which in turn, affects the standard score of all data points.

In conclusion, z-scores are a valuable tool for understanding the relative position of a data point within its dataset. The standard score offers a standardized way to compare data points, identify outliers, and prepare data for statistical analysis. However, it is important to consider the assumptions of normality and the potential influence of outliers when interpreting the z-scores.

Characteristics of Statistics (2020)

The subject of Statistics can be considered from two angles: the data itself and the field of study.

The Characteristics of Statistics as Data

  1. Statistics deals with the behavior of aggregates or large groups of data. It has nothing to do with what is happening to a particular individual or object of the aggregate.
  2. Statistics deals with aggregates of observations of the same kind rather than isolated figures.
  3. Statistics deals with variability that obscures underlying patterns. No two objects in this universe are exactly alike; if they were, there would be no statistical problem.
  4. Statistics deals with uncertainties, as every process of obtaining observations, whether controlled or uncontrolled, involves deficiencies or chance variation. That is why we have to talk in terms of probability.
  5. Statistics deals with characteristics or aspects of things that can be described numerically by counts or measurements.
  6. Statistics deals with aggregates that are subject to several random causes, e.g., the heights of persons are subject to several causes such as race, ancestry, age, diet, habits, climate, etc.
  7. Statistical laws are valid on average or in the long run. There is no guarantee that a certain law will hold in all cases. Statistical inference is therefore made in the face of uncertainty.
  8. Among the important characteristics of Statistics is that statistical results might be misleading and incorrect if sufficient care in collecting, processing, and interpreting the data is not exercised or if the statistical data are handled by someone not well-versed in the subject matter of statistics.

Characteristics of Statistics as a Field:

  • Science and Art: Statistics combines aspects of both science and art. It employs scientific methods for data collection and analysis but also requires interpretation and judgment from the statistician.
  • Use of Methods and Techniques: Statistics is a discipline built on a foundation of well-defined methods and techniques for data analysis, like calculating measures of central tendency or dispersion.
  • Universally Applicable: Statistical methods have widespread applications across various fields, from social sciences and business to engineering and medicine.
  • Focus on Relationships: Statistical analysis goes beyond just summarizing data. It aims to uncover relationships, patterns, and trends within the data set.

By understanding these characteristics of statistics, one can gain a better appreciation of the role statistics plays in various aspects of our world. It’s a discipline that helps us make sense of data, quantify uncertainty, and ultimately gain knowledge from the information we collect.
