Errors in Statistics: A Comprehensive Guide

To learn about errors in statistics, we first need to understand the concepts related to true value, accuracy, and precision. Let us start with these basic concepts.

True Value

The true value is the value that would be obtained if no errors of any kind were made in obtaining the information or in computing the characteristics of the population under study.

The true value of a population can be obtained only if exact procedures are used to collect correct data, every element of the population is covered, and no mistake or even the slightest negligence occurs during data collection and analysis. It is usually regarded as an unknown constant.

Accuracy

Accuracy refers to the difference between the sample result and the true value. The smaller the difference, the greater the accuracy. Accuracy can be increased by:

  • Elimination of technical errors
  • Increasing the sample size

Precision

Precision refers to how closely we can reproduce, from a sample, the results that would be obtained if a complete count (census) was taken using the same method of measurement.
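The distinction between accuracy and precision can be sketched with a few numbers. The example below is only an illustration: the true value of 50 and both lists of repeated estimates are hypothetical, and the spread of repeated estimates is used as a simple stand-in for precision.

```python
from statistics import mean, pstdev

TRUE_VALUE = 50.0  # hypothetical true value, assumed known only for illustration

# Hypothetical repeated estimates from two measurement procedures.
procedure_a = [49.0, 51.5, 48.5, 50.5, 50.5]  # centered on the true value, widely spread
procedure_b = [53.1, 52.9, 53.0, 53.2, 52.8]  # tightly clustered, but off target


def summarize(estimates, true_value):
    """Return (offset of the average estimate from the true value, spread of the estimates)."""
    offset = mean(estimates) - true_value
    spread = pstdev(estimates)
    return offset, spread


offset_a, spread_a = summarize(procedure_a, TRUE_VALUE)
offset_b, spread_b = summarize(procedure_b, TRUE_VALUE)

print(f"Procedure A: offset = {offset_a:+.2f}, spread = {spread_a:.2f}")
print(f"Procedure B: offset = {offset_b:+.2f}, spread = {spread_b:.2f}")
```

Under these assumed numbers, procedure A is on target on average but less reproducible, while procedure B is highly reproducible but consistently off target.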

Errors in Statistics

The difference between an estimated value and the population’s true value is called an error. A sample estimate is used to describe a characteristic of a population, but a sample is only a part of the population and cannot represent it perfectly, no matter how carefully it is selected. An estimate is therefore rarely equal to the true value, and the natural question is how close the sample estimate will be to the population’s true value. There are two kinds of errors:

  • Sampling error (random error)
  • Non-sampling errors (nonrandom errors)

Sampling Errors

A sampling error is the difference between the value of a statistic obtained from an observed random sample and the value of the corresponding population parameter being estimated. Sampling errors occur because of the natural variability between samples. Let $T$ be the sample statistic used to estimate the population parameter $\theta$. The sampling error, denoted by $E$, is

$$E=T-\theta$$

The size of the sampling error reveals the precision of the estimate: the smaller the sampling error, the greater the precision. The sampling error may be reduced in the following ways:

  • By increasing the sample size
  • By improving the sampling design
  • By using the supplementary information

Usually, sampling error arises when a sample is selected from a larger population to make inferences about the whole population.
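The effect of sample size on the sampling error $E = T - \theta$ can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: the population of 10,000 values is synthetic, the statistic $T$ is taken to be the sample mean, and the sample sizes and number of repetitions are arbitrary choices.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 values; theta is its mean.
population = [rng.gauss(100, 15) for _ in range(10_000)]
theta = mean(population)


def sampling_error(sample_size):
    """Draw one simple random sample and return E = T - theta, with T the sample mean."""
    sample = rng.sample(population, sample_size)
    return mean(sample) - theta


def average_absolute_error(sample_size, repetitions=200):
    """Average |E| over repeated samples of the same size."""
    return mean(abs(sampling_error(sample_size)) for _ in range(repetitions))


results = {n: average_absolute_error(n) for n in (10, 100, 1000)}
for n, err in results.items():
    print(f"n = {n:>4}: average |E| = {err:.3f}")
```

In this setup the average size of the sampling error falls as the sample size grows, which matches the first item in the list above.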


Non-Sampling Errors

The errors that are caused by sampling the wrong population of interest and by response bias as well as those made by an investigator in collecting, analyzing, and reporting data are all classified as non-sampling errors (or non-random errors). These errors are present in a complete census as well as in a sampling survey.

Bias

Bias is the difference between the expected value of a statistic and the true value of the parameter being estimated. If $T$ is the sample statistic used to estimate the population parameter $\theta$, the amount of bias is

$$\text{Bias} = E(T) - \theta$$

The bias is positive if $E(T)>\theta$, negative if $E(T)<\theta$, and zero if $E(T)=\theta$. Bias is the systematic component of error: the long-run tendency of the sample statistic to differ from the parameter in a particular direction. Unlike sampling error, bias that arises from faulty selection or measurement is not reduced by increasing the sample size; such errors are cumulative and tend to grow with the number of units observed. If proper methods of selecting the units in a sample are not followed, the sample result will not be free from bias.
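The definition $E(T) - \theta$ can be checked exactly on a toy case by listing every possible sample. The sketch below is an illustration only: the five-value population is hypothetical, samples of size 2 are drawn with replacement, and the parameter $\theta$ is taken to be the population variance, estimated once with divisor $n$ and once with divisor $n-1$.

```python
from itertools import product
from statistics import mean

population = [1, 2, 3, 4, 5]  # hypothetical toy population
n = 2                         # sample size

mu = mean(population)
theta = mean((x - mu) ** 2 for x in population)  # population variance (the parameter)


def variance(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by the given divisor."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / divisor


# Every equally likely ordered sample of size n drawn with replacement.
samples = list(product(population, repeat=n))

# E(T) is the average of the statistic over all possible samples.
expected_div_n = mean(variance(s, n) for s in samples)
expected_div_n_minus_1 = mean(variance(s, n - 1) for s in samples)

bias_div_n = expected_div_n - theta
bias_div_n_minus_1 = expected_div_n_minus_1 - theta

print(f"theta = {theta}")
print(f"divisor n:     E(T) = {expected_div_n}, bias = {bias_div_n}")
print(f"divisor n - 1: E(T) = {expected_div_n_minus_1}, bias = {bias_div_n_minus_1}")
```

For this toy population the variance is 2. The divisor-$n$ statistic has expected value 1, so its bias is negative ($-1$), while the divisor-$(n-1)$ statistic has expected value 2 and zero bias.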

Note that non-sampling errors can be difficult to identify and quantify, and their presence can significantly reduce the accuracy of statistical results. By understanding and addressing these errors, researchers can improve the reliability and validity of their statistical findings.

