# Category: Basic Statistics

## Errors of Measurement

Experience shows that a continuous variable can never be measured with perfect accuracy, because of the observer's habits and practices, the measurement methods (techniques), the instruments (or devices) used, and so on. Measurements are therefore always recorded correct to the nearest unit and hence are of limited accuracy. The actual or true values are, however, assumed to exist.

For example, if the weight of a student is recorded as 60 kg (correct to the nearest kilogram), his/her true (actual) weight may, in fact, lie anywhere between 59.5 kg and 60.5 kg. A weight recorded as 60.00 kg for that student means the true weight is known to lie between 59.995 kg and 60.005 kg. Thus, there is a difference, however small, between the measured value and the true value. This sort of departure from the true value is technically known as the error of measurement. In other words, if the observed value and the true value of a variable are denoted by $x$ and $x + \varepsilon$, respectively, then the difference $(x + \varepsilon) - x=\varepsilon$ is the error. This error is expressed in the unit of measurement of $x$ and is, therefore, called an absolute error.

An absolute error divided by the true value is called the relative error. Thus the relative error can be measured as $\frac{\varepsilon}{x+\varepsilon}$. Multiplying this relative error by 100 gives the percentage error. These errors are independent of the units of measurement of $x$. It ought to be noted that an error has both magnitude and direction, and that the word error in statistics does not mean a mistake, which is a chance inaccuracy.
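As a small illustration of these definitions, the sketch below computes the absolute, relative, and percentage errors for a single reading. The function name `measurement_errors` and the true weight of 60.3 kg are assumptions made up for the example; in practice the true value is not known.

```python
def measurement_errors(observed, true_value):
    """Return (absolute, relative, percentage) error of one reading.

    Follows the convention in the text: error = true value - observed value.
    """
    absolute = true_value - observed      # carries the unit of measurement
    relative = absolute / true_value      # unit-free
    percentage = 100 * relative           # unit-free
    return absolute, relative, percentage


# Hypothetical reading: recorded 60 kg, assumed true weight 60.3 kg.
abs_err, rel_err, pct_err = measurement_errors(observed=60.0, true_value=60.3)
print(f"absolute error:   {abs_err:.3f} kg")
print(f"relative error:   {rel_err:.5f}")
print(f"percentage error: {pct_err:.3f} %")

# The same weights expressed in grams: the absolute error changes with the
# unit, but the relative error does not.
abs_g, rel_g, _ = measurement_errors(observed=60000.0, true_value=60300.0)
```

Here the absolute error is 0.3 kg (or 300 g), while the relative error is the same in both units, which is what "independent of the units of measurement" means.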

An error is said to be biased when the observed value is consistently higher or consistently lower than the true value. Biased errors arise from the personal limitations of the observer, imperfections in the instruments used, or some other conditions that control the measurements. These errors are not revealed by repeating the measurements. They are cumulative in nature, that is, the greater the number of measurements, the greater the magnitude of the total error. They are thus more troublesome. These errors are also called cumulative or systematic errors.

An error, on the other hand, is said to be unbiased when deviations above and below the true value tend to occur equally often. Unbiased errors tend to cancel out in the long run. These errors are therefore compensating and are also known as random errors or accidental errors.
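The contrast between compensating and cumulative errors can be seen in a short simulation. This is only a sketch under assumed values: the true weight, the instrument offset `BIAS`, and the normally distributed noise are all hypothetical choices made for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

TRUE_VALUE = 60.0  # hypothetical true weight in kg
BIAS = 0.5         # hypothetical instrument offset in kg
N = 10_000         # number of repeated measurements

# Unbiased (random) errors: readings fall above and below the true value
# equally often.
random_readings = [TRUE_VALUE + random.gauss(0, 1) for _ in range(N)]

# Biased (systematic) errors: every reading is shifted in the same direction.
biased_readings = [TRUE_VALUE + BIAS + random.gauss(0, 1) for _ in range(N)]

# Deviations of readings from the true value (the sign convention does not
# matter for this comparison).
mean_random_error = sum(random_readings) / N - TRUE_VALUE
mean_biased_error = sum(biased_readings) / N - TRUE_VALUE
total_biased_error = sum(r - TRUE_VALUE for r in biased_readings)

print(f"mean deviation, random errors: {mean_random_error:+.3f} kg")
print(f"mean deviation, biased errors: {mean_biased_error:+.3f} kg")
print(f"total deviation, biased errors: {total_biased_error:+.1f} kg")
```

The mean deviation of the unbiased readings stays close to zero, while the biased readings keep an average deviation near the offset no matter how many measurements are taken, and their total deviation grows with `N`.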

We can reduce errors of measurement by:

• Double-checking all measurements for accuracy
• Double-checking that the formulas are correct
• Making sure observers and measurement takers are well trained
• Making the measurements with the instrument that has the highest precision
• Taking the measurements under controlled conditions
• Pilot testing the measuring instruments
• Using multiple measures for the same construct

## Quantitative and Qualitative Variables: Data

The word “data” is used in many contexts and appears frequently in ordinary conversation. Data is Latin for “those that are given” (the singular form is “datum”). Data may therefore be thought of as the results of observation. In this post, we will also look at quantitative and qualitative variables.

Data are collected in many aspects of everyday life.

• Statements given to a police officer or physician or psychologist during an interview are data.
• So are the correct and incorrect answers given by a student on a final examination.
• Almost any athletic event produces data, such as the time required by a runner to complete a marathon.
• The number of spelling errors committed by a computer operator in typing a letter is also data.

Data are also obtained in the course of scientific inquiry:

• The positions of artifacts and fossils in an archaeological site.
• The number of interactions between two members of an animal colony during a period of observation.
• The spectral composition of light emitted by a star.

Data comprise variables. A variable is a characteristic that changes from time to time, place to place, and/or person to person. Variables may be classified as quantitative or qualitative according to the form of the characteristics they represent.

A variable is called a quantitative variable when the characteristic can be expressed numerically, such as age, weight, income, or number of children. That is, quantitative variables are those that can be quantified or measured with some measurement device or scale (such as a weighing machine, a thermometer, or a standardized container for measuring liquids).

On the other hand, if the characteristic is non-numerical, such as education, sex, eye color, quality, intelligence, poverty, satisfaction, etc., the variable is referred to as a qualitative variable. A qualitative characteristic is also called an attribute. An individual or an object with such a characteristic can be counted or enumerated after having been assigned to one of several mutually exclusive classes or categories (or groups).

Mathematically, a quantitative variable may be classified as discrete or continuous. A discrete variable is one that can take only a discrete set of values, usually integers or whole numbers; that is, its values are taken by jumps or breaks. A discrete variable represents count data such as the number of persons in a family, the number of rooms in a house, the number of deaths in an accident, the income of an individual (counted in whole currency units), etc.

A variable is called a continuous variable if it can take on any value (fractional or integral) within a given interval, that is, its domain is an interval containing all possible values without gaps. A continuous variable represents measurement data such as the age of a person, the height of a plant, the weight of a commodity, the temperature at a place, etc.

A variable, whether countable or measurable, is generally denoted by a symbol such as $X$ or $Y$, and $X_i$ or $X_j$ represents the $i$th or $j$th value of the variable. The subscript $i$ or $j$ is replaced by a number such as $1,2,3, \cdots, n$ when referring to a particular value.

## Qualitative and Quantitative Research

In this post, we will discuss qualitative and quantitative research, which involve collecting data based on qualities and quantities, respectively.

Qualitative Research

Qualitative research involves collecting data from in-depth interviews, observations, field notes, open-ended questions in questionnaires, etc. The researcher is the primary data collection instrument, and the data may be collected in the form of words, images, patterns, etc. For qualitative research, data analysis involves searching for patterns, themes, and holistic features. Results of such research are likely to be context-specific, and reporting takes the form of a narrative with contextual description and direct quotations from research participants.

Quantitative Research

Quantitative research involves collecting quantitative data based on precise measurement using structured, reliable, and validated collection instruments (such as questionnaires) or archival data sources. Quantitative data take the form of variables, and their analysis involves establishing statistical relationships. If properly done, the results of such research are generalizable to the entire population. Quantitative research can be classified into two groups depending on the data collection methodology:

Experimental Research

The main purpose of experimental research is to establish a cause-and-effect relationship. The defining characteristics of experimental research are the active manipulation of independent variables and the random assignment of participants to the conditions being manipulated; everything else should be kept as similar and as constant as possible. The way an experiment is conducted is described by its design, called the design of the experiment. There are two main types of experimental design.

Within-Subject Design
In a within-subject design, the same group of subjects serves in more than one treatment.

Between-Subjects Design
In a between-subjects design, two or more groups of subjects are tested simultaneously, each group under a different testing factor.

Non-Experimental Research

Non-experimental research is commonly used in sociology, political science, and management disciplines. This kind of research is often done with the help of a survey. There is no random assignment of participants to a particular group, nor are the independent variables manipulated. As a result, one cannot establish a cause-and-effect relationship through non-experimental research. There are two approaches to analyzing such data:

Tests for significant differences across groups, such as the IQ levels of participants from different ethnic backgrounds.

Tests for significant association between two factors such as firm sales and advertising expenditure.
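As a rough illustration of the second approach, the sketch below measures the strength of a linear association with Pearson's correlation coefficient. The advertising and sales figures are hypothetical numbers invented for the example, and the helper `pearson_r` is not taken from any particular library. The coefficient only describes the association; a formal significance test would be a further step, and in a non-experimental setting a strong association still does not establish cause and effect.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical survey data for six firms (arbitrary currency units).
advertising = [10, 12, 15, 18, 20, 25]
sales = [100, 110, 130, 150, 155, 190]

r = pearson_r(advertising, sales)
print(f"correlation between advertising and sales: r = {r:.3f}")
```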

## Skewness and Measures of Skewness

If the curve is symmetrical, a deviation below the mean exactly equals the corresponding deviation above the mean. This is called symmetry. Here, we will discuss Skewness and Measures of Skewness.

Skewness is the degree of asymmetry, or departure from symmetry, of a distribution. A distribution has positive skewness when the tail on the right side of the distribution is longer or fatter than the tail on the left side; in that case, the mean and median will typically be greater than the mode. A distribution has negative skewness when the tail on the left side of the distribution is longer or fatter than the tail on the right side; the mean and median will then typically be less than the mode.

Measures of Skewness

Karl Pearson Measure of Relative Skewness
In a symmetrical distribution, the mean, median, and mode coincide. In skewed distributions, these values are pulled apart; the mean tends to be on the same side of the mode as the longer tail. Thus, a measure of the asymmetry is supplied by the difference ($\text{mean} - \text{mode}$). This can be made dimensionless by dividing by a measure of dispersion (such as the standard deviation, SD). The Karl Pearson measure of relative skewness is
$$\text{SK} = \frac{\text{Mean}-\text{mode}}{SD} =\frac{\overline{x}-\text{mode}}{s}$$
The value of skewness may be either positive or negative.

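Because the mode is often difficult to locate, an empirical formula for skewness (called Pearson's second coefficient of skewness) is used instead. It rests on the empirical relation $\text{mean}-\text{mode} \approx 3(\text{mean}-\text{median})$, which holds for moderately skewed distributions:

$$\text{SK} = \frac{3(\text{mean}-\text{median})}{SD}=\frac{3(\overline{x}-\text{median})}{s}$$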
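Both Pearson measures can be computed with Python's standard `statistics` module, as in the sketch below. The sample values are hypothetical, and the choice of the sample standard deviation (`statistics.stdev`) for $s$ is an assumption, since the text does not say which form of the SD is meant; `statistics.pstdev` could be substituted.

```python
import statistics


def pearson_skewness(data):
    """Return Pearson's first (mode-based) and second (median-based)
    coefficients of skewness for a sample."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    mode = statistics.mode(data)   # assumes the sample has a single clear mode
    s = statistics.stdev(data)     # assumption: sample standard deviation
    sk_mode = (mean - mode) / s
    sk_median = 3 * (mean - median) / s
    return sk_mode, sk_median


# Hypothetical sample with a longer right tail.
sample = [2, 3, 3, 3, 4, 5, 6, 8, 11]
sk_mode, sk_median = pearson_skewness(sample)
print(f"(mean - mode) / s       = {sk_mode:.3f}")
print(f"3 (mean - median) / s   = {sk_median:.3f}")
```

For this sample the mean is 5, the median is 4, and the mode is 3, so both coefficients are positive, in line with the longer right tail.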

Bowley Measure of Skewness
In a symmetrical distribution, the quartiles are equidistant from the median ($Q_2-Q_1 = Q_3-Q_2$). If the distribution is not symmetrical, the quartiles will not be equidistant from the median (unless the entire asymmetry is located in the extreme quarters of the data). The measure of skewness suggested by Bowley is

$$\text{Quartile Coefficient of SK} = \frac{(Q_3-Q_2)-(Q_2-Q_1)}{Q_3-Q_1}=\frac{Q_3-2Q_2+Q_1}{Q_3-Q_1}$$

This measure is always zero when the quartiles are equidistant from the median and is positive when the upper quartile is farther from the median than the lower quartile. This measure of skewness varies between $+1$ and $-1$.
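A minimal sketch of the quartile coefficient is given below. It uses `statistics.quantiles` with its default "exclusive" method to obtain the quartiles; other quartile conventions exist and can give slightly different values. The sample is hypothetical.

```python
import statistics


def bowley_skewness(data):
    """Quartile (Bowley) coefficient of skewness."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    q1, q2, q3 = statistics.quantiles(data, n=4)
    return (q3 - 2 * q2 + q1) / (q3 - q1)


# Hypothetical sample with a longer right tail.
sample = [2, 3, 3, 3, 4, 5, 6, 8, 11]
bowley = bowley_skewness(sample)
print(f"Bowley coefficient of skewness = {bowley:.3f}")
```

With this quartile method the sample has $Q_1=3$, $Q_2=4$, and $Q_3=7$, so the coefficient is $(7-8+3)/(7-3)=0.5$: the upper quartile lies farther from the median than the lower quartile.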

Moment Coefficient of Skewness
In any symmetrical curve, the sum of odd powers of deviations from the mean is equal to zero. That is, $m_3=m_5=m_7=\cdots=0$, where $m_r$ denotes the $r$th moment about the mean. However, this is not true for asymmetrical distributions. For this reason, a measure of skewness is devised on the basis of $m_3$. That is

\begin{align}
\text{Moment Coefficient of SK}&= a_3=\frac{m_3}{s^3}=\frac{m_3}{\sqrt{m_2^3}}\\
b_1&=a_3^2=\frac{m_3^2}{m_2^3}
\end{align}

For perfectly symmetrical curves (such as the normal curve), $a_3$ and $b_1$ are zero.
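The moment coefficient can be computed directly from its definition, as in the sketch below. The function name and the samples are hypothetical; the moments are taken about the mean with divisor $n$, so that $s^3=\sqrt{m_2^3}$ as in the formula above.

```python
def moment_skewness(data):
    """Return (a3, b1), the moment coefficients of skewness."""
    n = len(data)
    mean = sum(data) / n
    # Second and third moments about the mean (divisor n).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    a3 = m3 / m2 ** 1.5
    b1 = m3 ** 2 / m2 ** 3   # equals a3 squared
    return a3, b1


# Hypothetical right-skewed sample and a symmetrical sample.
a3_skewed, b1_skewed = moment_skewness([2, 3, 3, 3, 4, 5, 6, 8, 11])
a3_symmetric, b1_symmetric = moment_skewness([1, 2, 3, 4, 5])
print(f"skewed sample:      a3 = {a3_skewed:.3f}, b1 = {b1_skewed:.3f}")
print(f"symmetrical sample: a3 = {a3_symmetric:.3f}, b1 = {b1_symmetric:.3f}")
```

Note that $b_1$, being a square, cannot be negative, so only $a_3$ indicates the direction of the skewness.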