Contingency Tables

Introduction to Contingency Tables

Contingency tables, also called cross tables or two-way frequency tables, describe the relationship between two or more categorical (qualitative) variables. A bivariate relationship is defined by the joint distribution of the two associated random variables.

Contingency Tables

Let $X$ and $Y$ be two categorical response variables, where $X$ has $I$ levels and $Y$ has $J$ levels. There are $I \times J$ possible combinations of classifications. The response $(X, Y)$ of a subject randomly chosen from some population has a probability distribution, which can be shown in a rectangular table having $I$ rows (for the categories of $X$) and $J$ columns (for the categories of $Y$).

The cells of this rectangular table represent the $IJ$ possible outcomes. The cell probability $\pi_{ij}$ denotes the probability that $(X, Y)$ falls in the cell in row $i$ and column $j$. When the cells contain frequency counts of outcomes, the table is called a contingency table or cross-classification table, and it is referred to as an $I$ by $J$ ($I \times J$) table.

Joint and Marginal Distribution

The probability distribution $\{\pi_{ij}\}$ is the joint distribution of $X$ and $Y$. The marginal distributions are the row and column totals obtained by summing the joint probabilities. For the row variable ($X$), the marginal probability is denoted by $\pi_{i+}$, and for the column variable ($Y$), it is denoted by $\pi_{+j}$, where the subscript “+” denotes the sum over the index it replaces; that is, $\pi_{i+}=\sum_j \pi_{ij}$ and $\pi_{+j}=\sum_i \pi_{ij}$, satisfying

$\sum_{i} \pi_{i+} =\sum_{j} \pi_{+j} = \sum_i \sum_j \pi_{ij}=1$

Note that the marginal distributions carry single-variable information only; they say nothing about the association between the variables.
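As a minimal sketch of these definitions, the following Python snippet turns a table of counts into the joint distribution and its two marginal distributions. The $2 \times 3$ counts are hypothetical and chosen only for illustration.

```python
# Hypothetical 2 x 3 table of counts n_ij
# (rows: levels of X, columns: levels of Y).
counts = [
    [20, 30, 50],
    [30, 30, 40],
]

n = sum(sum(row) for row in counts)  # total sample size

# Joint distribution: pi_ij = n_ij / n
joint = [[cell / n for cell in row] for row in counts]

# Row marginals pi_{i+}: sum over the column index j
row_marginals = [sum(row) for row in joint]

# Column marginals pi_{+j}: sum over the row index i
col_marginals = [sum(col) for col in zip(*joint)]

print("joint:", joint)
print("row marginals:", row_marginals)
print("column marginals:", col_marginals)
print("total probability:", sum(row_marginals))
```

Both sets of marginals sum to 1 (up to floating-point rounding), matching the identity above.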


In many contingency tables, one variable (say, $Y$) is a response, and the other ($X$) is an explanatory variable. When $X$ is fixed rather than random, the notion of a joint distribution for $X$ and $Y$ is no longer meaningful. However, for a fixed level of $X$, the variable $Y$ has a probability distribution. It is then of interest to study how this conditional distribution of $Y$ changes as the level of $X$ changes.
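A short sketch of this idea: dividing each row of counts by its row total gives the distribution of $Y$ at each fixed level of $X$. The counts below are hypothetical, and the variable names are illustrative only.

```python
# Hypothetical 2 x 3 table of counts
# (rows: fixed levels of X, columns: levels of the response Y).
counts = [
    [20, 30, 50],
    [30, 30, 40],
]

# Conditional distribution of Y within each row: n_ij / n_{i+}
conditional = [[cell / sum(row) for cell in row] for row in counts]

for i, dist in enumerate(conditional, start=1):
    print(f"X at level {i}:", [round(p, 3) for p in dist])
```

If the rows of `conditional` were identical, the distribution of $Y$ would not depend on the level of $X$; here they differ slightly between the two rows.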

Contingency Table Uses

  • Identify relationships between categorical variables.
  • See whether one variable is independent of the other (i.e., whether the distribution of one variable across its categories is the same regardless of the other variable’s category).
  • Calculate probabilities of specific combinations occurring.
  • Provide a stepping stone for further statistical analysis, such as chi-square tests, to determine whether the observed relationship between the variables is statistically significant.
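The chi-square test mentioned in the last point can be sketched as follows. This is a minimal illustration using the standard Pearson statistic, with expected counts computed under independence as (row total × column total) / n; the counts are hypothetical.

```python
# Hypothetical 2 x 3 table of observed counts.
counts = [
    [20, 30, 50],
    [30, 30, 40],
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
n = sum(row_totals)

# Expected counts if X and Y were independent.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum(
    (obs - exp) ** 2 / exp
    for obs_row, exp_row in zip(counts, expected)
    for obs, exp in zip(obs_row, exp_row)
)

# Degrees of freedom for an I x J table: (I - 1)(J - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print(f"chi-square = {chi_square:.3f}, df = {df}")
```

For these hypothetical counts, the statistic is about 3.11 on 2 degrees of freedom, which is below the 5% critical value of 5.991, so independence would not be rejected.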


Errors in Measurement

Errors in Measurement: Experience shows that a continuous variable can never be measured at its perfect (true) value because of the observer’s habits and practices, the measurement methods (techniques), the instruments (or devices) used, etc. Measurements are therefore always recorded correct to the nearest unit and hence are of limited accuracy. The actual values are, however, assumed to exist.

Errors in Measurement Example

For example, if the weight of a student is recorded as 60 kg (correct to the nearest kilogram), his/her true (actual) weight may lie anywhere between 59.5 kg and 60.5 kg. A weight recorded as 60.00 kg for that student means the true weight is known to lie between 59.995 kg and 60.005 kg.

Thus, there is a difference, however small, between the measured value and the true value. This departure from the true value is technically known as an error of measurement. In other words, if the observed value and the true value of a variable are denoted by $x$ and $x + \varepsilon$, respectively, then the difference $(x + \varepsilon) - x=\varepsilon$ is the error. This error is expressed in the unit of measurement of $x$ and is, therefore, called an absolute error.

An absolute error divided by the true value is called the relative error. Thus, the relative error can be measured as $\frac{\varepsilon}{x+\varepsilon}$. Multiplying the relative error by 100 gives the percentage error. Relative and percentage errors are independent of the unit of measurement of $x$. Note that an error has both magnitude and direction, and that the word error in statistics does not mean a mistake, which is a chance inaccuracy.
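The three quantities can be computed directly from these definitions. In the sketch below, the observed weight of 60 kg comes from the example above, while the true weight of 60.4 kg is a hypothetical value assumed only for illustration (in practice the true value is unknown); the function name is also illustrative.

```python
def measurement_errors(observed, true_value):
    """Return (absolute, relative, percentage) error.

    With observed = x and true_value = x + eps:
      absolute   = eps = true_value - observed
      relative   = eps / (x + eps)
      percentage = 100 * relative
    """
    absolute = true_value - observed
    relative = absolute / true_value
    percentage = 100 * relative
    return absolute, relative, percentage


# Observed 60 kg; the true value 60.4 kg is an assumption for this example.
abs_err, rel_err, pct_err = measurement_errors(observed=60.0, true_value=60.4)

print(f"absolute error:   {abs_err:.3f} kg")
print(f"relative error:   {rel_err:.5f}")
print(f"percentage error: {pct_err:.3f} %")
```

The sign of the absolute error carries the direction: a positive value here means the observed value falls short of the true value.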


An error is said to be biased when the observed value is consistently higher or consistently lower than the true value. Biased errors arise from the personal limitations of the observer, the imperfection in the instruments used, or some other conditions that control the measurements. These errors are not revealed by repeating the measurements. They are cumulative, that is, the greater the number of measurements, the greater the magnitude of the total error. They are thus more troublesome. These errors are also called cumulative or systematic errors.

An error, on the other hand, is said to be unbiased when deviations above and below the true value tend to occur equally often. Unbiased errors tend to cancel out in the long run. These errors are therefore compensating and are also known as random errors or accidental errors.
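The contrast between the two kinds of error can be illustrated with a small simulation. This is only a sketch: the constant offset, the normally distributed noise, and all numeric values are assumptions made for the example, not part of the definitions above.

```python
import random

random.seed(42)  # fixed seed so the run is repeatable

BIAS = 0.2       # hypothetical constant offset of a miscalibrated scale, in kg
NOISE_SD = 0.5   # hypothetical spread of the random error, in kg
N = 10_000       # number of repeated measurements

# Unbiased errors: deviations above and below zero are equally likely.
unbiased_errors = [random.gauss(0.0, NOISE_SD) for _ in range(N)]

# Biased errors: the same kind of noise plus a constant offset.
biased_errors = [BIAS + random.gauss(0.0, NOISE_SD) for _ in range(N)]

for label, errors in (("unbiased", unbiased_errors), ("biased", biased_errors)):
    total = sum(errors)
    print(f"{label}: total error = {total:.1f} kg, mean error = {total / N:.3f} kg")
```

Under these assumptions, the mean of the unbiased errors stays close to zero, while the biased errors average close to the offset, so their total keeps growing with the number of measurements.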


We can reduce errors in measurement by:

  • Double-checking all measurements for accuracy
  • Double-checking that the formulas are correct
  • Making sure observers and measurement takers are well-trained
  • Measuring with the instrument that has the highest precision
  • Taking the measurements under controlled conditions
  • Pilot testing the measuring instruments
  • Using multiple measures for the same construct

Types of Errors: Errors can be classified into two main categories:

  • Random Errors: These are variations in the reading or recording caused by limitations of the instrument being used, the environment, or the person taking the measurement. They are random by nature and fluctuate slightly above or below the true value with each measurement.
  • Systematic Errors: These are consistent errors that cause your measurements to deviate from the true value in a predictable way. For example, a ruler with a slightly inaccurate scale introduces a systematic error into every measurement made with it.

Estimation, Approximating a Precise Value

Estimation (approximating a precise value) is very useful, especially when someone wishes to know whether he/she has arrived at a reasonable solution to a problem under study. For instance, it is useful to know how to estimate the total of a bill to avoid overpaying at the counter. One can estimate the total of a shop (supermarket) receipt by rounding the amount of each item to the nearest half unit (0.50) and keeping a running total mentally from the first item to the last.

Estimation of a Shop Receipt

Suppose the following is a shop receipt, showing each item’s actual amount, its estimated amount (rounded to the nearest 0.50), and the running total of the estimates.

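| Shop Item | Actual Amount | Estimated Amount | Running Total |
|---|---|---|---|
| Item 1 | 4.50 | 4.50 | 4.50 |
| Item 2 | 3.50 | 3.50 | 8.00 |
| Item 3 | 1.30 | 1.50 | 9.50 |
| Item 4 | 0.60 | 0.50 | 10.00 |
| Item 5 | 2.95 | 3.00 | 13.00 |
| Item 6 | 2.85 | 3.00 | 16.00 |
| Item 7 | 1.60 | 1.50 | 17.50 |
| Item 8 | 2.75 | 3.00 | 20.50 |
| Item 9 | 2.40 | 2.50 | 23.00 |
| Total | 22.45 | 23.00 | |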

From the above example, it can be observed that estimation is the process of finding an approximate value. It saves time and gives a result close to the exact value (here, 23 against an actual total of 22.45). An estimate can be an overestimate (when it exceeds the actual value) or an underestimate (when it falls short of the actual value).
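The receipt calculation can be reproduced with a few lines of Python. This is a sketch under two assumptions: amounts exactly halfway between two halves (such as 2.75) are rounded up, as in the table, and `Decimal` is used to avoid floating-point surprises with money. The helper name `round_to_half` is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to_half(amount):
    """Round a Decimal amount to the nearest 0.50 (ties are rounded up)."""
    half = Decimal("0.5")
    return (amount / half).quantize(Decimal("1"), rounding=ROUND_HALF_UP) * half


# Actual amounts of Items 1 to 9 from the receipt.
actual = [Decimal(s) for s in
          ("4.50", "3.50", "1.30", "0.60", "2.95", "2.85", "1.60", "2.75", "2.40")]

running_total = Decimal("0")
for number, amount in enumerate(actual, start=1):
    estimate = round_to_half(amount)
    running_total += estimate
    print(f"Item {number}: actual {amount}, estimate {estimate}, running total {running_total}")

actual_total = sum(actual)
print(f"Actual total: {actual_total}, estimated total: {running_total}")

kind = "overestimate" if running_total > actual_total else "underestimate or exact"
print(f"The estimate is an {kind}.")
```

Here the estimated total exceeds the actual total by 0.55, so it is an overestimate.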


In some cases, you can estimate by rounding all of the numbers you are working with to the nearest 10 (or 100 or 1000) and then doing the necessary calculations. In everyday life, estimation offers an easier and faster way to get a rough answer before you solve a problem exactly. It helps you to determine whether your answer is reasonable. Estimation is also useful when you need an approximate amount instead of a precise value.
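As a small illustration of rounding to the nearest 10 before adding, consider the sketch below. The amounts are hypothetical, and the helper `round_to_nearest` is an illustrative name that handles non-negative whole numbers only.

```python
def round_to_nearest(value, base=10):
    """Round a non-negative integer to the nearest multiple of base (ties go up)."""
    return (value + base // 2) // base * base


amounts = [48, 152, 297]  # hypothetical amounts to be added

exact_total = sum(amounts)
estimated_total = sum(round_to_nearest(a) for a in amounts)

print("rounded amounts:", [round_to_nearest(a) for a in amounts])
print("estimated total:", estimated_total)
print("exact total:", exact_total)
```

The rounded amounts (50, 150, 300) are easy to add mentally, and the estimate of 500 is close enough to the exact total of 497 to confirm that the exact answer is reasonable.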
