# Basic Statistics and Data Analysis

## Classical Probability: Example, Definition, and Uses in Life

Classical probability is the statistical concept that measures the likelihood (probability) of an event occurring. In the classical sense, every outcome of a statistical experiment is assumed to be equally likely (each has an equal chance of occurring). Classical probability is therefore the simplest form of probability, applying only when all outcomes have equal odds of happening.

## Classical Probability Examples

Example 1: The typical example of classical probability is rolling a fair die, because each of the six faces (1, 2, 3, 4, 5, or 6) is equally likely to land on top.

Example 2: Another example of classical probability is tossing an unbiased coin: there is an equal probability that the toss will yield a head or a tail.

Example 3: In selecting bingo balls, each numbered ball has an equal chance of being chosen.

Example 4: Guessing on a multiple-choice question (MCQ) with (say) four possible answers A, B, C, or D. Each option (choice) has the same odds of being picked (assuming you pick randomly and do not follow any pattern).

## Formula for Classical Probability

The probability of a simple event is the number of ways the event can happen, divided by the total number of possible outcomes.

Mathematically $P(A) = \frac{f}{N}$,

where $P(A)$ means "probability of event $A$" (event $A$ is whatever event you are looking for, like winning the lottery; it is the event of interest), $f$ is the frequency, the number of outcomes favourable to the event, and $N$ is the total number of possible outcomes.

For example, the probability of rolling a 2 on a fair die is one out of six (1/6). In other words, one favourable outcome (there is only one way to roll a 2 on a fair die) divided by the six possible outcomes.

Classical probability can be used for very basic events, like rolling a die or tossing a coin; more generally, it applies whenever all outcomes are equally likely. Choosing a card from a standard deck of cards gives you a 1/52 chance of getting any particular card, no matter which card you choose. On the other hand, whether it will rain tomorrow is not something you can work out with this basic type of probability: there might be a 15% chance of rain (and therefore an 85% chance of no rain), and those outcomes are not equally likely.
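The examples above reduce to a single division, which can be sketched in R (the event probabilities are the ones just discussed; the variable names are illustrative only):

```r
# Classical probability: favourable outcomes / total possible outcomes
p_roll_2 <- 1 / 6    # rolling a 2 on a fair die
p_head   <- 1 / 2    # a head on an unbiased coin
p_card   <- 1 / 52   # one particular card from a standard deck
p_ace    <- 4 / 52   # any ace: four favourable outcomes out of 52
p_ace                # simplifies to 1/13
```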

## Other Examples of Classical Probability

There are many other examples of classical probability problems besides rolling dice. These include flipping coins, drawing cards from a deck, guessing on a multiple-choice test, selecting jellybeans from a bag, and choosing people for a committee.

## When Classical Probability Cannot be Used

Dividing the number of favourable outcomes by the number of possible outcomes is very simplistic, and it is not suited to finding probabilities in many situations. For example, naturally varying quantities like weights, heights, and test scores require the normal distribution to calculate probabilities. In fact, most "real life" events are not simple ones involving coins, cards, or dice, and you will need something more sophisticated than classical probability theory to handle them.

## Binomial Experiment

A statistical experiment consisting of a fixed number (say $n$) of successive independent trials, each having two possible outcomes (such as success and failure; true and false; yes and no; right and wrong) with the probability of success equal for every trial, is called a Binomial Experiment. Each trial of a Binomial Experiment is known as a Bernoulli trial (a single performance of the experiment). A Binomial Experiment has four properties.

1. Each trial of Binomial Experiment can be classified as success or failure.
2. The probability of success for each trial of the experiment is equal.
3. Successive trials are independent; that is, the outcome of one trial does not affect the outcome of any other trial.
4. The experiment is repeated a fixed number of times.

## Binomial Probability Distribution

Let $X$ be a discrete random variable that denotes the number of successes in a Binomial Experiment (we call this a binomial random variable). The random variable assumes the isolated values $X=0,1,2,\cdots,n$. The probability distribution of a binomial random variable is termed the binomial probability distribution. It is a discrete probability distribution.

## Binomial Probability Mass Function

The probability function of the binomial distribution is also called the binomial probability mass function and is denoted by $b(x, n, p)$: a binomial distribution of random variable $X$ with parameters $n$ (the given number of trials) and $p$ (the probability of success). If $p$ is the probability of success (and $q=1-p$ is the probability of failure, so that $p+q=1$), then the probability of exactly $x$ successes can be found from the following formula,

\begin{align}
b(x, n, p) &= P(X=x)\\
&=\binom{n}{x} p^x q^{n-x}, \quad x=0,1,2, \cdots, n
\end{align}

where $p$ is the probability of success in a single trial, $q$ is the probability of failure, and $n$ is the number of independent trials.

The formula gives the probability for each possible value of the binomial random variable $X$ for given $n$ and $p$. Note that it assigns no probability to $X<0$ or $X>n$. The binomial distribution is suitable when $n$ is small and applies when sampling is done with replacement.

$b(x, n, p) = \binom{n}{x} p^x q^{n-x}, \quad x=0,1,2,\cdots,n,$

is called the Binomial distribution because its successive terms are exactly the same as the terms of the binomial expansion of

\begin{align}
(q+p)^n=\binom{n}{0} p^0 q^{n-0}+\binom{n}{1} p^1 q^{n-1}+\cdots+\binom{n}{n-1} p^{n-1} q^{n-(n-1)}+\binom{n}{n} p^n q^{n-n}
\end{align}

$\binom{n}{0}, \binom{n}{1}, \binom{n}{2},\cdots, \binom{n}{n-1}, \binom{n}{n}$ are called Binomial coefficients.

Note that it is necessary to state the range of the random variable ($x=0,1,2,\cdots,n$); otherwise the expression is only a mathematical equation, not a probability distribution.
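As a quick check, the probability mass function above can be evaluated directly in R and compared with the built-in `dbinom()` function (a small sketch; the values $n=5$ and $p=0.5$ are chosen purely for illustration):

```r
# Binomial pmf b(x, n, p) computed from the formula and with dbinom()
n <- 5
p <- 0.5
q <- 1 - p
x <- 0:n                                   # the full range of the random variable

manual  <- choose(n, x) * p^x * q^(n - x)  # binomial coefficient times p^x q^(n-x)
builtin <- dbinom(x, size = n, prob = p)   # R's built-in binomial pmf

all.equal(manual, builtin)   # TRUE: formula matches dbinom()
sum(manual)                  # 1: the probabilities over x = 0..n sum to one
```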

# Covariance and Correlation

Covariance measures the degree to which two variables co-vary (i.e., vary together). If the greater values of one variable (say, $X_i$) correspond with the greater values of the other variable (say, $X_j$), i.e., the variables tend to show similar behaviour, then the covariance between the two variables ($X_i$, $X_j$) will be positive. Similarly, if the smaller values of one variable correspond with the smaller values of the other variable, the covariance will again be positive. In contrast, if the greater values of one variable (say, $X_i$) mainly correspond with the smaller values of the other variable (say, $X_j$), i.e., the variables tend to show opposite behaviour, then the covariance will be negative.

In other words, a positive covariance between two variables means they vary together in the same direction relative to their expected values (averages): if one variable moves above its average value, the other variable tends to be above its average value too. Similarly, a negative covariance between the two variables means that when one variable tends to be above its expected value, the other tends to be below its expected value. A covariance of zero means there is no linear dependency between the two variables. Mathematically, the covariance between two random variables $X_i$ and $X_j$ can be represented as
$COV(X_i, X_j)=E[(X_i-\mu_i)(X_j-\mu_j)]$
where
$\mu_i=E(X_i)$ is the average of the first variable
$\mu_j=E(X_j)$ is the average of the second variable

\begin{aligned}
COV(X_i, X_j)&=E[(X_i-\mu_i)(X_j-\mu_j)]\\
&=E[X_i X_j - X_i E(X_j)-X_j E(X_i)+E(X_i)E(X_j)]\\
&=E(X_i X_j)-E(X_i)E(X_j) - E(X_j)E(X_i)+E(X_i)E(X_j)\\
&=E(X_i X_j)-E(X_i)E(X_j)
\end{aligned}

Note that the covariance of a random variable with itself is the variance of that random variable, i.e., $COV(X_i, X_i)=VAR(X_i)$. If $X_i$ and $X_j$ are independent, then $E(X_i X_j)=E(X_i)E(X_j)$ and $COV(X_i, X_j)=E(X_i X_j)-E(X_i) E(X_j)=0$.
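The identities above can be verified numerically in R on simulated data (a sketch; the sample size and the relationship between the two series are arbitrary choices). Note that R's `cov()` uses the sample divisor $n-1$, so it must be rescaled to match the population-style formula $E(X_i X_j)-E(X_i)E(X_j)$:

```r
set.seed(1)
xi <- rnorm(1000)                 # first variable
xj <- 2 * xi + rnorm(1000)        # second variable, positively related to xi
n  <- length(xi)

# population-style covariance: E(XiXj) - E(Xi)E(Xj), with means as estimates
cov_pop <- mean(xi * xj) - mean(xi) * mean(xj)

# R's cov() divides by n - 1; rescale by (n - 1)/n to compare
all.equal(cov_pop, cov(xi, xj) * (n - 1) / n)  # TRUE
all.equal(cov(xi, xi), var(xi))                # TRUE: Cov(X, X) = Var(X)
```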

## Correlation

Correlation and covariance are related but not equivalent statistical measures. The correlation between two variables ($X_i$ and $X_j$) is their normalized covariance, defined as
\begin{aligned}
\rho_{i,j}&=\frac{E[(X_i-\mu_i)(X_j-\mu_j)]}{\sigma_i \sigma_j}\\
&=\frac{n \sum XY - \sum X \sum Y}{\sqrt{(n \sum X^2 -(\sum X)^2)(n \sum Y^2 - (\sum Y)^2)}}
\end{aligned}
where $\sigma_i$ is the standard deviation of $X_i$ and $\sigma_j$ is the standard deviation of $X_j$; the second line is the equivalent computational formula for a sample of $n$ paired observations $(X, Y)$.

Note that correlation is dimensionless, i.e., a number free of the units of measurement, and its value lies between -1 and +1 inclusive. In contrast, covariance has units of measure: the product of the units of the two variables.
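The "normalized covariance" definition can be checked directly in R (a sketch on arbitrary simulated data): dividing the covariance by the product of the two standard deviations reproduces the built-in `cor()`.

```r
set.seed(2)
x <- rnorm(50)
y <- x + rnorm(50)

# correlation = covariance / (sd of x * sd of y)
all.equal(cor(x, y), cov(x, y) / (sd(x) * sd(y)))  # TRUE
```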

# Introduction to the Odds Ratio

Medical students, students of the clinical and psychological sciences, professionals allied to medicine seeking a better understanding of the medical literature, and researchers from many fields encounter the Odds Ratio (OR) throughout their careers.

The odds ratio is a relative measure of effect, allowing comparison of the intervention group of a study with the comparison or placebo group. When computing an odds ratio:

• The numerator is the odds in the intervention arm
• The denominator is the odds in the control or placebo arm

If the odds of the outcome are the same in both groups, the ratio will be 1, implying that there is no difference between the two arms of the study. If the outcome is an undesirable event, then OR > 1 means the event is more likely in the intervention group (the control group fared better), while OR < 1 means the event is less likely in the intervention group (the intervention group fared better).

The ratio of the probability of success to the probability of failure is known as the odds. If the probability of an event is $p_1$, then the odds are:
$Odds=\frac{p_1}{1-p_1}$

The Odds Ratio is the ratio of two odds and can be used to quantify how strongly a factor is associated with the response in a given model. If the probabilities of occurrence of an event are $p_1$ (for the first group) and $p_2$ (for the second group), then the OR is:
$OR=\frac{\frac{p_1}{1-p_1}}{\frac{p_2}{1-p_2}}$
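A short numerical sketch of this formula in R (the probabilities 0.30 and 0.15 are invented for illustration):

```r
# Odds ratio for two groups with event probabilities p1 and p2
p1 <- 0.30                 # probability of the event in the intervention arm
p2 <- 0.15                 # probability of the event in the control arm

odds1 <- p1 / (1 - p1)     # odds in the intervention arm
odds2 <- p2 / (1 - p2)     # odds in the control arm

OR <- odds1 / odds2
OR                         # about 2.43: the event odds are ~2.4 times higher
```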

If a predictor is binary, then the OR for the $i$th factor is defined as
$OR_i=e^{\beta_i}$

The regression coefficient $\beta_1$ from logistic regression is the estimated increase in the log odds of the dependent variable per unit increase in the value of the independent variable. In other words, the exponential of the regression coefficient, $e^{\beta_1}$, is the OR associated with a one-unit increase in the independent variable.
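The relationship between logistic regression coefficients and odds ratios can be sketched in R with `glm()` (the simulated data and the true coefficient 0.8 are assumptions for illustration, not values from the text):

```r
# exp(coefficient) from logistic regression estimates the odds ratio
set.seed(3)
x <- rbinom(500, size = 1, prob = 0.5)     # binary predictor (e.g., treatment)
logit <- -1 + 0.8 * x                       # true log odds; true OR = exp(0.8)
y <- rbinom(500, size = 1, prob = plogis(logit))

fit <- glm(y ~ x, family = binomial)
exp(coef(fit)["x"])                         # estimated OR for a one-unit change
```

The estimate will vary from sample to sample around the true value $e^{0.8} \approx 2.23$.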

## Binomial Random Number Generation in R

Here we will learn how to generate Bernoulli or binomial random numbers in R, using the flip of a coin as an example. This tutorial covers generating random numbers from different statistical distributions in R; our focus is binomial random number generation.

We know that in a Bernoulli distribution, something either happens or it does not; for example, a coin flip has two outcomes, head or tail (either a head occurs or it does not, i.e., a tail occurs). For an unbiased coin there is a 50% chance of a head (or a tail) in the long run. To generate random numbers that are binomial in R, use the rbinom(n, size, prob) command.

The rbinom(n, size, prob) command has three parameters, namely

n is the number of observations
size is the number of trials (it may be zero or more)
prob is the probability of success on each trial, for example 1/2

Some Examples

• One coin is tossed 10 times with probability of success = 0.5; the coin is fair (unbiased, as p = 1/2)
> rbinom(n = 10, size = 1, prob = 1/2)
OUTPUT: 1 1 0 0 1 1 1 1 0 1
(since the draws are random, your output will differ unless you fix the seed with set.seed())

• Two coins are tossed 10 times with probability of success = 0.5
> rbinom(n = 10, size = 2, prob = 1/2)
OUTPUT: 2 1 2 1 2 0 1 0 0 1

• One coin is tossed one hundred thousand times with probability of success = 0.5
> rbinom(n = 100000, size = 1, prob = 1/2)

• Store the simulation results in a vector x (here each observation counts the successes in size = 5 trials)
> x <- rbinom(n = 100000, size = 5, prob = 1/2)

Count the total number of successes in the x vector
> sum(x)

Find the frequency distribution
> table(x)

Create a relative frequency table (percentages)
> t <- table(x) / length(x) * 100

Plot the frequency distribution
> plot(table(x), ylab = "Probability", main = "size=5, prob=0.5")