Currently working as an Assistant Professor of Statistics at Ghazi University, Dera Ghazi Khan.
Completed my Ph.D. in Statistics from the Department of Statistics, Bahauddin Zakariya University, Multan, Pakistan.
I like Applied Statistics, Mathematics, and Statistical Computing.
The statistical and mathematical software I use includes SAS, STATA, Python, GRETL, EVIEWS, R, SPSS, and VBA in MS-Excel.
I also like to use the typesetting system LaTeX for composing articles, theses, etc.
A random variable in statistics is a variable whose value is determined by chance, that is, by the outcome of a probability experiment. As in algebra, random variables are represented by letters such as $X$, $Y$, and $Z$. Formally, a random variable is a function that maps outcomes to numbers. Read more about random variables in Statistics: Random Variable.
Random Variable in Statistics: Some Examples
T = the number of tails when a coin is flipped 3 times.
S = the sum of the values showing when two dice are rolled.
H = the height of a woman chosen at random from a group.
V = the liquid volume of soda in a can marked 12 oz.
W = the weight of an infant chosen at random in a hospital.
Key Characteristics of a Random Variable
Randomness: The value of a random variable is determined by chance.
Numerical: It assigns numbers to outcomes.
Function: It is technically a function that maps outcomes to numbers.
Types of Random Variables
There are two basic types of random variables.
Discrete Random Variables: A discrete random variable can take on only a countable number of possible values, that is, a finite or countably infinite set of values.
Continuous Random Variables: A continuous random variable can take on any value within a specified interval.
Examples of Discrete and Continuous Random Variables
The variables $T$ and $S$ from above are discrete random variables.
The variables $H$, $V$, and $W$ from above are continuous random variables.
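To make the distinction concrete, the following is a minimal R sketch (the sample sizes and the height distribution used here are illustrative assumptions, not values from the text) that simulates the discrete variables $T$ and $S$ and the continuous variable $H$:

# Simulate T: number of tails in 3 flips of a fair coin (discrete, only the values 0, 1, 2, 3 occur)
set.seed(123)
tails <- rbinom(n = 10, size = 3, prob = 0.5)
table(tails)

# Simulate S: sum of the values showing when two dice are rolled (discrete, values 2 to 12)
S <- replicate(10, sum(sample(1:6, size = 2, replace = TRUE)))

# Simulate H: height (in cm) of a randomly chosen woman (continuous, any value in an interval)
# The mean of 162 cm and standard deviation of 6 cm are assumed purely for illustration
H <- rnorm(10, mean = 162, sd = 6)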
Importance of Random Variables in Statistics
Random variables are fundamental to statistics. They allow us to:
Use mathematical tools to analyze uncertain events.
Model real-world phenomena.
Calculate probabilities of events.
Compute expected values and variances.
Make statistical inferences.
Random variables form the basis for probability distributions and are fundamental to statistical inference. They provide a bridge between the real world of uncertainty and the mathematical world of probability.
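As a small worked illustration of expected values and variances, the following minimal R sketch computes the probability distribution, mean, and variance of $T$, the number of tails in 3 flips of a fair coin, treated here as a Binomial(3, 0.5) variable:

# Probability distribution of T: P(T = 0), P(T = 1), P(T = 2), P(T = 3)
t_values <- 0:3
probs <- dbinom(t_values, size = 3, prob = 0.5)

# Expected value E(T) = sum of t * P(T = t); variance Var(T) = sum of (t - E(T))^2 * P(T = t)
expected_T <- sum(t_values * probs)                   # 1.5
variance_T <- sum((t_values - expected_T)^2 * probs)  # 0.75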
The following is an outline of the factors relevant to the choice of a statistical test, together with a set of three charts that may be used to guide your selection of a test.
Choosing the right statistical test depends on
Nature of the data
Sample characteristics
Inferences to be made
A consideration of the nature of data includes
Number of variables
Not for the entire study, but for the specific question at hand
Type of data
Numerical, continuous
Dichotomous, categorical information
Rank-order or ordinal
A consideration of the sample characteristics includes
Number of groups
Sample type
Normal distribution (parametric) or not (non-parametric)
Independent or dependent
A consideration of the inferences to be made includes
Data represent the population
The group means are different
There is a relationship between variables
Before choosing a statistical test, ask
How many variables?
How many groups?
Is the distribution of data normal?
Are the samples (groups) independent?
What is your hypothesis or research question?
Is the data continuous, ordinal, or categorical?
In situations where one variable is studied, the chart below may guide your selection of statistical tests.
In situations where two variables are studied, the chart below may guide your selection of statistical tests.
In situations where three or more variables are studied, the chart below may guide your selection of statistical tests.
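As an illustration of how the answers to these questions map onto a test, the following minimal R sketch (the data are simulated; the group sizes, means, and counts are assumptions made only for illustration) handles three common two-group situations:

# Two independent groups, numerical data, roughly normal -> two-sample t-test
set.seed(42)
group_a <- rnorm(30, mean = 50, sd = 10)
group_b <- rnorm(30, mean = 55, sd = 10)
t.test(group_a, group_b)

# Same question, but the normality assumption is doubtful -> Mann-Whitney (Wilcoxon) test
wilcox.test(group_a, group_b)

# Categorical outcome (improved / not improved) in two groups -> chi-square test
counts <- matrix(c(18, 12, 10, 20), nrow = 2,
                 dimnames = list(Group = c("A", "B"), Outcome = c("Improved", "Not improved")))
chisq.test(counts)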
In summary,
Statistical significance indicates how unlikely the observed results would be if they were due to chance alone
The choice of a statistical test depends on the data, sample characteristics, and research question
A one factor design (also known as a one-way ANOVA) is a statistical method used to determine if there are significant differences between the means of multiple groups. In this design, there is one independent variable (factor) with multiple levels or categories.
Suppose $y_{ij}$ is the response to the $i$th treatment for the $j$th experimental unit, where $i=1,2,\cdots, I$ indexes the treatments and $j$ indexes the experimental units within each treatment. The statistical model for a completely randomized one-factor design that leads to a One-Way ANOVA is
$$y_{ij} = \mu_i + e_{ij}$$
where $\mu_i$ is the unknown (population) mean for all potential responses to the $i$th treatment, and $e_{ij}$ is the error (deviation of the response from population mean).
The responses within and across treatments are assumed to be independent and normally distributed random variables with constant variance.
One Factor Design’s Statistical Model
Let $\mu = \frac{1}{I} \sum \limits_{i} \mu_i$ be the grand mean or average of the population means. Let $\alpha_i=\mu_i-\mu$ be the $i$th group treatment effect. The treatment effects are constrained to add to zero ($\alpha_1+\alpha_2+\cdots+\alpha_I=0$) and measure the difference between the treatment population means and the grand mean.
Therefore, the one-way ANOVA model is $$y_{ij} = \mu + \alpha_i + e_{ij}$$
From this model, the hypothesis of interest is whether the population means are equal:
$$H_0:\mu_1=\mu_2= \cdots = \mu_I$$
The hypothesis is equivalent to $H_0:\alpha_1 = \alpha_2 =\cdots = \alpha_I=0$. If $H_0$ is true, then the one-way ANOVA model is
$$ y_{ij} = \mu + e_{ij}$$ where $\mu$ is the common population mean.
One Factor Design Example
Let’s say you want to compare the average test scores of students from three different teaching methods (Method $A$, Method $B$, and Method $C$).
Independent variable: Teaching method (with three levels: $A, B, C$)
Dependent variable: Test scores
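A minimal R sketch of this example (the test scores below are simulated, and the group means and standard deviation are assumptions made only for illustration) fits the one-way ANOVA model with aov():

# One-way ANOVA: do the mean test scores differ across the three teaching methods?
set.seed(1)
scores <- data.frame(
  method = factor(rep(c("A", "B", "C"), each = 20)),
  score  = c(rnorm(20, mean = 70, sd = 8), rnorm(20, mean = 75, sd = 8), rnorm(20, mean = 72, sd = 8))
)
fit <- aov(score ~ method, data = scores)
summary(fit)   # F-test of H0: the three population means are equal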
When to Use a One Factor Design
Comparing means of multiple groups: When one wants to determine if there are significant differences in the mean of a dependent variable across different groups or levels of a factor.
Exploring the effect of a categorical variable: When one wants to investigate how a categorical variable influences a continuous outcome.
Assumptions of One-Factor ANOVA
Normality: The data within each group should be normally distributed.
Homogeneity of variance (Equality of Variances): The variances of the populations from which the samples are drawn should be equal.
Independence: The observations within each group should be independent of each other.
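These assumptions can be checked informally in R; a minimal sketch, reusing the fit and scores objects from the teaching-method example above, is:

# Normality: Shapiro-Wilk test on the model residuals
shapiro.test(residuals(fit))

# Homogeneity of variance: Bartlett's test across the three methods
bartlett.test(score ~ method, data = scores)

# Independence cannot be tested from the data alone; it is ensured by randomization in the design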
When to Use One Factor Design
When one wants to compare the means of multiple groups.
When the independent variable has at least three levels.
When the dependent variable is continuous (e.g., numerical).
Note that
If the null hypothesis is rejected, one can perform post-hoc tests (for example, Tukey’s HSD, Bonferroni) to determine which specific groups differ significantly from each other.
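For instance, Tukey’s HSD for the teaching-method example above can be obtained in R with TukeyHSD() (a minimal sketch, again using the fitted fit object):

# Pairwise comparisons of the three teaching methods with family-wise error control
TukeyHSD(fit)   # confidence intervals and adjusted p-values for B-A, C-A, C-B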
Remember: While one-factor designs are useful for comparing multiple groups, they cannot establish causation.
Statistics is used as a tool to make appropriate decisions in the face of uncertainty. We all apply statistical concepts in our daily lives, whether we are educated or not. Therefore, the importance of Statistics cannot be ignored.
The information collected in the form of data (observation) from any field/discipline will almost always involve some sort of variability or uncertainty, so this subject has applications in almost all fields of research. The researchers use statistics in the analysis, interpretation, and communication of their research findings.
Some examples of the questions which statistics might help to answer with appropriate data are:
How much better a yield of wheat do we get if we use a new fertilizer as opposed to a commonly used fertilizer?
Are the company’s sales likely to increase in the next year compared to the previous one?
What dose of insecticide can be used successfully to monitor an insect population?
What is the likely weather in the coming season?
Application of Statistics
Statistical techniques, being powerful tools for analyzing numerical data, are used in almost every branch of learning, and Statistics plays a vital role in every field of human activity. Statistics has an important role in determining the existing position of per capita income, unemployment, population growth rate, housing, schooling, medical facilities, etc., in a country. Statistics now holds a central position in almost every field: Industry, Commerce, the Biological and Physical Sciences, Genetics, Agronomy, Anthropometry, Astronomy, Geology, Psychology, and Sociology are among the main areas where statistical techniques have been developed and are being used increasingly.
Statistics has its application in almost every field where research is carried out and findings are reported. Keeping in view the importance of Statistics, its applications in different fields are as follows:
Social Sciences
In the social sciences, one of the major objectives is to establish the relationships that exist between certain variables. This end is achieved by postulating hypotheses and testing them using different statistical techniques. Most areas of our economy can be studied through econometric models, because these help in forecasting, and forecasts are important for future planning.
Plant Sciences
The most important aspect of statistics in plant sciences is its role in the efficient planning of experiments and drawing valid conclusions. A technique in statistics known as “Design of Experiments” helps introduce new varieties. Optimum plot sizes can be worked out for different crops like wheat, cotton, sugarcane, and others under different environmental conditions using statistical techniques.
Physical Sciences
The application of Statistics in the physical sciences is widely accepted. Researchers use statistical methods in the analysis, interpretation, and communication of their research findings. Linear and nonlinear regression models are used to establish cause-and-effect relationships between different variables, and these days computers have facilitated experimentation to the point where it is often possible to simulate a process rather than run the experiment itself.
Medical Sciences
The interest may be in the effectiveness of new drugs, the effect of environmental factors, heritability, the standardization of various records, and other related problems. Statistics comes to the rescue: it helps to plan the next investigation so as to get trustworthy information from limited resources, and it helps to analyze the current data and integrate the information with that previously existing.
How Statistics is used by banks, insurance companies, business and economic planning and administration, accounting and controlling of events, construction companies, and politicians
Banks
Banks are a very important part of a country’s economy. They operate on the assumption that not all depositors will withdraw their money on the same day. Bankers use probability theory to estimate the deposits and the claims for withdrawals.
Insurance Companies
Insurance companies play an important role in economic progress. These companies collect payments (premiums) from people. They estimate the death rate, accident rate, and average expected life of people from life tables, and the monthly premium is decided based on these rates.
Business
Planning for the future is very important in business, for example, regarding the price, quality, quantity, and demand for a particular product. Businessmen can make sound decisions about the location of the business, the marketing of the products, financial resources, etc. Statistics helps a businessman plan production according to the tastes of the customers, and the quality of the products can also be checked more efficiently by using statistical methods.
The relationship between supply and demand is a very important topic in everyday life. Changes in prices and demand are studied using index numbers, and the relationship between supply and demand is examined using correlation and regression.
Economic Planning
Economic planning for the future is a very important problem for economists, for example, the opening of new educational institutions such as schools and colleges, the revision of pay scales of employees, the construction of new hospitals, and the preparation of government budgets. All of these require estimates for some future time, which is called forecasting and is done by regression analysis. The different sources of earning, the planning of projects, and the forecasting of economic trends are likewise administered by the use of various statistical techniques.
Accounting and Controlling of Events
Events around the world, for example, births, deaths, imports, exports, and the crops grown by farmers, are recorded as statistical data and analyzed to make better policies for the betterment of the nation.
Administrator
An administrator whether in the public or private sector leans on statistical data to provide a factual basis for appropriate decisions.
Politician
A politician uses statistics advantageously to lend support and credence to his arguments while elucidating the problems he handles.
Construction Companies
All kinds of construction companies start and run their projects after making judgments about the total cost of the project (job, work). To estimate this expenditure, the very important statistical technique of estimation is used.
Biology
In biology, correlation and regression are used for the analysis of hereditary relations. To classify organisms into different classes according to their age, height, weight, hair color, eyebrow color, etc., the statistical rules of classification and tabulation are used.