Design of Experiments Overview (2015)

Objectives of Design of Experiments

In the context of the Design of Experiments, an experiment is usually a test, a trial, or a series of tests. The objective of the experiment may be either

  1. Confirmation
  2. Exploration

Designing an experiment means providing a plan and an actual procedure for laying out the experiment. Experimental design applies to any information-gathering exercise where variation is present, whether or not it is under the full control of the experimenter. In the design of experiments, the experimenter is often interested in the effect of some process or intervention (the treatment) on some objects (the experimental units), such as people, parts of people, groups of people, plants, or animals. Experimental design is therefore an efficient procedure for planning experiments so that the data obtained can be analyzed to yield objective conclusions.

In an observational study, the researchers observe individuals and measure variables of interest but do not attempt to influence the response variable, while in an experimental study, the researchers deliberately (purposely) impose some treatment on individuals and then observe the response variables. When the goal is to demonstrate cause and effect, an experiment is the only source of convincing data.


Statistical Design

By statistical experimental design, we refer to the process of planning the experiment so that appropriate data will be collected, data that can be analyzed by statistical methods to yield valid and objective conclusions. Thus there are two aspects to any experimental problem:

  1. The design of the experiments
  2. The statistical analysis of the data

Many experimental designs differ from each other primarily in the way in which the experimental units are classified before the application of the treatment.

Design of Experiments (DOE) helps in

  • Identifying the relationships between cause and effect
  • Providing some understanding of interactions among causative factors
  • Determining the levels at which to set the controllable factors to optimize reliability
  • Minimizing the experimental error, i.e., noise (a small layout-and-randomization sketch follows this list)
  • Improving the robustness of the design or process to variation
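
As a small illustration of laying out such an experiment, the following Python sketch builds a full-factorial plan for two hypothetical controllable factors (the factor names and levels are made up for illustration) and randomizes the run order, one of the basic devices for keeping experimental error (noise) from lining up with the treatments.

```python
import itertools
import random

# Hypothetical controllable factors and their levels (for illustration only).
factors = {
    "temperature": [150, 170],   # e.g., degrees Celsius
    "pressure":    [1.0, 1.5],   # e.g., bar
}

# Full-factorial plan: every combination of factor levels is a treatment.
names = list(factors)
runs = [dict(zip(names, levels)) for levels in itertools.product(*factors.values())]

# Randomize the run order so uncontrolled (noise) factors do not line up
# with the treatment structure.
random.seed(1)          # fixed seed only to make the sketch reproducible
random.shuffle(runs)

for i, run in enumerate(runs, start=1):
    print(f"Run {i}: {run}")
```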


Level of Measurements in Statistics

Introduction to Level of Measurements in Statistics

Data can be classified according to the level of measurement, which dictates the calculations that can be done to summarize and present the data (graphically) and helps to determine which statistical tests should be performed.

For example, suppose there are six colors of candies in a bag and you assign a different number (code) to each, so that a brown candy has the value 1, yellow 2, green 3, orange 4, blue 5, and red 6. Adding all the assigned color values for the candies in the bag and then dividing by the number of candies yields an average value of 3.68. Does this mean that the average color is green or orange? Of course not. When computing statistics, it is important to recognize the data type, which may be qualitative (nominal or ordinal) or quantitative (interval or ratio).
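
The same point can be seen in a minimal Python sketch with a hypothetical bag of candies (the counts below are made up and do not reproduce the 3.68 figure): the arithmetic mean of the color codes can always be computed, but only the frequency count and the mode are meaningful summaries of nominal data.

```python
from collections import Counter
from statistics import mean

# Hypothetical nominal data: candy colors coded 1..6 (codes are labels, not quantities).
codes = {"brown": 1, "yellow": 2, "green": 3, "orange": 4, "blue": 5, "red": 6}
bag = ["brown", "red", "green", "green", "blue", "yellow", "orange", "red", "brown", "green"]

numeric = [codes[c] for c in bag]
print(mean(numeric))                 # a number, but it does not correspond to any color
print(Counter(bag).most_common(1))   # the mode: the only sensible "average" for nominal data
```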

The levels of measurement in statistics were developed in conjunction with the concepts of numbers and units of measurement. Statisticians classify measurements according to levels. There are four levels of measurement, namely nominal, ordinal, interval, and ratio, described below.

Nominal Level of Measurement

At the nominal level of measurement, the observations of a qualitative variable can only be classified and counted. There is no particular order to the categories. The mode, frequency tables, pie charts, and bar graphs are the usual summaries and displays for this level of measurement.

Ordinal Level of Measurement

In the ordinal level of measurement, data are classified into sets of labels or names that have relative values (a ranking or ordering of values). For example, suppose you survey 1,000 people and ask them to rate a restaurant on a scale ranging from 0 to 5, where 5 is the highest score (highest liking level) and 0 the lowest (lowest liking level). Taking the average of these 1,000 people's responses will have meaning. Graphs and charts are usually drawn for ordinal data.
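
As a small sketch with hypothetical ratings on the 0-to-5 scale just described, the median (a rank-based summary) is always meaningful for ordinal data, while the mean is reported with the caveat that the distances between adjacent ratings may not be equal.

```python
from statistics import median, mean

# Hypothetical restaurant ratings on an ordinal 0-5 scale.
ratings = [5, 4, 4, 3, 5, 2, 4, 1, 3, 4]

print(median(ratings))  # rank-based summary, always meaningful for ordinal data
print(mean(ratings))    # often reported, but assumes roughly equal spacing between levels
```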


Interval Level of Measurement

Numbers are also used to express quantities: temperature, dress size, and the price of a plane ticket are all quantities. The interval level of measurement allows for the degree of difference between items but not the ratio between them. There is a meaningful difference between values; for example, the difference between 10 degrees Fahrenheit and 15 degrees Fahrenheit is 5 degrees, and the difference between 50 and 55 degrees is also 5 degrees. It is also important that zero is just a point on the scale; it does not represent the absence of heat.
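
A quick numerical check of this point, in Python: differences in degrees Fahrenheit survive a change of scale to Celsius, but ratios do not, which is why interval data support statements about differences but not statements such as "twice as hot".

```python
def f_to_c(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9

# Differences carry over to the new scale (both are about 2.78 degrees Celsius) ...
print(f_to_c(15) - f_to_c(10))
print(f_to_c(55) - f_to_c(50))

# ... but ratios do not: 40 F is twice 20 F numerically, yet the Celsius values
# give a completely different ratio, so "twice as hot" has no meaning here.
print(40 / 20)                    # 2.0
print(f_to_c(40) / f_to_c(20))    # roughly -0.67
```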

Ratio Level of Measurement

Practically all quantitative data are recorded at the ratio level. It has all the characteristics of the interval level, but in addition the zero point is meaningful and the ratio between two numbers is meaningful. Examples of ratio-level data are wages, units of production, weight, changes in stock prices, the distance between home and office, height, etc.


Many of the inferential test statistics depend on the ratio and interval levels of measurement. Many authors argue that interval and ratio measures should be grouped together and simply referred to as scale measurements.


Importance of Level of Measurements in Statistics

Understanding the level of measurement of the data is crucial for several reasons:

  • Choosing Appropriate Statistical Tests: Different statistical tests are designed for different levels of measurement. Using a test on data with an inappropriate level of measurement can lead to misleading results and decisions.
  • Data Interpretation: The level of measurement determines how one can interpret the data and the conclusions that can be drawn. For example, the average (mean) is calculated for interval and ratio data, but not for nominal or ordinal data.
  • Data Analysis: The level of measurement influences the types of calculations and analyses one can perform on the data.

By correctly identifying the level of measurement of the data, one can ensure that appropriate statistical methods are used and valid conclusions are drawn from the analysis.


P-value Definition, Interpretation, Introduction, Significance

In this post, we will discuss the definition and interpretation of the P-value, along with some related examples.

P-value Definition

The P-value, also known as the observed level of significance, the exact level of significance, or the exact probability of committing a type-I error (rejecting $H_0$ when it is true), helps to determine the significance of the results of a hypothesis test. The P-value is the probability of obtaining the observed sample results, or more extreme results, when the null hypothesis (a statement about the population) is true.

In technical terms, one can define the P-value as the lowest level of significance at which the null hypothesis can be rejected. If the P-value is very small, or smaller than the threshold value (the chosen level of significance), then the observed data are considered inconsistent with the assumption that the null hypothesis is true; the null hypothesis is therefore rejected and the alternative hypothesis accepted. A P-value is always a number between 0 and 1.

Usual P-value Interpretation

  • A small P-value (< 0.05) indicates strong evidence against the null hypothesis.
  • A large P-value (> 0.05) indicates weak evidence against the null hypothesis.
  • A P-value very close to the cutoff (say, 0.05) is considered marginal.

Suppose the P-value of a certain test statistic is 0.002. This means that the probability of committing a type-I error (making a wrong decision) is about 0.2 percent, or only about 2 in 1,000. For a given sample size, as $|t|$ (or any test statistic) increases, the P-value decreases, so one can reject the null hypothesis with increasing confidence.
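
As an illustration of how a P-value is obtained from a test statistic, the following sketch (assuming the SciPy library is available; the t value and degrees of freedom are made up for the example) computes a two-sided P-value for a t statistic and compares it with a chosen significance level.

```python
from scipy import stats

t_stat = 3.4        # hypothetical observed t statistic
df = 24             # hypothetical degrees of freedom
alpha = 0.05        # chosen significance level

# Two-sided P-value: probability of a |t| at least this large when H0 is true.
p_value = 2 * stats.t.sf(abs(t_stat), df)
print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```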

P-value and Significance Level

If the significance level ($\alpha$, the probability of a type-I error) is fixed equal to the P-value of a test statistic, then there is no conflict between the two values. In other words, instead of fixing $\alpha$ arbitrarily at some conventional level (5%, 10%, etc.), one may simply report the P-value of the test statistic. For example, if the P-value of a test statistic is about 0.145, one can reject the null hypothesis at this exact significance level, provided one is willing to accept a 14.5% chance of being wrong when rejecting the null hypothesis.

The P-value addresses only one question: how likely are the observed data, assuming the null hypothesis is true? It does not measure support for the alternative hypothesis.

Most authors refer to a P-value < 0.05 as statistically significant and a P-value < 0.001 as highly statistically significant (less than a one-in-a-thousand chance of being wrong).


The P-value is often interpreted incorrectly: it is commonly taken to be the probability of making a mistake by rejecting a true null hypothesis (a type-I error). The P-value cannot be this error rate for the following reason:

The P-value is calculated on the assumption that the null hypothesis is true and that any difference in the sample is due to random chance alone. Consequently, a P-value cannot tell us the probability that the null hypothesis is true or false, because the null hypothesis is treated as 100% true from the perspective of the calculation.


The Degrees of Freedom

The degrees of freedom (df), or number of degrees of freedom, refers to the number of observations in a sample minus the number of (population) parameters being estimated from the sample data. This means that the degrees of freedom are a function of both the sample size and the number of independent variables. In other words, it is the number of independent observations out of a total of $n$ observations.


In statistics, the degrees of freedom are the number of values in a study that are free to vary. As a real-life example: if you have to take ten different courses to graduate, and only ten different courses are offered, then you have nine degrees of freedom. For nine semesters you can choose which class to take; in the tenth semester there is only one class left, so there is no choice if you want to graduate. This is the concept of the degrees of freedom (df) in statistics.

Let a random sample of size $n$ be taken from a population with unknown mean $\mu$, and let $\overline{X}$ denote the sample mean. The sum of the deviations of the observations from their mean is always equal to zero, i.e., $\sum_{i=1}^n (X_i-\overline{X})=0$. This places a constraint on the deviations $X_i-\overline{X}$ used when calculating the sample variance:

\[S^2 =\frac{\sum_{i=1}^n (X_i-\overline{X})^2 }{n-1}\]

This constraint (restriction) implies that $n-1$ of the deviations completely determine the $n$th deviation. The $n$ deviations (and hence the sum of their squares and the sample variance $S^2$) therefore have $n-1$ degrees of freedom.
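
A small numeric check of this constraint, using an arbitrary made-up sample in Python: the deviations from the mean sum to zero, so once $n-1$ of them are known the last one is fixed, and the sample variance accordingly divides by $n-1$.

```python
from statistics import mean, variance

x = [4.0, 7.0, 6.0, 9.0, 4.0]            # arbitrary sample, n = 5
xbar = mean(x)
dev = [xi - xbar for xi in x]

print(sum(dev))                            # 0 (up to floating-point rounding)
print(-sum(dev[:-1]), dev[-1])             # the last deviation is determined by the others
print(sum(d**2 for d in dev) / (len(x) - 1), variance(x))  # both give S^2 with n-1 in the denominator
```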

A common way to think of df is as the number of independent pieces of information available to estimate another piece of information. More concretely, the number of degrees of freedom is the number of independent observations in a sample of data that are available to estimate a parameter of the population from which that sample is drawn. For example, with two observations we have two independent observations when calculating the mean; however, when calculating the variance we have only one independent piece of information, since the two observations are equally distant from the mean.


Single sample: For $n$ observations, one parameter (the mean) needs to be estimated, which leaves $n-1$ degrees of freedom for estimating variability (dispersion).

Two samples: There are a total of $n_1+n_2$ observations ($n_1$ in group 1 and $n_2$ in group 2) and two means need to be estimated, which leaves $n_1+n_2-2$ degrees of freedom for estimating variability.

Regression with $p$ predictors: There are $n$ observations and $p+1$ parameters need to be estimated (a regression coefficient for each predictor and the intercept). This leaves $n-p-1$ degrees of freedom for error, which accounts for the error degrees of freedom in the ANOVA table.
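
The bookkeeping in these three cases is simply observations minus estimated parameters; a tiny Python sketch (the sample sizes and number of predictors below are made up):

```python
def error_df(n_obs, n_params):
    """Degrees of freedom left for estimating variability: observations minus estimated parameters."""
    return n_obs - n_params

print(error_df(20, 1))          # single sample of n = 20: 19
print(error_df(12 + 15, 2))     # two samples (n1 = 12, n2 = 15): 25
print(error_df(50, 3 + 1))      # regression, n = 50, p = 3 predictors plus intercept: 46
```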

Several commonly encountered statistical distributions (Student's t, chi-squared, F) have parameters that are commonly referred to as degrees of freedom. This terminology simply reflects that, in many applications where these distributions occur, the parameter corresponds to the degrees of freedom of an underlying random vector. If $X_i,\ i=1,2,\cdots,n$ are independent normal $(\mu, \sigma^2)$ random variables, the statistic $\frac{\sum_{i=1}^n (X_i-\overline{X})^2}{\sigma^2}$ follows a chi-squared distribution with $n-1$ degrees of freedom. Here, the degrees of freedom arise from the residual sum of squares in the numerator, and in turn from the $n-1$ degrees of freedom of the underlying residual vector $X_i-\overline{X}$.
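
A simulation sketch of this fact (assuming NumPy is available; the normal parameters and sample size are arbitrary): the statistic $\frac{\sum_{i=1}^n (X_i-\overline{X})^2}{\sigma^2}$ computed over many simulated samples has an average close to $n-1$, the degrees of freedom of its chi-squared distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
n, mu, sigma = 10, 5.0, 2.0          # arbitrary sample size and normal parameters
reps = 100_000

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1, keepdims=True)
stat = ((samples - xbar) ** 2).sum(axis=1) / sigma**2

# A chi-squared random variable's mean equals its degrees of freedom.
print(stat.mean())   # close to n - 1 = 9
```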
