Sampling Theory, Introduction, and Reasons to Sample (2015)

Introduction to Sampling Theory

Often we are interested in drawing valid conclusions (inferences) about a large group of individuals or objects, called the population in statistics. Instead of examining the entire population, which may be difficult or even impossible, we may examine only a small part of it, called a sample. Our objective is to draw valid inferences about certain facts about the population from results found in the sample, a process known as statistical inference. The process of obtaining samples is called sampling, and the theory concerning sampling is called sampling theory.

Example

We may wish to estimate the percentage of defective bolts produced in a factory during a given 6-day week by examining 20 bolts each day, produced at various times during the day. In this case, all bolts produced during the week comprise the population, while the 120 bolts selected during the 6 days constitute a sample.
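The arithmetic behind the bolt example can be sketched in a few lines of Python. The daily defect counts below are hypothetical (the example gives no inspection results); only the 20 bolts per day over 6 days come from the text.

```python
# Hypothetical counts of defective bolts among the 20 inspected each day.
# These numbers are made up for illustration; the text reports no results.
defects_per_day = [1, 0, 2, 1, 0, 2]
bolts_per_day = 20

sample_size = bolts_per_day * len(defects_per_day)  # 20 bolts x 6 days
total_defects = sum(defects_per_day)
defective_pct = 100 * total_defects / sample_size   # sample percentage defective

print(f"Sample size: {sample_size}")
print(f"Estimated percentage defective: {defective_pct:.1f}%")
```

The sample percentage serves as the estimate of the unknown percentage of defective bolts in the whole week's production.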

Sampling theory is widely used in research in business and in the medical, social, and psychological sciences for gathering information about a population. The sampling process comprises several stages:

  • Defining the population of concern
  • Specifying the sampling frame (set of items or events possible to measure)
  • Specifying a sampling method for selecting the items or events from the sampling frame
  • Determining the appropriate sample size
  • Implementing the sampling plan
  • Sampling and data collecting
  • Data that can be selected
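As a minimal sketch of the middle stages, the snippet below draws a simple random sample from a sampling frame. The frame of 500 ID numbers, the sample size of 25, and the choice of simple random sampling are all assumptions made for illustration; other sampling methods are equally possible.

```python
import random

# Hypothetical sampling frame: ID numbers of 500 units that can be measured.
sampling_frame = list(range(1, 501))
sample_size = 25  # assumed for illustration; the text prescribes no size

rng = random.Random(42)  # fixed seed so the draw can be reproduced
sample = rng.sample(sampling_frame, sample_size)  # draw without replacement

print(sorted(sample))
```

Because `sample` draws without replacement, no unit in the frame can be selected twice.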

Reasons to Study a Sample

When studying the characteristics of a population, there are many reasons to study a sample (drawn from the population under study) instead of the entire population such as:

  1. Time: Contacting every individual in the whole population is difficult and takes a long time.
  2. Cost: The cost of studying all the items (objects or individuals) in a population may be prohibitive.
  3. Physically Impossible: Some populations are effectively infinite or cannot be enumerated, such as populations of fish, birds, snakes, and mosquitoes, so it is physically impossible to check every item. Similarly, it is difficult to study populations that are constantly moving, being born, or dying.
  4. Destructive Nature of items: Some items are destroyed by the test itself. For example, a steel wire is stretched until it breaks and the breaking point is recorded to establish its tensile strength. Similarly, many electric and electronic components are destroyed during testing. Time, cost, and the destructive nature of such tests make it impossible to study the entire population.
  5. Qualified and expert staff: Enumeration requires highly qualified and expert staff, who are not always available. National and international research organizations and agencies are hired for enumeration purposes, which is sometimes costly and time-consuming (as a rehearsal of the activity is required), and recruiting highly qualified staff is not always easy.
  6. Reliability: With a scientific sampling technique the sampling error can be kept small, and the non-sampling error in a sample survey also tends to be small because qualified investigators can be employed.

Summary

Every sampling system is used to obtain estimates of certain properties of the population under study. The sampling system should be judged by how good those estimates are. An individual estimate may, by chance, be very close to the true value (population parameter) or differ greatly from it, so a single estimate gives a poor measure of the merits of the system.

A sampling system is better judged by the frequency distribution of many estimates obtained by repeated sampling. A good system gives a frequency distribution with a small variance and a mean equal to the true value.
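The idea of judging a sampling system by repeated sampling can be illustrated with a small simulation. This is only a sketch: the population of 10,000 normally distributed values, the two sample sizes, and the 2,000 repetitions are assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population of 10,000 measurements (assumed for illustration).
population = [rng.gauss(50, 10) for _ in range(10_000)]
true_mean = statistics.fmean(population)

def sample_mean(n):
    """Estimate the population mean from one simple random sample of size n."""
    return statistics.fmean(rng.sample(population, n))

# Repeat the sampling many times to approximate the distribution of estimates.
estimates_small = [sample_mean(10) for _ in range(2_000)]
estimates_large = [sample_mean(100) for _ in range(2_000)]

for label, est in [("n=10", estimates_small), ("n=100", estimates_large)]:
    print(label,
          "mean of estimates:", round(statistics.fmean(est), 2),
          "variance of estimates:", round(statistics.variance(est), 2))
print("true mean:", round(true_mean, 2))
```

Both sets of estimates should center near the true mean, while the larger sample size should give a visibly smaller variance, which is the sense in which one system is better than the other.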

Design of Experiments Overview (2015)

Objectives of Design of Experiments

An experiment is usually a test, a trial, or a series of tests. The objective of the experiment may be either

  1. Confirmation
  2. Exploration

Designing an experiment means providing a plan and an actual procedure for laying out the experiment. It is the design of any information-gathering exercise where variation is present, whether or not it is under the full control of the experimenter. The experimenter is often interested in the effect of some process or intervention (the treatment) on some objects (the experimental units) such as people, parts of people, groups of people, plants, animals, etc. Experimental design is thus an efficient procedure for planning experiments so that the data obtained can be analyzed to yield objective conclusions.

In the observational study, the researchers observe individuals and measure variables of interest but do not attempt to influence the response variable, while in an experimental study, the researchers deliberately (purposely) impose some treatment on individuals and then observe the response variables. When the goal is to demonstrate cause and effect, the experiment is the only source of convincing data.

Statistical Design

Statistical experimental design refers to the process of planning the experiment so that appropriate data will be collected, which may then be analyzed by statistical methods, resulting in valid and objective conclusions. Thus there are two aspects to any experimental problem:

  1. The design of the experiments
  2. The statistical analysis of the data

Many experimental designs differ from each other primarily in the way the experimental units are classified before the application of treatment.

Design of Experiments (DOE) helps in

  • Identifying the relationships between cause and effect
  • Providing some understanding of interactions among causative factors
  • Determining the level at which to set the controllable factors to optimize reliability
  • Minimizing the experimental error i.e., noise
  • Improving the robustness of the design or process to variation

Level of Measurements in Statistics

Introduction to Level of Measurements in Statistics

Data can be classified according to the level of measurement. The level dictates the calculations that can be done to summarize and present the data (graphically), and it also helps to determine which statistical tests should be performed.

For example, suppose there are six colors of candies in a bag and you assign different numbers (codes) to them in such a way that brown candy has a value of 1, yellow 2, green 3, orange 4, blue 5, and red 6. Adding all the assigned color values for the candies in this bag and then dividing by the number of candies yields an average value of 3.68. Does this mean that the average color is green or orange? Of course not. When computing statistics, it is important to recognize the data type, which may be qualitative (nominal and ordinal) or quantitative (interval and ratio).
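The candy example can be reproduced in a short sketch. The color codes come from the text, but the contents of the bag are hypothetical, chosen here only so that the coded average works out to 3.68.

```python
from collections import Counter
from statistics import mean

# Color codes from the text.
codes = {"brown": 1, "yellow": 2, "green": 3, "orange": 4, "blue": 5, "red": 6}

# Hypothetical bag of 25 candies (counts are assumed, not given in the text).
bag = (["brown"] * 5 + ["yellow"] * 3 + ["green"] * 3 +
       ["orange"] * 4 + ["blue"] * 4 + ["red"] * 6)

# Arithmetic on the codes runs without error, but the result has no meaning.
mean_code = mean(codes[color] for color in bag)

# Counting the categories is the summary that makes sense for nominal data.
counts = Counter(bag)
modal_color, modal_count = counts.most_common(1)[0]

print(f"Mean of color codes: {mean_code:.2f}")
print(f"Most common color: {modal_color} ({modal_count} candies)")
```

The software happily returns a mean, so it is up to the analyst to recognize that only the counts and the mode describe these data.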

The level of measurements in statistics has been developed in conjunction with the concepts of numbers and units of measurement. Statisticians classified measurements according to levels. There are four levels of measurement, namely, nominal, ordinal, interval, and ratio, described below.

Nominal Level of Measurement

At the nominal level of measurement, the observations of a qualitative variable can only be classified and counted. There is no particular order to the categories. The mode, frequency tables (discrete frequency tables), pie charts, and bar graphs are usually used for this level of measurement.

Ordinal Level of Measurement

In the ordinal level of measurement, data classification is presented by sets of labels or names that have relative values (ranking or ordering of values). For example, suppose you survey 1,000 people and ask them to rate a restaurant on a scale ranging from 0 to 5, where 5 is the highest score (highest liking level) and 0 is the lowest (lowest liking level). The responses can be ranked, and an average of these 1,000 responses is often reported in practice, although strictly it assumes equal spacing between the ratings, which the ordinal level does not guarantee. Usually, graphs and charts are drawn for ordinal data.

Interval Level of Measurement

Numbers are also used to express quantities such as temperature, dress size, and the price of a plane ticket. The interval level of measurement allows for the degree of difference between items but not the ratio between them. There is a meaningful difference between values: for example, the difference between 10 and 15 degrees Fahrenheit is 5 degrees, and the difference between 50 and 55 degrees is also 5 degrees. It is also important that zero is just a point on the scale; it does not represent the absence of heat. On the Celsius scale, for instance, zero is simply the freezing point of water.

Ratio Level of Measurement

Most quantitative data are recorded on the ratio level. It has all the characteristics of the interval level, but in addition, the zero point is meaningful and the ratio between two numbers is meaningful. Examples of ratio-level measurements are wages, units of production, weight, changes in stock prices, the distance between home and office, height, etc.
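One way to see the difference between the interval and ratio levels is to check whether a ratio survives a change of units. The sketch below uses temperature (Celsius to Fahrenheit) as the interval example and weight (kilograms to pounds) as the ratio example; the specific values are chosen only for illustration.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32

def kg_to_lb(kg):
    """Convert kilograms to pounds (approximate conversion factor)."""
    return kg * 2.20462

# Interval scale: the ratio depends on the unit, so it carries no meaning.
c1, c2 = 10.0, 20.0
ratio_celsius = c2 / c1
ratio_fahrenheit = celsius_to_fahrenheit(c2) / celsius_to_fahrenheit(c1)

# Ratio scale: zero means "none", so the ratio is the same in any unit.
w1, w2 = 40.0, 80.0
ratio_kg = w2 / w1
ratio_lb = kg_to_lb(w2) / kg_to_lb(w1)

print(f"Temperature ratio: {ratio_celsius:.2f} in Celsius, "
      f"{ratio_fahrenheit:.2f} in Fahrenheit")
print(f"Weight ratio: {ratio_kg:.2f} in kg, {ratio_lb:.2f} in lb")
```

The temperature "ratio" changes from one scale to the other, so saying that 20 degrees is twice as hot as 10 degrees is not justified, whereas 80 kg is twice 40 kg in any unit.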


Many of the inferential test statistics depend on the ratio and interval level of measurement. Many authors argue that interval and ratio measures should be named as scales.

Importance of Level of Measurements in Statistics

Understanding the level of measurement of the data is crucial for several reasons:

  • Choosing Appropriate Statistical Tests: Different statistical tests are designed for different levels of measurement. Using the wrong test on data with an inappropriate level of measurement can lead to misleading results and decisions.
  • Data Interpretation: The level of measurement determines how one can interpret the data and what conclusions can be made. For example, the average (mean) is calculated for interval and ratio data, but not for nominal or ordinal data.
  • Data analysis: The level of measurement influences the types of calculations and analyses one can perform on the data.

By correctly identifying the level of measurement of the data, one can ensure that appropriate statistical methods are used and that valid conclusions are drawn from the analysis.
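The points above can be sketched as a small helper that only reports the summaries meaningful at a given level. The function name `summarize`, the strict rule of withholding the mean for ordinal data, and the sample data are all assumptions made for illustration, not a standard API.

```python
from statistics import mean, median, mode

def summarize(values, level):
    """Return only the summary statistics that are meaningful for the level."""
    summary = {"mode": mode(values)}             # counting works at every level
    if level in ("ordinal", "interval", "ratio"):
        summary["median"] = median(values)       # requires an ordering
    if level in ("interval", "ratio"):
        summary["mean"] = mean(values)           # requires meaningful differences
    return summary

# Hypothetical data for three of the levels.
colors = ["red", "blue", "red", "green"]   # nominal
ratings = [1, 3, 3, 4, 5]                  # ordinal (coded ranks)
weights_kg = [61.5, 70.0, 70.0, 82.5]      # ratio

print(summarize(colors, "nominal"))
print(summarize(ratings, "ordinal"))
print(summarize(weights_kg, "ratio"))
```

Encoding the rule in one place makes it harder to compute a mean on coded categories by accident.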

P-value Definition, Interpretation, Introduction, Significance

In this post, we will discuss the P-value definition, interpretation, introduction, and some related examples.

P-value Definition

The P-value, also known as the observed level of significance or the exact level of significance, helps to determine the significance of results from a hypothesis test. Some textbooks also describe it as the exact probability of committing a type-I error (the probability of rejecting $H_0$ when it is true), but that reading is misleading, as discussed at the end of this post. The P-value is the probability of obtaining the observed sample results, or a more extreme result, when the null hypothesis (a statement about the population) is true.

In technical words, one can define the P-value as the lowest level of significance at which a null hypothesis can be rejected. If the P-value is very small or less than the threshold value (the chosen level of significance), the observed data are considered inconsistent with the assumption that the null hypothesis is true, and the null hypothesis is rejected in favor of the alternative hypothesis. A P-value is a number between 0 and 1.

Usual P-value Interpretation

  • A small P-value (<0.05) indicates strong evidence against the null hypothesis
  • A large P-value (>0.05) indicates weak evidence against the null hypothesis.
  • A P-value very close to the cutoff (say 0.05) is considered to be marginal.

Suppose the P-value of a certain test statistic is 0.002. This means that, if the null hypothesis were true, a result at least as extreme as the one observed would occur only about 0.2 percent of the time, that is, about 2 times in 1,000. For a given sample size, as | t | (or any test statistic) increases the P-value decreases, so one can reject the null hypothesis with increasing confidence.
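The relationship between the size of the test statistic and the P-value can be sketched as follows. To stay within the Python standard library, the sketch assumes a test statistic that is standard normal under the null hypothesis (a z statistic) rather than a t statistic; the pattern is the same, only the reference distribution differs.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Two-sided P-value for a test statistic that is standard normal under H0."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Larger absolute values of the statistic give smaller P-values.
for z in (0.5, 1.0, 1.96, 2.58, 3.09):
    print(f"|z| = {z:<4}  P-value = {two_sided_p_value(z):.4f}")
```

The sign of the statistic does not matter for a two-sided test, which is why the function works with the absolute value.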

P-value and Significance Level

If the significance level ($\alpha$, i.e. the probability of a type-I error) is set equal to the P-value of the test statistic, there is no conflict between the two values. In other words, rather than fixing $\alpha$ arbitrarily at some level such as 5% or 10%, one can simply report the P-value of the test statistic. For example, if the P-value of the test statistic is about 0.145, one can reject the null hypothesis at this exact significance level, provided one is willing to accept a 14.5% chance of rejecting a null hypothesis that is in fact true.

The P-value addresses only one question: how likely are your data, assuming a true null hypothesis? It does not measure support for the alternative hypothesis.

Most authors refer to a P-value<0.05 as statistically significant and a P-value<0.001 as highly statistically significant (data this extreme would arise less than once in a thousand times if the null hypothesis were true).

The P-value is often misinterpreted as the probability of making a mistake by rejecting a true null hypothesis (a Type-I error). The P-value cannot be the error rate because:

The P-value is calculated based on the assumption that the null hypothesis is true and that the difference in the sample is by random chance. Consequently, a p-value cannot tell about the probability that the null hypothesis is true or false because it is 100% true from the perspective of the calculations.
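A small simulation can make this distinction concrete. The sketch below generates many data sets for which the null hypothesis really is true and tests each one. The z-test with known standard deviation, the sample size of 30, and the 2,000 repetitions are assumptions chosen to keep the example within the standard library.

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(1)
alpha = 0.05
n, n_experiments = 30, 2_000
std_normal = NormalDist()

def z_test_p_value(sample):
    """Two-sided P-value for H0: mean = 0, assuming a known standard deviation of 1."""
    z = fmean(sample) * len(sample) ** 0.5
    return 2 * (1 - std_normal.cdf(abs(z)))

# Every data set is generated with the null hypothesis true (the mean is 0).
p_values = [z_test_p_value([rng.gauss(0, 1) for _ in range(n)])
            for _ in range(n_experiments)]

rejection_rate = sum(p < alpha for p in p_values) / n_experiments
print(f"Share of true nulls rejected at alpha={alpha}: {rejection_rate:.3f}")
print(f"Smallest P-value observed: {min(p_values):.4f}")
```

The share of rejections should settle near the chosen $\alpha$ regardless of how small any individual P-value happens to be, which is why the long-run Type-I error rate is governed by $\alpha$ and not by the P-value of a single test.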

https://itfeature.com
