Data Collection Methods

There are many ways to collect data. The methods (sources) of data collection used in statistical inference can be classified into four main types.


The four Data Collection Methods are (i) Survey Method, (ii) Simulation, (iii) Controlled Experiments, and (iv) Observational Study. Let us discuss them one by one in detail.

(i) Survey Method

A very popular and widely used method is the survey, in which specially trained people go out and record observations such as the number of vehicles traveling along a road, the acres of fields that farmers are using to grow a particular food crop, the number of households that own more than one motor vehicle, the number of passengers using Metro transport, and so on. Here the person making the study has no direct control over generating the data that are recorded, although the recording methods need care and control.

(ii) Simulation

Simulation is also one of the most important data collection methods. In simulation, a computer model of the operation of a system is set up and run to generate data. For example, the model may describe an industrial system in which an important measurement is the percentage purity of a chemical product. A very large number of realizations of the model can be run to look for any pattern in the results. The success of the approach depends on how well the model can explain that measurement, and this has to be tested by carrying out at least a small amount of work on the actual system in operation.
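To make the idea of running many realizations concrete, here is a minimal sketch in Python. The purity model is purely hypothetical: the assumed peak at 150 degrees, the coefficients, and the noise levels are illustrative choices and do not come from any real plant. In practice, the model would have to be checked against measurements from the actual system.

```python
import random
import statistics


def simulate_purity(rng, mean_temp=150.0, temp_sd=2.0):
    """Return one realization of a toy (hypothetical) purity model, in percent."""
    # Assumption: the operating temperature varies a little from run to run.
    temp = rng.gauss(mean_temp, temp_sd)
    # Assumption: purity peaks at 150 degrees and drops as temperature drifts away.
    purity = 95.0 - 0.3 * abs(temp - 150.0)
    # Assumption: the purity measurement itself carries some random noise.
    purity += rng.gauss(0.0, 0.5)
    # A percentage must stay between 0 and 100.
    return max(0.0, min(100.0, purity))


def run_simulation(n_runs=10_000, seed=1):
    """Run the toy model n_runs times with a fixed seed for reproducibility."""
    rng = random.Random(seed)
    return [simulate_purity(rng) for _ in range(n_runs)]


results = run_simulation()
print(f"runs:        {len(results)}")
print(f"mean purity: {statistics.mean(results):.2f}%")
print(f"std dev:     {statistics.stdev(results):.2f}")
```

Fixing the seed makes the run repeatable, which helps when comparing the simulated pattern with the small amount of real data collected from the system.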

(iii) Controlled Experiments

An experiment is possible when the background conditions can be controlled, at least to some extent. For example, we may be interested in choosing the best type of grass seed to use on a sports field.

The first stage of the work is to grow all the competing varieties of seed at the same place and make suitable records of their growth and development. The competing varieties should be grown in quite small units close together in the field, as in the figure below.

This is a controlled experiment, and it is subject to certain constraints, such as:

i) A river on the right side
ii) The shadow of trees on the left side
iii) Three different varieties (say, $v_1, v_2, v_3$) distributed over 12 units.

The layout in the diagram below gives much more control of local environmental conditions than there would have been if one variety had been placed in the strip in the shelter of the trees, another close to the river, and the third in the more exposed center of the field.

There are 3 experimental units: one is close to the stream, another is close to the trees, and the third lies between them, in a more favorable position than the others. It is our choice which variety to place in which position.
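One common way to spread three varieties over 12 small units is to group the units into blocks and randomize the order of the varieties within each block. The sketch below assumes 4 blocks of 3 neighbouring plots each; this grouping is an assumption made for illustration and may differ from the layout in the original figure. The function name `randomized_block_layout` is hypothetical.

```python
import random

VARIETIES = ["v1", "v2", "v3"]


def randomized_block_layout(n_blocks=4, seed=None):
    """Return a list of blocks; each block holds every variety once, in random order."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        block = VARIETIES[:]   # copy so the master list is never modified
        rng.shuffle(block)     # random order of varieties inside this block
        layout.append(block)
    return layout


layout = randomized_block_layout(seed=42)
for number, block in enumerate(layout, start=1):
    print(f"block {number}: {' '.join(block)}")
```

Because every variety appears once in every block, no variety is confined to the sheltered strip by the trees or to the strip by the river, so differences in growth are less likely to reflect position alone.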

(iv) Observational Study

Like experiments, observational studies try to understand cause-and-effect relationships. However, unlike experiments, the researcher is not able to control (1) how subjects are assigned to groups and/or (2) which treatments each group receives.

Note that small units of land or plots are called experimental units or simply units.

There is no “right” size for a unit; it depends on the type of crop, the work that is to be done on it, and the measurements that are to be taken. The measurements on which inferences are eventually going to be based should be taken as accurately as possible. The unit must, therefore, not be so large as to make recording very tedious, because that leads to errors and inaccuracy. On the other hand, if a unit is very small, there is the danger that relatively minor physical errors in recording can lead to large percentage errors.
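The effect of unit size on percentage error can be shown with a short calculation. The figures below (a recording error of 0.5 kg on plot yields of 5 kg and 50 kg) are made-up numbers chosen only to illustrate the point.

```python
def percentage_error(absolute_error, true_value):
    """Express an absolute recording error as a percentage of the true value."""
    return 100.0 * absolute_error / true_value


# Hypothetical yields: the same 0.5 kg recording error on a small and a large unit.
small_unit = percentage_error(0.5, 5.0)
large_unit = percentage_error(0.5, 50.0)

print(f"small unit: {small_unit:.1f}% error")
print(f"large unit: {large_unit:.1f}% error")
```

The same physical error is ten times larger in percentage terms on the unit that is ten times smaller.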

Experimenters, and the statisticians who collaborate with them, need to gain a good knowledge of their experimental material or units as a research program proceeds.
