## Short Questions and Answers about Basic Statistics

This page contains short questions and answers about introductory statistics, including the history of statistics, the meaning of statistics, symbols and notation, data and variables, and the uses and limitations of statistics.

**Meaning of Statistics**

The word **Statistics** comes from the Latin word *status*, meaning political state. It originally meant information useful to the state, but the word is now used with several different meanings.

- In the first place, the word *statistics* refers to "numerical facts arranged systematically".
- In the second place, the word *statistics* is defined as a discipline that includes all the procedures and techniques used to collect, process, and analyze numerical data and to make inferences, so that appropriate decisions can be made in the face of uncertainty.
- Thirdly, the word *statistics* refers to numerical quantities calculated from sample observations.

**Formal Definition of Statistics**

Statistics is a branch of mathematics that involves the collection, organization, interpretation, and presentation of data in order to draw inferences about the population from which the data were collected.

**Branches of Statistics**

As a subject, **Statistics** can be divided into *Descriptive Statistics* and *Inferential Statistics*.

**Descriptive Statistics**

Descriptive statistics is the branch of statistics that deals with concepts and methods for summarizing and describing the important aspects of numerical data. Here the data are condensed into graphs, tables, and numerical quantities that provide information about the center of the data and indicate the dispersion of the observations.
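As a rough illustration of condensing data into a few numerical quantities, the sketch below computes measures of center and dispersion with Python's standard `statistics` module. The `heights` values and the `describe` helper are made up for this example.

```python
import statistics

# Hypothetical sample: heights (cm) of ten students.
heights = [150, 152, 155, 158, 160, 160, 162, 165, 168, 170]

def describe(data):
    """Return simple measures of center and dispersion for a list of numbers."""
    return {
        "n": len(data),
        "mean": statistics.mean(data),        # measure of center
        "median": statistics.median(data),    # measure of center
        "range": max(data) - min(data),       # measure of dispersion
        "sample_sd": statistics.stdev(data),  # measure of dispersion
    }

summary = describe(heights)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The mean and median describe where the data are centered, while the range and standard deviation describe how spread out the observations are.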

**Inferential Statistics**

Inferential statistics is the branch of statistics that deals with procedures for making inferences about the characteristics of a larger group of data, the population, from knowledge derived from only a part of the data, i.e. a sample. It covers the estimation of population parameters and the testing of hypotheses, both of which are based on probability theory. Because inferences are made on the basis of sample evidence, they cannot be absolutely certain.
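A minimal sketch of estimation, assuming a small made-up sample: the sample mean serves as a point estimate of the unknown population mean, and the standard error gives a rough idea of how uncertain that estimate is. The data and variable names are illustrative only.

```python
import math
import statistics

# Hypothetical sample of 12 observations (e.g. weights in kg).
sample = [61, 64, 58, 70, 66, 63, 59, 68, 65, 62, 67, 65]

n = len(sample)
point_estimate = statistics.mean(sample)   # estimate of the population mean
sample_sd = statistics.stdev(sample)       # sample standard deviation
standard_error = sample_sd / math.sqrt(n)  # uncertainty of the estimate

print("point estimate of the population mean:", point_estimate)
print("standard error of the estimate:", round(standard_error, 3))
```

A non-zero standard error reflects the point made above: a conclusion based on a sample is never absolutely certain.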

- It deals only with the behavior of aggregates or large groups of data. It has nothing to do with what is happening to a particular individual or object in the aggregate.
- It deals with aggregates of observations of the same kind rather than isolated figures.
- Statistics deals with variability that obscures the underlying patterns, i.e. makes them difficult to find.
- It deals with uncertainties (probability), as every process of obtaining observations, whether controlled or uncontrolled, involves deficiencies or chance variation.
- It deals with those aspects of things which can be described numerically either by counts or by measurements.
- Statistics deals with aggregates that are subject to a number of random causes; for example, the heights of persons are affected by race, ancestry, age, diet, habits, climate, etc.
- It deals with those characteristics of things which can be numerically described.
- Statistical laws are valid on the average or in the long run. There is no guarantee that a certain law will hold in all cases.
- Statistical results might be misleading or incorrect if sufficient care is not exercised in collecting, processing, and interpreting the data.

- A modern administrator, whether in the public or private sector, leans on statistical data to provide a factual basis for making appropriate decisions.
- A politician uses statistics advantageously to lend support and credence to his arguments while elucidating the problems he handles.
- A businessman, an industrialist, and a research worker all employ statistical methods in their work. Banks, insurance companies, and governments all have their own statistics departments.
- A social scientist uses statistical methods in various areas of the socio-economic life of a nation.

Uncertainty refers to the incompleteness and instability of the available data. It does not imply ignorance. Decisions in statistics are made on the basis of sample evidence, and a sample cannot provide absolutely certain information.

A statistic is a quantity calculated from sample data. It is used to provide information about unknown values in the population from which the sample is drawn. For example, the average (mean value) of the data in a sample is used to give information about the overall average in the population from which that sample was taken.

More than one sample can be drawn from the same population, so in general the value of a statistic varies from sample to sample; i.e. the average values in several samples drawn from the same population will not necessarily be equal. Statistics are usually denoted by Roman (English) letters (e.g. $\overline{X}$, $s^2$).

A parameter is a quantity calculated from population data. Parameters are denoted by Greek letters. For example, the population mean is represented by $\mu$ and the population variance by $\sigma^2$. A parameter is a constant because, for a given population, its value does not change.
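The difference between a fixed parameter and a varying statistic can be sketched with a small simulation. The population of marks, the sample size, and the `sample_means` helper below are all invented for illustration; the seed only makes the run repeatable.

```python
import random
import statistics

random.seed(42)  # fixed seed so that repeated runs give the same samples

# Hypothetical population: 1,000 exam marks between 0 and 100.
population = [random.randint(0, 100) for _ in range(1000)]
population_mean = statistics.mean(population)  # parameter: one fixed value

def sample_means(pop, sample_size, n_samples):
    """Draw several simple random samples and return the mean of each."""
    return [
        statistics.mean(random.sample(pop, sample_size))
        for _ in range(n_samples)
    ]

means = sample_means(population, sample_size=30, n_samples=5)

print("population mean (parameter):", population_mean)
for i, m in enumerate(means, start=1):
    print(f"sample {i} mean (statistic): {m:.2f}")
```

The population mean is computed once and stays the same, while each sample of 30 marks generally yields a different sample mean.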

A population is any entire collection of people, animals, plants, or things from which we may collect data. It is the entire group we are interested in, which we wish to describe or draw conclusions about.
In order to make any generalizations about a population, a sample, that is meant to be representative of the population, is often studied.
For each population there are many possible samples.

**Example**

Suppose we want to get information from the students of the first-year class of a college; then all the students of the first-year class of that college form our population.

A sample is a group of units selected from a larger group (called the population), i.e. it is a small part of the population. Samples are generally selected to draw valid conclusions about the population under study because the population is usually too large to study in full. The sample should be representative of the population, and it is selected by using a sampling technique. Before collecting the sample, it is important that the researcher carefully and completely defines the population, including a description of the members to be included.

**Example**

The population for a study of infant health might be all children born in Pakistan in 2000. The sample might be all babies born on 7th May of the year 2000.
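Selecting a sample from a defined population can be sketched as follows. This assumes simple random sampling without replacement, which is only one of several sampling techniques, and the roll numbers and the `simple_random_sample` helper are hypothetical.

```python
import random

# Hypothetical population: roll numbers of 200 first-year students.
population = list(range(1, 201))

def simple_random_sample(units, n, seed=None):
    """Select n distinct units without replacement, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(units, n)

sample = simple_random_sample(population, n=20, seed=1)
print(sorted(sample))
```

Every unit in the sample comes from the population, and no unit is selected twice.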

**Nominal Scale** The classification or grouping of observations into mutually exclusive qualitative categories or classes is said to constitute a nominal scale.

**Example:** Students can be classified as male and female. The numbers 1 and 2 can be used as numerical codes to identify these two categories.

**Ordinal Scale** It includes the characteristics of a nominal scale and, in addition, has the property of ordering or ranking the measurements. For example, the performance of students (or players) is rated as excellent, good, fair, or poor. The numbers 1, 2, 3, 4, etc. are also used to indicate ranks. The only relation that holds between any pair of categories is that of "greater than" (or "more preferred").
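The two scales can be contrasted in a short sketch. The codes, ranks, and observations below are invented for illustration; the point is only that counting works on both scales, while sorting by rank is meaningful only on the ordinal scale.

```python
from collections import Counter

# Nominal scale: the codes only label categories; 2 is not "more" than 1.
gender_codes = {"male": 1, "female": 2}
genders = ["male", "female", "female"]
coded_genders = [gender_codes[g] for g in genders]

# Ordinal scale: the ranks label categories and also order them (1 = best).
performance_rank = {"excellent": 1, "good": 2, "fair": 3, "poor": 4}

# Hypothetical ratings of six students.
ratings = ["good", "poor", "excellent", "fair", "good", "excellent"]

# Counting category frequencies is meaningful on both scales.
counts = Counter(ratings)

# Ordering the observations is meaningful only on the ordinal scale.
ordered = sorted(ratings, key=performance_rank.get)

print(coded_genders)
print(counts)
print(ordered)
```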

In *Descriptive Statistics* we usually use graphical and numerical techniques to organize, summarize, and present the information contained in a data set in an informative way. It rests on measures of central tendency and measures of dispersion.

In *Inferential/Inductive Statistics* we use the sample data to make decisions or predictions about a larger population of data.

A characteristic that differs either in quality or quantity from object to object, place to place, or time to time is called a variable. For example, beauty, intelligence, age, year, and inflation are variables.

A characteristic that does not change or differ in quantity from object to object, place to place, or time to time is called a constant. For example, the total marks of a paper and the number of days in a week are constants.

**Quantitative Variable:**

A variable that can be measured numerically, such as height, yield, age, or weight. Data collected on such a variable are called *quantitative data*. Alternatively, a characteristic that differs in quantity from object to object and can be measured is called a quantitative variable. This is true for any variable where we can ask "how much?" or "how many?"

A *quantitative variable* can be classified into two types:

- Continuous variables: variables that can assume any numerical value within an interval, e.g. 1.34, 2.45.
- Discrete variables: variables whose values are countable, e.g. 12, 17, 80.

**Qualitative Variable:**

A variable that cannot assume a numerical value but can be classified into non-numeric categories, for example gender, hair color, health status, and beauty. Data collected on such a variable are called *qualitative data*. A qualitative characteristic is also called an attribute.

Continuous data can increase or decrease continuously.

**DATA** is a collection of facts and figures gathered for a specific purpose and organized for analysis, reasoning, or decision making.

**Attributes:** Qualitative characteristics of a variable are also called attributes; for example, poverty and intelligence are attributes.

*Primary data:* Data in the original form in which they were first collected are called primary data. They are also called raw data or ungrouped data. For example, data collected by NADRA to issue computerized identity cards.

*Secondary data:* Data that already exist, for example in industry-specific reports, previous research on the topic of interest, or an organization's own database. Qualitative sources of secondary data include magazine and newspaper articles and annual reports of industry participants. Secondary data can provide a rich source of information. Alternatively, primary data that have been changed into another form according to the requirements of the investigator are called secondary data; in this sense, secondary data are not collected but obtained by changing the form of primary data.

*Grouped data:* Data presented in the form of a frequency distribution are known as grouped data.
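Turning raw (ungrouped) data into grouped data can be sketched as below. The marks, the class width of 20, and the `frequency_distribution` helper are assumptions made for this example; each class includes its lower limit and excludes its upper limit.

```python
from collections import Counter

# Hypothetical raw (ungrouped) data: marks of 15 students.
marks = [12, 25, 37, 41, 48, 52, 55, 58, 63, 67, 71, 74, 78, 85, 93]

def frequency_distribution(data, width, start=0):
    """Group values into classes [lower, lower + width) and count each class."""
    counts = Counter(start + ((x - start) // width) * width for x in data)
    return {(lower, lower + width): counts[lower] for lower in sorted(counts)}

grouped = frequency_distribution(marks, width=20)
for (lower, upper), freq in grouped.items():
    print(f"{lower} to under {upper}: {freq}")
```

The class frequencies add up to the number of raw observations, so grouping changes the form of the data without dropping any observation.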

*Raw data:* Data that have not been processed in any manner. The term often refers to uncompressed text that is not stored in any proprietary format. It may also refer to recently captured data that may have been placed into a database structure but not yet processed.

- Direct Personal Inquiry
- Indirect Personal Inquiry
- Questionnaire Method
- Through Local Sources
- Collection through Enumerators.

- Government Offices, e.g. Bureau of Statistics
- Semi-Government Offices, e.g. PIA, Banks, etc.
- Publications of Research Organizations
- Journals, Magazines and Newspapers etc.

**Observation:** In statistics, an *observation* often means any sort of numerical information recorded. It may be a physical measurement such as height, weight, or age; a classification (grouping) such as heads or tails; or an answer to a question such as yes or no.

- Banks
- Insurance Companies
- Bureau of Statistics
- Research Institutes