Quote of the Day

Statistics can be made to prove anything - even the truth.
~itfeature.com
 

Introduction to Statistics

The word statistics was first used by the German scholar Gottfried Achenwall in the middle of the 18th century to denote the science of statecraft, which concerned the collection and use of data by the state.

The word statistics comes from the Latin word “Status”, the Italian word “Statista”, the German word “Statistik” or the French word “Statistique”, meaning a political state. It originally meant information useful to the state, such as the sizes of populations (human, animal, products, etc.) and of the armed forces.

According to the pioneer statistician Yule, the word statistics first occurred in the book “The Elements of Universal Erudition” by Baron J. F. von Bielfeld (1770). In 1787 a wider definition was used by E.A.W. Zimmermann in “A Political Survey of the Present State of Europe”. The word appeared in the Encyclopaedia Britannica in 1797 and was used by Sir John Sinclair in Britain in a series of volumes, published between 1791 and 1799, giving a statistical account of Scotland. In the 19th century, the word statistics acquired a wider meaning, covering numerical data on almost any subject as well as the interpretation of data through appropriate analysis.

Today the word statistics is used in three different senses.

  • Statistics refers to “numerical facts that are arranged systematically in the form of tables or charts, etc.” In this sense it is always used in the plural, i.e. as a set of numerical information; for instance, statistics of prices, road accidents, crimes, births, educational institutions, etc.
  • The word statistics is defined as a discipline that includes the procedures and techniques used to collect, process and analyze numerical data in order to make inferences and reach appropriate decisions in situations of uncertainty (uncertainty refers to incompleteness; it does not imply ignorance). In this sense the word statistics is used in the singular. It denotes the science of basing decisions on numerical data.
  • The word statistics also refers to numerical quantities calculated from sample observations; a single such quantity, for example the mean, is called a statistic. Here the word statistics is the plural of statistic.

“We compute statistics from statistics by statistics”

In this sentence, the first “statistics” is the plural of statistic (quantities calculated from a sample), the second is the plural sense (numerical data), and the third is the singular sense (statistical methods).
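
To make the three senses concrete, here is a minimal Python sketch. The sample values are made up purely for illustration; the functions come from Python's standard `statistics` module, which plays the role of the "methods".

```python
from statistics import mean, median, stdev

# Hypothetical sample observations: "statistics" in the sense of numerical data.
sample = [12, 15, 11, 18, 14, 16, 13]

# Each quantity computed from the sample is a single "statistic".
sample_mean = mean(sample)
sample_median = median(sample)
sample_sd = stdev(sample)  # sample standard deviation (n - 1 in the denominator)

print(f"mean={sample_mean:.2f}, median={sample_median}, sd={sample_sd:.2f}")
```

Together, the mean, median and standard deviation are "statistics" in the third sense listed above.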


Difference between a probability value and the significance level?

In hypothesis testing, the goal is to see whether the probability value is less than or equal to the significance level (i.e., whether p ≤ alpha).

  • The probability value (also called the p-value) is the probability of obtaining the result observed in your research study, or an even more extreme result, under the assumption that the null hypothesis is true.
  • In hypothesis testing, the researcher assumes that the null hypothesis is true and then sees how often the observed finding would occur if this assumption were true (i.e., the researcher determines the p-value).
  • The significance level (also called the alpha level) is the cutoff value the researcher selects and then uses to decide when to reject the null hypothesis.
  • Most researchers select the significance or alpha level of .05 to use in their research; hence, they reject the null hypothesis when the p-value is less than or equal to .05.
  • The key idea of hypothesis testing is that you reject the null hypothesis when the p-value is less than or equal to the chosen significance level (usually .05).
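
The decision rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a two-sided one-sample z-test with a known population standard deviation, and the function name `two_sided_p_value` and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(sample_mean, null_mean, sigma, n):
    """p-value of a two-sided one-sample z-test (sigma assumed known)."""
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    # Probability, if the null is true, of a result at least as extreme as z.
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05  # significance level, chosen before looking at the data
p = two_sided_p_value(sample_mean=103.0, null_mean=100.0, sigma=10.0, n=50)
reject_null = p <= alpha

print(f"p = {p:.4f}, reject null: {reject_null}")
```

The p-value is computed from the data, whereas alpha is fixed in advance by the researcher; the comparison `p <= alpha` is the whole decision rule.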

Testing of Hypothesis

The researcher is similar to a prosecuting attorney in the sense that the researcher brings the null hypothesis “to trial” when she believes there is probably strong evidence against the null.

  • Just as the prosecutor usually believes that the person on trial is not innocent, the researcher usually believes that the null hypothesis is not true.
  • In the court system the jury must assume (by law) that the person is innocent until the evidence clearly calls this assumption into question; analogously, in hypothesis testing the researcher must assume (in order to use hypothesis testing) that the null hypothesis is true until the evidence calls this assumption into question.

Multiple Regression Analysis

In multiple regression, the unstandardized regression coefficient is interpreted as the predicted change in Y (i.e., the DV) given a one unit change in X (i.e., the IV) while controlling for the other independent variables included in the equation.

  • The regression coefficient in multiple regression is called the partial regression coefficient because the effects of the other independent variables have been statistically removed or taken out (“partialled out”) of the relationship.
  • If standardized partial regression coefficients are used, the coefficients can be compared as an indicator of the relative importance of the independent variables (i.e., the variable whose coefficient has the largest absolute value is treated as the most important, the one with the second largest as the second most important, and so on).

Simple Regression Analysis

The basic or unstandardized regression coefficient is interpreted as the predicted change in Y (i.e., the DV) given a one unit change in X (i.e., the IV). It is expressed in units of the dependent variable per unit of the independent variable.

  • Note that there is another form of the regression coefficient that is important: the standardized regression coefficient. In simple regression, the standardized coefficient varies from –1.00 to +1.00, just like a simple correlation coefficient.
  • If the regression coefficient is in standardized units, then in simple regression the regression coefficient is the same thing as the correlation coefficient.

How do you write the null and alternative hypotheses for each of the following:

1) The t-test for independent samples,
2) One-way analysis of variance,
3) The t-test for correlation coefficients,
4) The t-test for a regression coefficient.

In each of these, the null hypothesis says there is no relationship and the alternative hypothesis says that there is a relationship.

  1. In this case the null hypothesis says that the two population means (i.e., \mu_1 and \mu_2) are equal; the alternative hypothesis says that they are not equal.
  2. In this case the null hypothesis says that all of the population means are equal; the alternative hypothesis says that at least two of the means are not equal.
  3. In this case the null hypothesis says that the population correlation (i.e., \rho) is zero; the alternative hypothesis says that it is not equal to zero.
  4. In this case the null hypothesis says that the population regression coefficient (\beta) is zero, and the alternative says that it is not equal to zero.
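
Written symbolically, the four pairs of hypotheses above might look as follows. The symbol k for the number of groups in the analysis of variance is an assumption introduced here; the other symbols are those used in the list.

```latex
\begin{align*}
\text{1) Independent-samples $t$-test:} \quad & H_0: \mu_1 = \mu_2 && H_1: \mu_1 \neq \mu_2 \\
\text{2) One-way analysis of variance:} \quad & H_0: \mu_1 = \mu_2 = \cdots = \mu_k && H_1: \mu_i \neq \mu_j \text{ for at least one pair } i \neq j \\
\text{3) Correlation coefficient:} \quad & H_0: \rho = 0 && H_1: \rho \neq 0 \\
\text{4) Regression coefficient:} \quad & H_0: \beta = 0 && H_1: \beta \neq 0
\end{align*}
```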

Type I and Type II Errors

In hypothesis testing there are two possible errors we can make: Type I and Type II errors.

  • A Type I error occurs when you reject a true null hypothesis (remember that when the null hypothesis is true you hope to retain it).
  • A Type II error occurs when you fail to reject a false null hypothesis (remember that when the null hypothesis is false you hope to reject it).
  • The best way to be able to set a low alpha level (i.e., to have a small chance of making a Type I error) while still having a good chance of rejecting the null when it is false (i.e., a small chance of making a Type II error) is to increase the sample size.
  • The key in hypothesis testing is to use a large sample in your research study rather than a small sample!
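
The effect of sample size can be sketched with a small power calculation. This is an illustration under stated assumptions, not a general recipe: it uses a two-sided one-sample z-test with known standard deviation, and the function name `power_two_sided_z` and the effect size of 0.5 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided_z(effect, n, sigma=1.0, alpha=0.05):
    """Probability of rejecting the null in a two-sided one-sample z-test.

    `effect` is the true difference between the population mean and the
    null value. With effect = 0 the null is true, so the result is the
    Type I error rate (alpha).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * sqrt(n) / sigma
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

for n in (10, 30, 100):
    power = power_two_sided_z(effect=0.5, n=n)
    print(f"n = {n:3d}: power = {power:.3f}, Type II error rate = {1 - power:.3f}")
```

With alpha held at .05, the Type II error rate shrinks as n grows, which is the point made in the list above.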

If you do reject your null hypothesis, then it is also essential that you determine whether the size of the relationship is practically significant.


Type I Error

The .05 significance level has become part of the statistical hypothesis testing culture.

  • It is a longstanding convention.
  • It reflects a concern over making type I errors (i.e., wanting to avoid the situation where you reject the null when it is true, that is, wanting to avoid “false positive” errors).
  • If you set the significance level at .05, then you will only reject a true null hypothesis 5% of the time (i.e., you will only make a type I error 5% of the time) in the long run.
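
The "long run" claim can be checked with a small simulation. This is a sketch under stated assumptions: samples are drawn from a standard normal population for which the null hypothesis (mean 0) is true, a two-sided z-test with known standard deviation is used, and the function name `type_i_error_rate` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_i_error_rate(trials=20000, n=20, alpha=0.05, seed=0):
    """Share of simulated studies that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The null is true by construction: population mean 0, sd 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * sqrt(n)  # (sample mean - 0) / (1 / sqrt(n))
        if abs(z) >= z_crit:  # equivalent to p <= alpha
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

Over many repeated studies the observed rejection rate settles close to the chosen alpha.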

Point and Interval Estimation

Whether point estimation or interval estimation is more useful is a matter of opinion.

  • Point estimation is nice because it provides an exact point estimate of the population value. It provides you with the single best guess of the value of the population parameter.
  • Interval estimation is nice because it allows you to make statements of confidence that an interval will include the true population value.
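
The two kinds of estimate can be computed side by side. The sketch below makes several assumptions: the sample is hypothetical, the function name `mean_with_interval` is made up, and the interval uses the normal approximation. For a small sample like this one, a t-based interval would be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_with_interval(sample, confidence=0.95):
    """Point estimate of the mean plus a normal-approximation confidence interval."""
    n = len(sample)
    point = mean(sample)                 # single best guess of the population mean
    se = stdev(sample) / sqrt(n)         # estimated standard error of the mean
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return point, (point - z * se, point + z * se)

# Hypothetical sample observations.
sample = [12, 15, 11, 18, 14, 16, 13, 17, 15, 14]
point, (lower, upper) = mean_with_interval(sample)

print(f"point estimate: {point:.2f}")
print(f"95% interval:   ({lower:.2f}, {upper:.2f})")
```

Raising `confidence` to 0.99 widens the interval: more confidence is bought with less precision.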

Basic Statistics

Two general rules relate the mean and the median to the skewness of the data:

  • 1) If the mean is less than the median, the data are skewed to the left, and
  • 2) If the mean is greater than the median, the data are skewed to the right.

Therefore, if the mean is much greater than the median the data are probably skewed to the right.
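
These rules of thumb are easy to apply in code. In the sketch below the data sets and the function name `skew_direction` are hypothetical, and the comparison is only the heuristic described above, not a formal measure of skewness.

```python
from statistics import mean, median

def skew_direction(data):
    """Apply the mean-versus-median rule of thumb to a data set."""
    m, med = mean(data), median(data)
    if m > med:
        return "right"
    if m < med:
        return "left"
    return "no clear skew"

# Hypothetical data with one very large value pulling the mean upward.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 50]
# The same shape mirrored: one very small value pulling the mean downward.
left_tailed = [-50, 1, 2, 2, 3, 3, 3, 4]

print(mean(right_tailed), median(right_tailed), skew_direction(right_tailed))
print(mean(left_tailed), median(left_tailed), skew_direction(left_tailed))
```

A single extreme value moves the mean much more than the median, which is why the gap between the two points in the direction of the long tail.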

© 2012 itfeature.com Suffusion theme by Sayontan Sinha