
Statistical significance is important, but it is not the most important consideration in evaluating results, because it tells only how likely results at least as extreme as those observed would be if chance alone were at work. It is therefore important to also consider the effect size when you obtain statistically significant results.

Effect size is a quantitative measure of the magnitude of some phenomenon. For example,

Correlation between two variables

The regression coefficients of a regression model, for example, $\beta_1, \beta_2, \cdots$

The mean difference between two or more groups

The risk with which something happens

The effect size plays an important role in power analysis, sample size planning, and meta-analysis.

Effect size is an indicator of how strong (or how important) our results are. Therefore, when you report the statistical significance of an inferential test, the effect size should also be reported.

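For the difference in means, the pooled standard deviation (also called the combined standard deviation, obtained from the pooled variance) is used to compute the effect size. Cohen's effect size ($d$) for the difference in means is
$d=\frac{\bar{X}_1-\bar{X}_2}{s_{pooled}}, \quad s_{pooled}=\sqrt{\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}}$
where $\bar{X}_1, \bar{X}_2$ are the group means, $s_1^2, s_2^2$ are the group variances, and $n_1, n_2$ are the group sizes.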

Cohen provided rough guidelines for interpreting the effect size.

If $d=0.2$, the effect size is considered small.

If $d=0.5$, the effect size is considered medium.

If $d=0.8$, the effect size is considered large.
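
As a minimal sketch of the calculation, the following Python snippet computes Cohen's $d$ from two independent samples using the pooled standard deviation. The function name `cohens_d` and the two small data sets are hypothetical and invented purely for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group1, group2):
    """Cohen's d for two independent samples, based on the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    # statistics.variance returns the sample variance (divisor n - 1)
    s1_sq, s2_sq = variance(group1), variance(group2)
    pooled_sd = sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
    return (mean(group1) - mean(group2)) / pooled_sd


# Hypothetical scores, invented for illustration only
treatment = [23, 25, 28, 30, 32]
control = [20, 22, 25, 27, 29]

d = cohens_d(treatment, control)
print(round(d, 2))  # 0.82
```

Under Cohen's rough guidelines, a value of about 0.8 would be read as a large effect, although with only five observations per group such an estimate is very imprecise.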

Note that statistical significance is not the same as effect size. Statistical significance tells how likely it is that a result at least as extreme as the one observed would arise by chance alone, while effect size tells how large (and hence how practically important) the result is.
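
The distinction can be illustrated with a rough sketch. It assumes a simplified two-sample z-test with equal group sizes and a known common standard deviation, which is a simplification chosen for illustration and not a procedure from the text above. The function name `two_sided_p_from_d` and the chosen values of $d$ and $n$ are hypothetical.

```python
from math import erfc, sqrt


def two_sided_p_from_d(d, n_per_group):
    """Two-sided p-value of a z-test for two equal-sized groups with known common SD."""
    # With d = (mean difference) / sigma, the test statistic is z = d * sqrt(n / 2)
    z = d * sqrt(n_per_group / 2)
    # P(|Z| > |z|) for a standard normal Z
    return erfc(abs(z) / sqrt(2))


# A negligible effect with a very large sample
p_small_effect_big_n = two_sided_p_from_d(0.02, 100_000)

# A large effect with a very small sample
p_large_effect_small_n = two_sided_p_from_d(0.8, 10)

print(p_small_effect_big_n < 0.001)   # True: statistically significant, yet d is tiny
print(p_large_effect_small_n > 0.05)  # True: not significant at 5%, yet d is large
```

Under these assumptions, a very small effect becomes highly significant once the sample is large enough, while a large effect can fail to reach significance in a small sample. This is why the p-value and the effect size should be reported together.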

Also note that statistical significance does not imply economic, human, or scientific significance.

From the Analyze menu of SPSS, the Crosstabs procedure under Descriptive Statistics is used to create contingency tables (also known as two-way frequency tables or cross-tabulations), which describe the association between two categorical variables.

In a crosstab, the categories of one variable determine the rows of the contingency table, and the categories of the other variable determine the columns. The contingency table dimensions can be reported as $R\times C$, where $R$ is the number of categories of the row variable, and $C$ is the number of categories of the column variable. Additionally, a “square” crosstab is one in which the row and column variables have the same number of categories. Tables of dimensions $2 \times 2$, $3\times 3$, $4\times 4$, etc., are all square crosstabs.

To perform the Chi-Square test on a cross-tabulation in SPSS, first click Analyze in the main menu, then Descriptive Statistics, and then Crosstabs, as shown in the figure below.

As an example, we use the “satisf.sav” data file that is already available in the SPSS installation folder. Suppose we are interested in finding the relationship between the “Shopping Frequency” and “Made Purchase” variables. For this purpose, move one of the variables from the left pane to the Row(s) box in the right pane and the other to the Column(s) box. Here, we take “Shopping Frequency” as the row variable and “Made Purchase” as the column variable. Pressing OK at this point will give the contingency table only.

The ROW(S) box is used to enter one or more variables to be used as rows in the cross-table and Chi-Square statistics. Similarly, the COLUMN(S) box is used to enter one or more variables to be used as columns. Note: at least one row variable and one column variable must be specified.

When you need to find the association between three or more variables, the Layer box is used. When a layer variable is specified, the crosstab between the row and column variables is created at each level of the layer variable. You can have multiple layers of variables by specifying the first layer variable and then clicking Next to specify the second layer variable. Alternatively, you can try out several variables as single layers, one at a time, by putting them all in the “Layer 1 of 1” box.
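
Outside SPSS, the idea of a layer variable can be sketched in a few lines of Python: one separate crosstab of the row and column variables is built for each level of the layer variable. The records below, including the use of gender as the layer variable, are hypothetical and are not taken from “satisf.sav”.

```python
from collections import Counter, defaultdict

# Hypothetical records: (layer, row, column)
# = (gender, shopping frequency, made purchase)
records = [
    ("Female", "Weekly", "Yes"),
    ("Female", "Weekly", "No"),
    ("Female", "Monthly", "Yes"),
    ("Male", "Weekly", "Yes"),
    ("Male", "Monthly", "No"),
    ("Male", "Monthly", "No"),
]

# One Counter of (row, column) cell counts per level of the layer variable
layered = defaultdict(Counter)
for layer, row, col in records:
    layered[layer][(row, col)] += 1

# Print the cell counts of each layer's crosstab
for layer, cells in layered.items():
    print(layer, dict(cells))
```

Each entry of `layered` is an ordinary two-way table, so any statistic computed for a single crosstab can be computed separately within each layer.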

The STATISTICS button will lead to a dialog box which contains different inferential statistics for finding the association between categorical variables.

The CELLS button opens a dialog box that controls which output is displayed in each cell of the crosstab, such as observed frequencies, expected frequencies, percentages, and residuals, as shown below.

To perform the Chi-Square test on the selected variables, click the “Statistics” button and tick the “Chi-Square” option at the top-left of the dialog box shown below. Note that the Chi-Square check box must be ticked; otherwise, only the cross-table will be displayed.

Press the “Continue” button and then the OK button. The output window will contain the cross-tabulation results and the Chi-Square statistics, as shown below.

The Chi-Square results indicate that there is an association between the categories of the “Shopping Frequency” variable and the “Made Purchase” variable, since the p-value is smaller than, say, the 0.01 level of significance.


Contingency tables (also called two-way frequency tables, crosstabs, or cross-tabulations) are used to find the relationship (association or dependence) between two or more variables measured on the nominal or ordinal scale.

A contingency table contains $R$ rows and $C$ columns, so the order of the contingency table is $R \times C$. There should be a minimum of 2 categories in the row variable and 2 categories in the column variable (not counting the row and column headers).

A cross table is created by listing all the categories (groups or levels) of one variable as rows in the table and the categories (groups or levels) of the other (second) variable as columns, and then entering the joint (cell) frequency (or count) for each cell. The cell frequencies are totaled across both the rows and the columns. These totals (sums) are called marginal frequencies. The sum of the column totals (or of the row totals) is called the grand total and must be equal to $N$. The frequency or count in each cell is the observed frequency.
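
The construction described above can be sketched in Python with the standard library. The paired observations are hypothetical and invented for illustration. The variable names echo the SPSS example but the data are not taken from it.

```python
from collections import Counter

# Hypothetical paired observations: (shopping frequency, made purchase)
pairs = [
    ("Weekly", "Yes"), ("Weekly", "Yes"), ("Weekly", "No"),
    ("Monthly", "Yes"), ("Monthly", "No"), ("Monthly", "No"),
    ("Rarely", "No"), ("Rarely", "No"),
]

observed = Counter(pairs)                   # joint (cell) frequencies
row_totals = Counter(r for r, _ in pairs)   # marginal frequencies of the row variable
col_totals = Counter(c for _, c in pairs)   # marginal frequencies of the column variable
grand_total = len(pairs)                    # N

# Both sets of marginal totals must add up to the grand total
assert sum(row_totals.values()) == sum(col_totals.values()) == grand_total

# Print the table row by row, followed by the row total
for r in row_totals:
    print(r, [observed[(r, c)] for c in col_totals], row_totals[r])
```

A `Counter` returns 0 for a combination that never occurs, so empty cells (here, "Rarely" with "Yes") need no special handling.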

The next step in calculating the Chi-square statistic is the computation of the expected frequency for each cell of the contingency table. The expected value of a cell is computed by multiplying the marginal frequency of its row by the marginal frequency of its column (the row total times the column total) and then dividing by the total number of observations (grand total, $N$). It can be formulated as $\text{Expected Frequency} = \frac{\text{Row Total} \times \text{Column Total}}{\text{Grand Total}}$

The same procedure is used to compute the expected frequencies for all the cells of the contingency table.
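
A minimal sketch of this step in Python is given below. The function name `expected_frequencies` and the $2 \times 2$ table of counts are hypothetical and invented for illustration.

```python
def expected_frequencies(observed):
    """Expected count for every cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical 2 x 2 table of observed counts
observed = [[30, 10],
            [20, 40]]

expected = expected_frequencies(observed)
print(expected)  # [[20.0, 20.0], [30.0, 30.0]]
```

Here the row totals are 40 and 60, the column totals are 50 and 50, and $N=100$, so the first cell has the expected frequency $40 \times 50 / 100 = 20$.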

The next step is to compute the amount of deviation (error) for each cell. For this purpose, subtract the expected cell frequency from the observed cell frequency. The Chi-square contribution of each cell is then computed by squaring this difference and dividing the squared difference by the expected frequency of that cell.

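Finally, the aggregate Chi-square statistic is computed by summing the Chi-square contributions of all the cells. The formula is
$\chi^2=\sum_{i=1}^{R}\sum_{j=1}^{C} \frac{\left(O_{ij}-E_{ij}\right)^2}{E_{ij}}$
where $O_{ij}$ and $E_{ij}$ are the observed and expected frequencies of the cell in row $i$ and column $j$.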
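
The summation over all cells can be sketched in Python as follows. The function name `chi_square_statistic` and the observed counts are hypothetical and serve only as an illustration.

```python
def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over all cells of a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected frequency of cell (i, j)
            chi2 += (o - e) ** 2 / e                         # contribution of cell (i, j)
    return chi2


# Hypothetical 2 x 2 table of observed counts
observed = [[30, 10],
            [20, 40]]

chi2 = chi_square_statistic(observed)
print(round(chi2, 4))  # 16.6667
```

For this table, the expected frequencies are 20, 20, 30, and 30, and every cell deviates from its expected frequency by 10, so the statistic is $100/20 + 100/20 + 100/30 + 100/30 \approx 16.67$.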

To obtain the $\chi^2$ table (critical) value, the degrees of freedom and the level of significance are required. The degrees of freedom for a contingency table are computed as $df=(\text{number of rows} - 1)(\text{number of columns} - 1)$.
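
Instead of looking up a table value, the p-value can be computed directly and compared with the level of significance. The sketch below uses only the Python standard library and relies on the recurrence $Q(a+1, y) = Q(a, y) + y^a e^{-y}/\Gamma(a+1)$ for the upper regularized incomplete gamma function, starting from $Q(1/2, y)=\operatorname{erfc}(\sqrt{y})$ or $Q(1, y)=e^{-y}$. The function names and the example statistic are hypothetical. Statistical software would normally be used for this step.

```python
from math import erfc, exp, gamma, sqrt


def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom of an R x C contingency table."""
    return (n_rows - 1) * (n_cols - 1)


def chi2_p_value(chi2, df):
    """Upper-tail probability P(X >= chi2) for a chi-square variable with integer df >= 1."""
    y = chi2 / 2
    if df % 2 == 0:
        a, q = 1.0, exp(-y)        # Q(1, y) = exp(-y)
    else:
        a, q = 0.5, erfc(sqrt(y))  # Q(1/2, y) = erfc(sqrt(y))
    while a < df / 2:
        # Q(a + 1, y) = Q(a, y) + y^a * exp(-y) / Gamma(a + 1)
        q += y ** a * exp(-y) / gamma(a + 1)
        a += 1
    return q


# Hypothetical example: a 2 x 2 table whose chi-square statistic is 16.6667
df = degrees_of_freedom(2, 2)
p_value = chi2_p_value(16.6667, df)

alpha = 0.01  # level of significance
print(df, p_value < alpha)  # 1 True
```

As a sanity check, the familiar 5% table values 3.841 ($df=1$), 5.991 ($df=2$), and 7.815 ($df=3$) each give a p-value of approximately 0.05 with this function.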
