Introduction to SPSS Statistics Software

SPSS is a statistical software package used to analyze data (quantitative or qualitative) and to help interpret the findings. SPSS stands for Statistical Package for the Social Sciences.

In 2009, SPSS was acquired by IBM. Current versions are named “IBM SPSS Statistics” (for example, IBM SPSS Statistics 27).

SPSS is used to analyze data in insurance, banking, telecom, retail, consumer packaged goods, market research, health research, survey companies, government (elections, population, planning), education (including student researchers), finance, and other fields. SPSS can analyze large amounts of data and create tables and graphs.

SPSS is used for statistical tests because it is hard to handle a large amount of data and carry out the required mathematical and statistical computations by hand. It also helps to interpret results, check normality, test hypotheses, compute different averages, and plot simple to complex graphs. SPSS offers a wide range of statistical methods. Some examples are:

1) Define and display missing values in the data

2) Compute descriptive statistics such as frequency distributions

Analyze > Descriptive Statistics > Frequencies > Statistics

For the selected variables of the entered data, one can obtain measures such as the mean, sum, mode, percentiles, quartiles, variance, range and other measures of dispersion, skewness, and kurtosis.
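As a rough illustration of what these measures are, the following minimal Python sketch computes several of them with the standard library. The sample values are made up for the example, and the quartiles use Python's default method, which may differ slightly from the percentile method SPSS applies.

```python
import statistics

# Hypothetical sample values; not taken from any SPSS data file.
values = [2, 4, 4, 5, 7, 9, 11]

summary = {
    "n": len(values),
    "sum": sum(values),
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "mode": statistics.mode(values),
    "variance": statistics.variance(values),  # sample variance (divides by n - 1)
    "std_dev": statistics.stdev(values),
    "range": max(values) - min(values),
    "quartiles": statistics.quantiles(values, n=4),  # three cut points
}

for name, value in summary.items():
    print(f"{name}: {value}")
```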

Statistical Techniques in SPSS

Descriptive Statistics

Several descriptive procedures are available, such as Frequencies, Descriptives, Explore, Crosstabs, and Ratio Statistics. These procedures provide the relevant statistical measures, such as measures of central tendency, dispersion, and position, and measures that describe the shape of the distribution.

Inferential Statistics

Inferential statistics, from basic to advanced, can also be performed in SPSS.

Estimation: Confidence Interval (lower and upper limits) and point estimation (single value).

Hypothesis Testing:

Differences Between Groups: Independent Sample t-test, Paired Sample t-test, One-Way ANOVA, Two-Way ANOVA, Chi-Squared Test for Homogeneity, etc.

Correlation and Association: Pearson’s Correlation, Spearman’s Correlation, Chi-Squared Test of Association, Fisher’s Exact Test of Independence, Odds Ratio, Relative Risk.

Regression Models and Prediction: Simple and Multiple Linear Regression, Stepwise Regression, Logistic Regression, Poisson Regression, etc.

Complex Samples and Testing: computes statistics and standard errors for complex sample designs, visualizes and explores complex categorizations, and imputes missing values through statistical algorithms.

Graphs and Data Visualizations: Line Chart, Histogram, Bar Chart, Pie Chart, Scatter Plot, Box Plot, Area Chart, Q-Q Plot, Simple 3D Bar Chart, Population Pyramid, Frequency Polygon.

Thus, SPSS plays a significant role in analyzing and interpreting data with the help of its statistical features and methods.

Performing Chi Square test from Crosstabs in SPSS

In this post, we will learn how to perform the Chi-Square test in SPSS Statistics. For this purpose, the Crosstabs procedure (under Descriptive Statistics in the Analyze menu) is used to create contingency tables, also known as two-way frequency tables or cross-tabulations, which describe the association between two categorical variables.

In a crosstab, the categories of one variable determine the rows of the contingency table, and the categories of the other variable determine the columns. The contingency table dimensions can be reported as $R\times C$, where $R$ is the number of categories of the row variable and $C$ is the number of categories of the column variable. A “square” crosstab is one in which the row and column variables have the same number of categories; tables of dimensions $2 \times 2$, $3\times 3$, $4\times 4$, etc., are all square crosstabs.
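To make the $R\times C$ idea concrete, here is a minimal Python sketch that builds a contingency table from paired categorical observations and reports its dimensions. The observations and the helper name `crosstab` are hypothetical and chosen only for illustration; this is not how SPSS implements the procedure.

```python
from collections import Counter

# Hypothetical paired observations: (row category, column category).
observations = [
    ("weekly", "yes"), ("weekly", "no"), ("monthly", "yes"),
    ("monthly", "no"), ("monthly", "no"), ("yearly", "no"),
]

def crosstab(pairs):
    """Return (row_labels, col_labels, table) for an R x C contingency table."""
    counts = Counter(pairs)
    row_labels = sorted({r for r, _ in pairs})
    col_labels = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table

rows, cols, table = crosstab(observations)
n_rows, n_cols = len(rows), len(cols)
is_square = n_rows == n_cols

print(f"{n_rows} x {n_cols} table, square: {is_square}")
for label, counts in zip(rows, table):
    print(label, counts)
```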

Performing Chi Square Test in SPSS

To perform the Chi-Square test on a cross-tabulation in SPSS, first click Analyze in the main menu, then Descriptive Statistics, and then Crosstabs, as shown in the figure below.

Performing Chi Square Test Crosstabs in SPSS

As an example, we use the “satisf.sav” data file that is already available in the SPSS installation folder. Suppose we are interested in the relationship between the “Shopping Frequency” and “Made Purchase” variables. Move one of the variables from the left pane to the Row(s) box on the right, and the other to the Column(s) box. Here, “Shopping Frequency” is the row variable and “Made Purchase” is the column variable. Pressing OK at this point gives the contingency table only.

Crosstabs in SPSS

The ROW(S) box is used to enter one or more variables to be used as rows in the cross-table and Chi-Square statistics. Similarly, the COLUMN(S) box is used to enter one or more variables to be used as columns. Note: at least one row variable and one column variable must be specified.

The Layer box is used when you need to find the association between three or more variables. When a layer variable is specified, the crosstab between the row and column variables is created at each level of the layer variable. You can have multiple layers of variables by specifying the first layer variable and then clicking Next to specify the second layer variable. Alternatively, you can try out several variables as single layers, one at a time, by putting them all in the “Layer 1 of 1” box.
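The effect of a layer variable can be sketched in Python as one separate row-by-column count table per level of the layer. The records and category names below are hypothetical; the sketch only mirrors the idea described above.

```python
from collections import Counter, defaultdict

# Hypothetical records: (layer value, row value, column value).
records = [
    ("male", "weekly", "yes"), ("male", "weekly", "no"),
    ("male", "monthly", "no"), ("female", "weekly", "yes"),
    ("female", "monthly", "yes"), ("female", "monthly", "no"),
]

# One Counter of (row, column) pairs per level of the layer variable.
tables_by_layer = defaultdict(Counter)
for layer, row, col in records:
    tables_by_layer[layer][(row, col)] += 1

for layer in sorted(tables_by_layer):
    print(f"Layer = {layer}")
    for (row, col), count in sorted(tables_by_layer[layer].items()):
        print(f"  {row:8s} {col:4s} {count}")
```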

The STATISTICS button will lead to a dialog box that contains different inferential statistics for finding the association between categorical variables.

The CELLS button will lead to a dialog box that controls which output is displayed in each crosstab cell, such as observed frequency, expected frequency, percentages, residuals, etc., as shown below.

Crosstabs cell display
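The cell quantities mentioned above can be computed by hand. The sketch below, using a hypothetical $2\times 2$ table, shows the usual definitions: the expected count is (row total × column total) / grand total, the unstandardized residual is observed minus expected, and the row percentage is the cell count as a share of its row total. The function name `cell_summary` is an illustrative choice.

```python
# Hypothetical 2 x 2 table of observed counts (rows x columns).
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

def cell_summary(i, j):
    """Observed count, expected count, residual, and row percentage for cell (i, j)."""
    expected = row_totals[i] * col_totals[j] / grand_total
    return {
        "observed": observed[i][j],
        "expected": expected,
        "residual": observed[i][j] - expected,
        "row_pct": 100 * observed[i][j] / row_totals[i],
    }

for i in range(len(observed)):
    for j in range(len(observed[0])):
        print((i, j), cell_summary(i, j))
```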

To perform the Chi-Square test on the selected variables, click the “Statistics” button and tick the “Chi-square” option at the top-left of the dialog box shown below. Note that the Chi-square check box must be ticked; otherwise, only the cross-table will be displayed.

Crosstabs Chi-Square Statistics in SPSS

Press the “Continue” button and then the OK button. The output window will show the cross-tabulation results and the Chi-Square statistics, as shown below.

Crosstabs output SPSS windows

The Chi-Square results indicate an association between the “Shopping Frequency” and “Made Purchase” variables, since the p-value is smaller than the (say) 0.01 level of significance.
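For readers who want to see the arithmetic behind this decision, here is a minimal Python sketch of the Pearson Chi-Square test for an $R\times C$ table, with $(R-1)(C-1)$ degrees of freedom. The counts are hypothetical and are not the values from “satisf.sav”; the function names are illustrative, and the p-value is computed with a standard recurrence for integer degrees of freedom rather than with any SPSS routine.

```python
import math

def chi_square_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if df % 2:   # odd df: start from the closed form for df = 1
        k, q = 1, math.erfc(math.sqrt(x / 2))
    else:        # even df: start from the closed form for df = 2
        k, q = 2, math.exp(-x / 2)
    while k < df:  # raise df by 2 using the incomplete-gamma recurrence
        q += (x / 2) ** (k / 2) * math.exp(-x / 2) / math.gamma(k / 2 + 1)
        k += 2
    return q

def chi_square_test(observed):
    """Pearson chi-square statistic, degrees of freedom, and p-value for an R x C table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, chi_square_sf(stat, df)

# Hypothetical 3 x 2 counts; NOT the values from satisf.sav.
table = [[30, 10], [20, 20], [10, 30]]
stat, df, p_value = chi_square_test(table)

print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.6f}")
print("reject independence at the 0.01 level:", p_value < 0.01)
```

The test assumes that no row or column total is zero; a table with an empty row or column would need to be reduced first.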

Select Cases in SPSS

This post is about Select Cases in SPSS (IBM SPSS Statistics). Sometimes you may be interested in analyzing only a specific part (subset) of the available dataset. For example, you may want descriptive or inferential statistics for males and females separately, for a certain age range, or (say) for non-smokers only. In such cases, one may use Select Cases in SPSS.

Select Cases in SPSS: Step-by-Step Procedure

For illustrative purposes, I am using the “customer_dbase” file available in the SPSS sample data files. I will use the gender variable to select male customers only and present some descriptive statistics for males alone. For this purpose, follow these steps:

Step 1: Go to the Menu bar, select “Data” and then “Select Cases”.

Select Cases in SPSS - 1

Step 2: A new window called “Select Cases” will open.

Use of If statement for Select Cases in SPSS

Step 3: Select the option “If condition is satisfied”, as shown in the figure below.

Select Cases in SPSS - 2

Step 4: Click on the button “If” highlighted in the above picture.

Step 5: A new window called “Select Cases: If” will open.

Select Cases in SPSS - If Dialog box 3

Step 6: The left box of this dialog box contains all the variables from the Data View. Choose the variable (using the left mouse button) on which you want to select cases, and use the “arrow” button to move it to the right box.

Step 7: In this example, the variable gender (for which we want to select only men) is moved from the left box to the right box. In the right box, write “gender=0” (since men are coded 0 in this dataset).

Select Cases in SPSS - with Condition

Step 8: Click on Continue and then the OK button. Now, only men are selected (and the women’s data values are temporarily filtered out from the dataset).
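The logic of this selection can be sketched in Python. The records below are hypothetical (only the coding gender = 0 for men follows the steps above), and the `selected` flag is an illustrative stand-in for a filter: unselected cases are flagged, not deleted, so the full dataset stays intact.

```python
# Hypothetical records; the coding gender = 0 for men follows the steps above.
customers = [
    {"id": 1, "gender": 0, "age": 34},
    {"id": 2, "gender": 1, "age": 28},
    {"id": 3, "gender": 0, "age": 51},
    {"id": 4, "gender": 1, "age": 45},
]

# Flag each case instead of deleting it, so the selection can be undone later.
for case in customers:
    case["selected"] = case["gender"] == 0

# Subsequent statistics use only the flagged cases.
selected = [case for case in customers if case["selected"]]
mean_age = sum(case["age"] for case in selected) / len(selected)

print("selected ids:", [case["id"] for case in selected])
print("mean age of selected cases:", mean_age)
print("cases still in the dataset:", len(customers))
```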

Re-Select Cases in SPSS

Note: To “re-select” all cases (complete dataset), you carry out the following steps:

Step a: Go to the Menu bar, choose “Data” and then “Select Cases”.

Step b: From the dialog box of “Select Cases”, tick the box called “All cases”, and then click on the OK button. 

Select Cases in SPSS - data 5

When you use Select Cases in SPSS, a new variable called “filter_$” is created in the dataset. Deleting this filter variable removes the selection. The unselected cases are shown crossed out in the Data View window.

Select Cases in SPSS - data view 6

Note: The selection will be applied to everything you do from the point you select cases until you remove the selection. In other words, all statistics, tables, and graphs will be based only on the selected individuals until you remove (or change) the selection.

Random Sample of Cases

There are other kinds of selection too: a random sample of cases, a selection based on a time or case range, or the use of an existing filter variable. The selected cases can be copied to a new dataset, or the unselected cases can be deleted. For this purpose, choose the appropriate option from the Output section of the Select Cases dialog box.

Select Cases in SPSS - random selection 7
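As a rough Python analogue of these selection types, the sketch below draws an exact-size random sample, an approximate-percentage random sample, and a case-range selection. The case numbers, the sample sizes, and the seed are assumptions made only so the example is repeatable; they do not reproduce SPSS's random number generator.

```python
import random

# Hypothetical case numbers 1..100; the seed is fixed only to make the run repeatable.
case_ids = list(range(1, 101))
rng = random.Random(2024)

# Exactly 10 cases, drawn without replacement.
exact_sample = sorted(rng.sample(case_ids, k=10))

# Approximately 10% of cases: each case is kept independently with probability 0.10,
# so the resulting sample size varies from run to run.
approx_sample = [cid for cid in case_ids if rng.random() < 0.10]

# Selection based on a case range, here case numbers 20 through 29.
range_sample = [cid for cid in case_ids if 20 <= cid <= 29]

print(exact_sample)
print(len(approx_sample), "cases in the approximate sample")
print(range_sample)
```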

Cronbach’s Alpha Reliability Analysis of Measurement Scales

Cronbach’s Alpha Reliability Analysis

Cronbach’s Alpha Reliability analysis is used to study the properties of measurement scales (Likert scale questionnaire) and the items (questions) that make them up. The reliability analysis method computes several commonly used measures of scale reliability. The reliability analysis also provides information about the relationships between individual items in the scale. The intraclass correlation coefficients can be used to compute the interrater reliability estimates.

Suppose you want to know whether your questionnaire measures customer satisfaction in a useful way. For this purpose, you can use reliability analysis to determine the extent to which the items (questions) in your questionnaire are correlated with each other. An overall index of the reliability or internal consistency of the scale as a whole can be obtained. You can also identify problematic items that should be removed (deleted) from the scale.

As an example, open the data file “satisf.sav”, already available in the SPSS sample files. To check the reliability of Likert scale items, follow the steps given below:

Cronbach's Alpha Reliability
Cronbach's Alpha Reliability Analysis Dialog box

Step 1: On the Menu bar of SPSS, Click Analyze > Scale > Reliability Analysis… option

Step 2: Select two or more variables that you want to test and move them from the left pane to the right pane of the Reliability Analysis dialog box. Note that multiple variables (items) can be selected by holding down the CTRL key and clicking the variables you want. Clicking the arrow button between the left and right panes moves the variables to the Items pane (right pane).

Step 3: Click the “Statistics” button to select additional statistics: Descriptives for (Item, Scale, Scale if item deleted), Summaries (Means, Variances, Covariances, Correlations), Inter-Item (Correlations, Covariances), and ANOVA Table (None, F test, Friedman chi-square, Cochran chi-square).

Reliability Statistics

Click the “Continue” button to save the chosen statistics options. Then click the OK button in the Reliability Analysis dialog box to run the analysis on the selected items. The output will be shown in the SPSS output window.

Reliability Analysis Output

Cronbach’s Alpha ($\alpha$) is about 0.827, which is good. Note that deleting the item “organization satisfaction” would increase the reliability of the remaining items to 0.860.
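To show where such numbers come from, here is a minimal Python sketch of the usual textbook formula $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} s_i^2}{s_T^2}\right)$, where $k$ is the number of items, $s_i^2$ is the sample variance of item $i$, and $s_T^2$ is the sample variance of the respondents' total scores. The responses below are hypothetical (they are not from “satisf.sav”), and the function names are illustrative.

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of columns, one list of scores per item."""
    k = len(items)
    sum_item_variances = sum(statistics.variance(col) for col in items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    return k / (k - 1) * (1 - sum_item_variances / statistics.variance(totals))

def alpha_if_item_deleted(items):
    """Alpha recomputed with each item left out in turn."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]

# Hypothetical 5-point Likert responses: 3 items x 5 respondents.
items = [
    [1, 2, 3, 4, 5],
    [2, 3, 3, 5, 5],
    [1, 3, 2, 5, 4],
]

alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.3f}")
for i, value in enumerate(alpha_if_item_deleted(items), start=1):
    print(f"alpha if item {i} deleted = {value:.3f}")
```

An item whose removal raises alpha noticeably is a candidate for closer inspection, which is the comparison made in the SPSS output above.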

A rule of thumb for interpreting alpha for dichotomous items (questions with only two possible answers) or Likert scale items (questions with 3, 5, 7, 9, etc. response categories) is:

  • If Cronbach’s Alpha is $\ge 0.9$, the internal consistency of scale is Excellent.
  • If Cronbach’s Alpha is $0.90 > \alpha \ge 0.8$, the internal consistency of scale is Good.
  • If Cronbach’s Alpha is $0.80 > \alpha \ge 0.7$, the internal consistency of scale is Acceptable.
  • If Cronbach’s Alpha is $0.70 > \alpha \ge 0.6$, the internal consistency of scale is Questionable.
  • If Cronbach’s Alpha is $0.60 > \alpha \ge 0.5$, the internal consistency of scale is Poor.
  • If Cronbach’s Alpha is $0.50 > \alpha $, the internal consistency of scale is Unacceptable.
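The rule of thumb can be written as a small lookup. This is only a sketch of the thresholds listed above; the function name `interpret_alpha` is a hypothetical choice, and the first two sample values are the alphas reported earlier in this post.

```python
def interpret_alpha(alpha):
    """Map Cronbach's alpha to the rule-of-thumb labels listed above."""
    thresholds = [
        (0.9, "Excellent"),
        (0.8, "Good"),
        (0.7, "Acceptable"),
        (0.6, "Questionable"),
        (0.5, "Poor"),
    ]
    # Thresholds are checked from highest to lowest; the first match wins.
    for lower_bound, label in thresholds:
        if alpha >= lower_bound:
            return label
    return "Unacceptable"

for value in (0.827, 0.860, 0.45):
    print(value, interpret_alpha(value))
```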

However, the rules of thumb listed above should be used with caution since Cronbach’s Alpha reliability is sensitive to the number of items in a scale. A larger number of questions can result in a larger Alpha Reliability, while a smaller number of items may result in smaller $\alpha$.
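One way to see this sensitivity is the standardized form of alpha, $\alpha = \frac{k\bar{r}}{1 + (k-1)\bar{r}}$, where $\bar{r}$ is the mean inter-item correlation. The sketch below holds $\bar{r}$ fixed at an assumed value of 0.3 and varies only the number of items $k$; both the value of $\bar{r}$ and the function name are illustrative assumptions.

```python
def standardized_alpha(k, mean_r):
    """Standardized alpha for k items with mean inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)

# Hypothetical mean inter-item correlation, held fixed while the scale grows.
mean_r = 0.3
for k in (3, 5, 10, 20):
    print(f"k = {k:2d}: alpha = {standardized_alpha(k, mean_r):.3f}")
```

With the same average correlation between items, alpha rises steadily as items are added, so a long scale can reach a high alpha even when its items are only modestly related.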
