How to Convert Continuous Variables in SPSS: A Quick Guide

There are situations in which one may want to convert a continuous variable in SPSS to a categorical one. For example, one may want to find out how many females earn a starting salary of more than 80,000, using data from, say, the University of Florida. To answer this, the numeric salary data needs to be changed into a categorical variable. In SPSS, this type of transformation is called recoding a continuous variable into a categorical variable.

Convert Continuous Variables in SPSS to Categorical

Step-by-Step Procedure

In SPSS there are three basic options for recoding the variables.

  • Recode into different variables
  • Recode into the same variable
  • DO IF syntax

Recode into different variables and DO IF syntax create a new variable without modifying the original variable, while recode into the same variable permanently overwrites the original variable. It is best to recode a variable into a different variable. To recode into different variables,

Click Transform > Recode into different variables


The Recode into different variables dialog box will appear as:

Figure: Recode into Different Variables dialog box (Input Variable -> Output Variable)

The left-side pane of the dialog box lists all of the variables in the dataset. Select the variable you want to recode and move it to the right-side pane by clicking the arrow button between the two panes. Here, we will transform the salary variable.

  • Input Variable -> Output
    The center text box lists the selected variable(s). In this case, we have only the salary variable.
  • Output Variable
    Define the name and label (the label is optional) for your recoded variable(s) by typing them in the text fields. For example, name the recoded variable “new_salary” (SPSS variable names cannot contain hyphens or spaces) and then click Change.
  • Old and New Values
    Click the “Old and New Values” button to specify the categories of the selected variable. A new dialog box will appear, where one specifies how the values are to be transformed.
Figure: Old and New Values dialog box

Old Values and New Values

The “Old -> New” box lists the recoding rules that have been added, each showing an old value (or range of old values) and the new value assigned to it. For example, a rule may assign the new value 1 to the range 20000 through the highest value.

A short description of the “Old Value” options:

  • Value:
    Enter a single existing value that should be recoded. For example, enter 1 to recode the cases currently coded 1.
  • System Missing:
    Applies to any system-missing value (shown as a period, .).
  • Range ... through ...:
    This option is used to enter the lower and upper limits of the values that should be recoded. The recoded category includes both limits (inclusive). For example, 20000 through 40000.
  • Range, LOWEST through value:
    Recode all values less than or equal to some number.
  • Range, value through HIGHEST:
    Recode all values greater than or equal to some number.
  • All Other Values:
    Applies to any value not explicitly accounted for by the previous recoding rules.
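To make the rule types above concrete, here is a minimal Python sketch of how such a set of rules could be applied to one value. It is an illustration, not SPSS itself: the cut points, the code 9 for system-missing values, and the function name `recode` are all assumptions, and the sketch assumes that rules are checked in order with the first match winning.

```python
import math

# Each rule is (low, high, new_value); both limits are inclusive.
# -math.inf and math.inf stand in for LOWEST and HIGHEST.
# The cut points below are illustrative assumptions.
RULES = [
    (-math.inf, 19999, 1),   # Range, LOWEST through value
    (20000, 40000, 2),       # Range ... through ...
    (40001, math.inf, 3),    # Range, value through HIGHEST
]
SYSTEM_MISSING_CODE = 9      # hypothetical code for system-missing values
ELSE_CODE = None             # All Other Values: left missing in this sketch


def recode(old_value):
    """Return the new category code for one old value."""
    if old_value is None:                  # treat None as system-missing
        return SYSTEM_MISSING_CODE
    for low, high, new_value in RULES:     # first matching rule wins
        if low <= old_value <= high:
            return new_value
    return ELSE_CODE                       # value matched no explicit rule


print(recode(20000), recode(40000), recode(19999.5))
```

Because the limits are inclusive, both 20000 and 40000 fall into category 2, while a value such as 19999.5 matches none of these integer ranges and falls through to the "all other values" rule.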

A short description of the “Old -> New” option:

Enter the required group/category numerical code in the “New Value” field and then click the Add button. Repeat this step for each group of values that you wish to recode. Each rule appears in the “Old -> New” box as it is added. Finally, click the Continue button, and then click OK to transform the continuous variable into a categorical variable.
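The whole procedure can be mirrored outside SPSS in a few lines of Python. The sketch below uses made-up records; the field names `gender`, `salary`, and `new_salary`, the data values, and the single cut point at 80,000 are assumptions chosen to match the motivating question about starting salaries.

```python
# Hypothetical records; the field names and values are assumptions.
records = [
    {"gender": "F", "salary": 85000},
    {"gender": "M", "salary": 92000},
    {"gender": "F", "salary": 61000},
    {"gender": "F", "salary": 80000},
    {"gender": "F", "salary": 120000},
]


def salary_group(salary):
    """Return 1 for salaries up to 80,000 and 2 for salaries above 80,000."""
    return 2 if salary > 80000 else 1


# Recode into a different variable: add new_salary, leave salary unchanged.
for row in records:
    row["new_salary"] = salary_group(row["salary"])

# Count females in the "more than 80,000" category.
high_paid_females = sum(
    1 for row in records if row["gender"] == "F" and row["new_salary"] == 2
)
print(high_paid_females)
```

Keeping the original `salary` field alongside the new one is the same design choice as recoding into a different variable: the cut point can be changed later without losing the raw data.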

https://gmstat.com, https://rfaqs.com

MCQs Data and Variable 14

This post is about MCQs on data and variables. There are 20 multiple-choice questions related to variables, data, population, sample, and types of variables. Let us start with the MCQs Data and Variable with Answers.

Online Multiple choice questions about Variable and Data with Answers

1. A data set is a:

2. In statistics, a sample means:

3. A statistician wants to determine the total annual medical costs incurred by all districts of Pakistan from 1981 to 2001 as a result of health problems related to smoking. He polls each of the districts annually to obtain health care expenditures, in dollars, on smoking-related illnesses. Which one of the following is not a true statement?

4. Cross-section data are collected:

5. Which one of the following is an example of qualitative data?

6. Which one of the following is an example of cross-section data?

7. A scientist is experimenting to determine the relationship between the consumption of a certain type of food and high blood pressure. He conducts a random sample on 2,000 people and first asks them a “yes” or “no” question: Do you eat this type of food more than once a week? He also takes the blood pressure of each person and records it (for example: 120/80). Which one of the following statements is true?

8. What tasks are involved in data cleaning? Select all that apply

9. In statistics, a population consists of:

10. What is the main objective of data cleaning?

11. An observation is the:

12. Variables whose measurement is done in terms such as weight, height, and length are classified as:

13. A qualitative variable is the one that:

14. A variable is a:

15. When data are collected in a statistical study for only a portion or subset of all elements of interest we are using:

16. In statistics, conducting a survey means:

17. Time-series data are collected:

18. A quantitative variable is one that can:

19. Which one of the following is a continuous variable?

20. Government and non-government publications are considered as:


Critical Values and Rejection Region

In the statistical hypothesis testing procedure, an important step is to determine whether to reject the null hypothesis. This step requires finding the critical values and the rejection region.

Rejection Region and Critical Values

A rejection region for a hypothesis test is the range of values of the standardized test statistic that would lead us to reject the null hypothesis. The critical values for a hypothesis test are the z-scores that separate the rejection region(s) from the non-rejection region (also called the acceptance region of $H_0$). The critical values will be denoted by $Z_0$.

The rejection region for a test is determined by the type of test (left-tailed, right-tailed, or two-tailed) and the level of significance (denoted by $\alpha$) for the test. For a left-tailed test, the rejection region lies in the left tail of the normal distribution. For a right-tailed test, it lies in the right tail. For a two-tailed test, there are two equal rejection regions, one in each tail.
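As a rough illustration of how the type of test and $\alpha$ determine the critical values, the following Python sketch looks them up from the standard normal distribution using the standard library's `statistics.NormalDist`. The function name `critical_values` and the tail labels are assumptions made for this example.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return the critical z-value(s) for a left-, right-, or two-tailed test."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "left":
        return (z.inv_cdf(alpha),)
    if tail == "right":
        return (z.inv_cdf(1 - alpha),)
    if tail == "two":
        upper = z.inv_cdf(1 - alpha / 2)  # alpha is split equally between tails
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right', or 'two'")


print([round(v, 3) for v in critical_values(0.05, "two")])
print([round(v, 3) for v in critical_values(0.05, "left")])
```

For a two-tailed test, the area $\alpha$ is divided equally between the two tails, which is why the lookup uses $1-\frac{\alpha}{2}$ rather than $1-\alpha$.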

Figure: Hypothesis testing tails, critical values, and rejection region

Once the critical values and rejection region are established, the null hypothesis is rejected if the standardized test statistic for a sample data set falls in the rejection region.
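The decision rule can be written as a small Python function. This is only a sketch: the function name `reject_null`, the tail labels, and the test statistics used in the calls are hypothetical values chosen for illustration.

```python
def reject_null(z_stat, z_critical, tail):
    """Return True when the test statistic falls in the rejection region."""
    if tail == "left":
        return z_stat <= z_critical            # region: Z <= Z_0
    if tail == "right":
        return z_stat >= z_critical            # region: Z >= Z_0
    if tail == "two":
        return abs(z_stat) >= abs(z_critical)  # region: |Z| >= |Z_0|
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Hypothetical test statistics, for illustration only.
print(reject_null(-1.20, -1.645, "left"))
print(reject_null(2.10, 1.96, "two"))
```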

Examples: Critical Values and Rejection Region

Example 1: A university claims that the average SAT score for its incoming freshmen is 1080. A sample of 56 freshmen at the university is drawn and the average SAT score is found to be $\overline{x}=1044$ with a sample standard deviation of $s=94.7$ points.

    In the above SAT example, we test $H_0: \mu = 1080$ against $H_1: \mu \ne 1080$, so the test is two-tailed and the rejection region consists of the two tails at either end of the normal distribution. If we want $\alpha=0.05$, then the area under the curve in both rejection regions together should be 0.05. For this purpose, we look up $\frac{\alpha}{2}=0.025$ in the standard normal table to get critical values of $Z_0 = \pm 1.96$. The rejection region thus consists of $Z \le -1.96$ and $Z\ge 1.96$. Since the standardized test statistic $Z=-2.85$ falls in the rejection region, the university’s claim of $\mu = 1080$ would be rejected in this case.

    Example 2: Consider a left-tailed Z test. For a 0.05 level of significance, the rejection region consists of the values in the lowest 5% of the standard normal distribution (the lowest 5% of the area under the normal curve). The corresponding Z-score is $-1.645$, so the critical value is $Z_0 = -1.645$ and the rejection region is $Z\le -1.645$.

    Note that for a right-tailed test at the same level of significance, the rejection region consists of the values in the highest 5% of the standard normal distribution. The critical value is $Z_0 = 1.645$ and the rejection region is $Z\ge 1.645$.
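The standardized test statistic for Example 1 can be checked with a short Python calculation. The sketch assumes the usual one-sample formula $Z = \frac{\overline{x}-\mu_0}{s/\sqrt{n}}$, which the example does not state explicitly; the variable names are chosen for this illustration.

```python
import math
from statistics import NormalDist

mu_0 = 1080    # mean claimed by the university
x_bar = 1044   # sample mean
s = 94.7       # sample standard deviation
n = 56         # sample size
alpha = 0.05

standard_error = s / math.sqrt(n)
z = (x_bar - mu_0) / standard_error             # about -2.845

# Two-tailed test: compare |z| with the upper critical value (about 1.96).
z_critical = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) >= z_critical

print(round(z, 3), round(z_critical, 2), reject)
```

The computed value is in line with the test statistic quoted in the example and lies well beyond the lower critical value, so the sketch reaches the same decision as the text.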


    Exercise: Critical Values and Rejection Region

    1. Find the critical values and rejection region(s) for the standardized Z-test of the following:
    • A right-tailed test with $\alpha = 0.05$
    • A left-tailed test with $\alpha = 0.01$
    • A two-tailed test with $\alpha = 0.10$
    • A right-tailed test with $\alpha = 0.02$
    2. Mercury levels in fish are considered dangerous to people if they exceed 0.5mg mercury per kilogram of meat. A sample of 50 tuna is collected, and the mean level of mercury in these 50 fish is 0.6mg/kg, with a standard deviation of 0.2mg/kg. A health warning will be issued if the claim that the mean exceeds 0.5mg/kg can be supported at the $\alpha=0.10$ level of significance. Determine the null and alternative hypotheses in this case, the type of the test, the critical value(s), and the rejection region. Find the standardized test statistic for the information given in the exercise. Should the health warning be issued?
