SAS Functions and Procedures

Discover the key differences between SAS functions and procedures, when to use SUM() versus the ‘+’ operator, and how the INPUT and INFILE statements differ. Learn with clear examples and practical use cases for efficient data analysis. Perfect for SAS beginners and professionals!

SAS Functions and Procedures Questions And Answers

What is the difference between SAS Functions and Procedures?

SAS functions and procedures (PROCs) serve different purposes and operate in distinct ways. The key differences are as follows:

SAS Functions

SAS functions perform computations or transformations on individual values, usually within a DATA step. They (i) operate on single values or variables, (ii) return a single result for each function call, and (iii) are typically used in assignment statements or expressions.

## SAS Functions Example

data example;
    x = SUM(10, 20, 30);  /* Returns 60 */
    y = UPCASE('hello');  /* Returns 'HELLO' */
    z = SUBSTR('SAS Programming', 1, 3);  /* Returns 'SAS' */
run;

The following are some important types of SAS Functions:

  • Numeric Functions (e.g., SUM(), MEAN(), ROUND())
  • Character Functions (e.g., UPCASE(), SUBSTR(), TRIM())
  • Date/Time Functions (e.g., TODAY(), MDY(), INTCK())
  • Random-Number Functions (e.g., NORMAL(), RANUNI())
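
As a rough illustration of these categories, the following sketch calls one or two functions from each group in a single DATA step. The dataset name work.function_types, the variable names, and the literal values are made up for the example.

```sas
data work.function_types;
    /* Numeric: round to the nearest hundredth */
    rounded = round(3.14159, 0.01);                     /* 3.14 */

    /* Character: TRIM removes trailing blanks before concatenation */
    padded  = 'SAS   ';
    trimmed = trim(padded) || '!';                      /* 'SAS!' */

    /* Date/time: MDY builds a SAS date; INTCK counts interval boundaries */
    start_date = mdy(1, 15, 2024);
    end_date   = mdy(4, 15, 2024);
    months     = intck('month', start_date, end_date);  /* 3 */

    /* Random number: RANUNI returns a uniform value between 0 and 1 */
    u = ranuni(12345);                                  /* seed chosen arbitrarily */

    format start_date end_date date9.;
run;

proc print data=work.function_types;
run;
```

Each call returns exactly one value, which is assigned to one variable in the single observation that the step writes.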

SAS Procedures (PROCs)

SAS procedures, or PROCs, perform data manipulation, analysis, or reporting on entire datasets. PROCs (i) operate on entire datasets (not just single values), (ii) generate tables, reports, graphs, or statistical analyses, and (iii) execute in a PROC step, not a DATA step.

## SAS Procedures (PROCs) Examples

proc means data=sashelp.class;  /* Computes summary statistics */
    var age height weight;
run;

proc sort data=sashelp.class out=work.class_sorted;  /* Sorts into a WORK copy; SASHELP is typically read-only */
    by descending age;
run;

proc freq data=sashelp.class;  /* Generates frequency tables */
    tables sex age;
run;

The types of SAS Procedures are:

  • Data Management PROCs (e.g., PROC SORT, PROC TRANSPOSE)
  • Statistical PROCs (e.g., PROC MEANS, PROC REG, PROC ANOVA)
  • Reporting PROCs (e.g., PROC PRINT, PROC TABULATE, PROC REPORT)
  • Graphical PROCs (e.g., PROC SGPLOT, PROC GCHART)
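
Besides printed reports, many procedures can also write their results to a dataset. The sketch below uses PROC MEANS with an OUTPUT statement for this; the output dataset name work.class_stats and the variable names mean_height and mean_weight are chosen only for illustration.

```sas
/* Summarize height and weight by sex; NOPRINT suppresses the report */
proc means data=sashelp.class noprint;
    class sex;
    var height weight;
    output out=work.class_stats mean=mean_height mean_weight;
run;

/* Inspect the summary dataset produced by the previous step */
proc print data=work.class_stats noobs;
run;
```

The whole dataset is processed in one PROC step, in contrast to a function call, which handles one value at a time.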

What are the key differences between SAS Functions and SAS Procedures?

The following are the key differences between SAS Functions and SAS Procedures:

| Feature | SAS Functions | SAS Procedures (PROCs) |
|---|---|---|
| Operation | Work on single values/variables | Work on entire datasets |
| Execution | Used in DATA steps | Used in PROC steps |
| Output | Returns a single value | Generates reports, tables, or datasets |
| Examples | SUM(), UPCASE(), SUBSTR() | PROC MEANS, PROC SORT, PROC FREQ |
| Usage Context | Calculations within a variable | Dataset processing & analysis |

When should you use SAS Functions or SAS PROCs?

  • Use Functions when you need to transform or compute values within a DATA step.
  • Use Procedures when you need to analyze, summarize, or manipulate entire datasets.
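
The two are often combined: a function derives a new value for each row, and a procedure then summarizes the result. The following is a minimal sketch, assuming that sashelp.class stores height in inches and weight in pounds; the dataset work.class_bmi and the variable bmi are hypothetical names.

```sas
/* DATA step: the ROUND function is applied to each observation */
data work.class_bmi;
    set sashelp.class;
    bmi = round(703 * weight / height**2, 0.1);  /* assumes inches and pounds */
run;

/* PROC step: summarize the derived variable across the whole dataset */
proc means data=work.class_bmi mean min max;
    var bmi;
run;
```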

What is the Difference Between the “Sum” Function and using the “+” Operator in SAS?

In SAS, both the SUM function and the + operator perform addition, but they differ in syntax and in how they handle missing values. The differences are as follows:

SUM Function (SUM())

The SUM function adds its arguments while ignoring missing values (.). Its general syntax is:

sum(var1, var2, var3, ...)

If at least one argument is non-missing, SUM() returns the sum of the non-missing values. If all arguments are missing, the result is missing (.). The SUM() function is best for:

  • Summing multiple variables where some may have missing values.
  • Avoiding unintended missing results due to missing data.

## SAS SUM() Function Example

data example;
    a = 10;
    b = .;  /* missing */
    c = 30;
    sum_result = sum(a, b, c);  /* 10 + 30 = 40 (ignores missing) */
run;

+ Operator

The ‘+’ operator performs arithmetic addition but propagates missing values. Its general syntax is:

var1 + var2 + var3

The behaviour of ‘+’ is:

  • If any operand is missing, the result is missing (.), and SAS writes a note to the log that missing values were generated.
  • A non-missing result is returned only when all operands are non-missing.

The ‘+’ operator is best for:

  • Cases where a missing input should make the result missing (e.g., strict calculations).

## + Operator Example

data example;
    a = 10;
    b = .;  /* missing */
    c = 30;
    plus_result = a + b + c;  /* 10 + . + 30 = . (missing) */
run;

What are the Key Differences between the SUM() Function and the ‘+’ Operator in SAS?

| Feature | SUM Function (SUM()) | + Operator |
|---|---|---|
| Handling Missing Values | Ignores missing values (sum(10, .) = 10) | Returns missing if any value is missing (10 + . = .) |
| Syntax | sum(a, b, c) | a + b + c |
| Use Case | Summing variables where some may be missing | Strict arithmetic (missing = invalid) |
| Performance | May be marginally slower (function call); the difference is usually negligible | Direct operation |
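
The two approaches can be compared side by side in a single DATA step. The sketch below also covers the all-missing case and shows a common idiom, adding a literal 0 as an argument, for getting 0 instead of missing; the dataset and variable names are illustrative only.

```sas
data work.compare;
    a = 10;
    b = .;   /* missing */
    c = 30;
    d = .;   /* missing */

    sum_result   = sum(a, b, c);   /* 40: missing b is ignored */
    plus_result  = a + b + c;      /* . : missing b propagates */
    all_missing  = sum(b, d);      /* . : every argument is missing */
    zero_default = sum(0, b, d);   /* 0 : the literal 0 is non-missing */
run;

proc print data=work.compare;
run;
```

Whether a total of 0 or a missing total is the right outcome for all-missing inputs depends on the analysis, so the zero-default idiom is a choice rather than a rule.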

When to Use the SUM() Function and ‘+’ Operator in SAS?

  • Use SUM() when:
    • You want to ignore missing values (e.g., calculating totals where some data is missing).
    • Example: total = sum(sales1, sales2, sales3);
  • Use + when:
    • Missing values should make the result missing (e.g., strict calculations where all inputs must be valid).
    • Example: net_pay = salary + bonus; (if bonus is missing, net_pay should also be missing).
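
When the variables share a numbered prefix, SUM() can take a variable list with the OF keyword. The following sketch applies both approaches to a few hypothetical rows; the dataset work.totals and the sales values are invented for the example.

```sas
data work.totals;
    input sales1 sales2 sales3;

    /* OF is required: sum(sales1-sales3) would subtract sales3 from sales1 */
    total_sum  = sum(of sales1-sales3);

    /* Strict version: any missing input makes the total missing */
    total_plus = sales1 + sales2 + sales3;
    datalines;
100 200 300
100 . 300
. . .
;
run;

proc print data=work.totals;
run;
```

In the second row only total_sum is non-missing (400); in the third row both totals are missing because no input is available.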

What is the difference between the INPUT and INFILE statements?

In SAS, both the INPUT and INFILE statements are used to read data, but they serve different purposes and are often used together. Here’s a breakdown of their differences:

INFILE Statement

The INFILE statement specifies the source from which raw data is read. It:

  • Identifies the external file (e.g., .txt, .csv, .dat) to be read.
  • Accepts options that control how the data is read (e.g., delimiters, missing values, encoding).

The general Syntax of the INFILE Statement in SAS is:

INFILE "file-path" <options>;

The Key Options of the INFILE Statement are:

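  • DLM=',' (specifies the delimiter, e.g., a comma for CSV files)
  • DSD (treats two consecutive delimiters as a missing value, removes quotation marks from quoted values, and keeps delimiters that appear inside quotes)
  • FIRSTOBS=2 (starts reading at line 2, e.g., to skip a header row)
  • MISSOVER (when a line has fewer values than the INPUT statement expects, sets the remaining variables to missing instead of reading from the next line)

## INFILE Statement Example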
    DATA sample;
      INFILE "/path/to/data.csv" DLM=',' DSD FIRSTOBS=2;
      INPUT name $ age salary;
    RUN;
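
To try these options without an external file, INFILE can point at in-stream data by using the DATALINES keyword as the file reference. The following is a sketch with invented records; the dataset name work.staff and the :$20. informat for name are assumptions made for the example.

```sas
data work.staff;
    /* Same options as above, applied to the in-stream lines below */
    infile datalines dlm=',' dsd firstobs=2 missover;
    input name :$20. age salary;
    datalines;
name,age,salary
"Smith, Ann",34,52000
Raj,,48000
Lee,29
;
run;

proc print data=work.staff;
run;
```

The header line is skipped by FIRSTOBS=2, DSD keeps the comma inside the quoted name and reads the empty field in the third line as a missing age, and MISSOVER leaves salary missing for the last, short record.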

INPUT Statement

The INPUT statement defines how SAS reads raw data (variable names, types, and informats). It:

  • Maps raw data to SAS variables (numeric or character).
  • Specifies the layout of the data (column positions, delimiters, or informats).

The general syntax of the INPUT statement is:

INPUT variable1 $ variable2 variable3 ...;

The main input styles are:

  • List Input (values separated by blanks by default, or by another delimiter set with DLM= on the INFILE statement): INPUT name $ age salary;
  • Column Input (fixed columns): INPUT name $ 1-10 age 11-13 salary 14-20;
  • Formatted Input (informats that give each field's width and type): INPUT name $10. age 2. salary 8.2;

## INPUT Statement Example

    DATA sample;
      INFILE "/path/to/data.txt";
      INPUT name $ age salary;
    RUN;
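
The three input styles can be tried on the same fixed-width records using in-stream data. This is a sketch with made-up values; the dataset names are hypothetical, and the informat widths are chosen to match the column layout of these particular lines.

```sas
/* List input: values are separated by one or more blanks */
data work.list_style;
    input name $ age salary;
    datalines;
Alice     34   52000
Bob       29   48000
;
run;

/* Column input: each variable is read from fixed column positions */
data work.column_style;
    input name $ 1-10 age 11-13 salary 14-20;
    datalines;
Alice     34   52000
Bob       29   48000
;
run;

/* Formatted input: informats give the width of each field in order.
   salary uses 7. rather than a w.d informat such as 8.2, because a
   d value divides data that has no decimal point by 10**d. */
data work.formatted_style;
    input name $10. age 3. salary 7.;
    datalines;
Alice     34   52000
Bob       29   48000
;
run;
```

All three steps should produce the same two observations; the styles differ in how they locate each field, which matters once values contain embedded blanks or are not aligned in columns.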