Discover key differences between SAS functions and procedures, when to use SUM() vs. ‘+’ operator, and INPUT vs. INFILE statements in SAS Software. Learn with clear examples and practical use cases for efficient data analysis. Perfect for SAS beginners and professionals!
What is the difference between SAS Functions and Procedures?
SAS functions and procedures (PROCs) serve different purposes and operate in distinct ways. The key differences are broken down below.
SAS Functions
SAS functions perform computations or transformations on individual values, usually within a DATA step. They (i) operate on single values or variables, (ii) return a single result for each function call, and (iii) are often used in assignment statements or expressions.
## SAS Functions Example
data example;
x = SUM(10, 20, 30); /* Returns 60 */
y = UPCASE('hello'); /* Returns 'HELLO' */
z = SUBSTR('SAS Programming', 1, 3); /* Returns 'SAS' */
run;
The following are some important types of SAS Functions:
- Numeric Functions (e.g., SUM(), MEAN(), ROUND())
- Character Functions (e.g., UPCASE(), SUBSTR(), TRIM())
- Date/Time Functions (e.g., TODAY(), MDY(), INTCK())
- Random-Number Functions (e.g., NORMAL(), RANUNI())
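As a rough sketch of the categories above, the DATA step below calls one or two functions from each group. The dataset name `function_types` and the variable names are made up for illustration; only base SAS functions are used.

```sas
/* Illustrative sketch: dataset and variable names are hypothetical */
data function_types;
    rounded    = round(3.14159, 0.01);      /* numeric: 3.14 */
    trimmed    = trim('SAS   ') || '!';     /* character: 'SAS!' (trailing blanks removed) */
    start_date = mdy(1, 15, 2024);          /* date: SAS date value for 15JAN2024 */
    months     = intck('month', start_date, mdy(4, 15, 2024)); /* 3 month boundaries crossed */
    rand_u     = ranuni(12345);             /* random number between 0 and 1, seed 12345 */
    format start_date date9.;               /* display the date value readably */
run;
```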
SAS Procedures (PROCs)
SAS procedures, or PROCs, perform data manipulation, analysis, or reporting on entire datasets. PROCs (i) operate on entire datasets (not just single values), (ii) generate tables, reports, graphs, or statistical analyses, and (iii) execute in a PROC step, not a DATA step.
## SAS Procedures (PROCs) Examples
proc means data=sashelp.class; /* Computes summary statistics */
var age height weight;
run;
proc sort data=sashelp.class out=work.class_sorted; /* Sorts into a new dataset; SASHELP is usually read-only */
by descending age;
run;
proc freq data=sashelp.class; /* Generates frequency tables */
tables sex age;
run;
The types of SAS Procedures are:
- Data Management PROCs (e.g., PROC SORT, PROC TRANSPOSE)
- Statistical PROCs (e.g., PROC MEANS, PROC REG, PROC ANOVA)
- Reporting PROCs (e.g., PROC PRINT, PROC TABULATE, PROC REPORT)
- Graphical PROCs (e.g., PROC SGPLOT, PROC GCHART)
What are the key differences between SAS Functions and SAS Procedures?
The following are the key differences between SAS Functions and SAS Procedures:
Feature | SAS Functions | SAS Procedures (PROCs) |
---|---|---|
Operation | Work on single values/variables | Work on entire datasets |
Execution | Used in DATA steps | Used in PROC steps |
Output | Returns a single value | Generates reports, tables, or datasets |
Examples | SUM() , UPCASE() , SUBSTR() | PROC MEANS , PROC SORT , PROC FREQ |
Usage Context | Calculations on values within an observation | Dataset processing & analysis |
When to use SAS Functions or SAS PROCs?
- Use Functions when you need to transform or compute values within a DATA step.
- Use Procedures when you need to analyze, summarize, or manipulate entire datasets.
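The contrast can be sketched with the SASHELP.CLASS sample dataset. The output dataset name `class_avg` and the variable `avg_measure` are hypothetical, and averaging height with weight has no real meaning; it only shows the direction each tool works in.

```sas
/* Function: MEAN() works across variables within each observation */
data class_avg;
    set sashelp.class;
    avg_measure = mean(height, weight);   /* one result per row */
run;

/* Procedure: PROC MEANS works down the rows of the whole dataset */
proc means data=sashelp.class mean;
    var height weight;                    /* one mean per variable */
run;
```

The DATA step adds a new column with one value per student, while PROC MEANS prints a single summary row per analysis variable.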
What is the Difference Between the “Sum” Function and using the “+” Operator in SAS?
In SAS, both the SUM function and the + operator can be used to perform addition, but they differ in syntax and in how they handle missing values. The differences between the SUM function and the + operator are broken down below:
SUM Function (SUM())
The SUM function adds values while ignoring missing values (.). Its general syntax is
sum(var1, var2, var3, ...)
If any argument is non-missing, SUM() returns the sum of the non-missing values. If all arguments are missing, the result is missing (.). The SUM() function is best for:
- Summing multiple variables where some may have missing values.
- Avoiding unintended missing results due to missing data.
## SAS SUM() Function Example
data example;
a = 10;
b = .; /* missing */
c = 30;
sum_result = sum(a, b, c); /* 10 + 30 = 40 (ignores missing) */
run;
+ Operator
The ‘+’ operator performs arithmetic addition but propagates missing values. The general syntax of the ‘+’ operator in SAS is
var1 + var2 + var3
The behaviour of ‘+’ is:
- If any operand is missing, the result is missing (.).
- The result is non-missing only when all operands are non-missing.
The ‘+’ operator is best for:
- Cases where missing values should make the result missing (e.g., strict calculations).
## + Operator Example
data example;
a = 10;
b = .; /* missing */
c = 30;
plus_result = a + b + c; /* 10 + . + 30 = . (missing) */
run;
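Putting the two approaches side by side in one DATA step makes the difference easier to see. This is a minimal sketch with hypothetical dataset and variable names; it also shows the all-missing case and the common idiom of adding a constant 0 so that SUM() never returns missing.

```sas
/* Illustrative sketch: dataset and variable names are hypothetical */
data compare_add;
    a = 10; b = .; c = 30;
    sum_result  = sum(a, b, c);     /* 40: missing b is ignored */
    plus_result = a + b + c;        /* .: missing b propagates */

    x = .; y = .;
    all_missing = sum(x, y);        /* .: every argument is missing */
    zero_floor  = sum(0, x, y);     /* 0: the constant 0 guarantees a non-missing result */
run;

proc print data=compare_add;
run;
```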
What are the Key Differences between the SUM() Function and the ‘+’ Operator in SAS?
Feature | SUM Function (SUM() ) | + Operator |
---|---|---|
Handling Missing Values | Ignores missing values (sum(10, .) = 10 ) | Returns missing if any value is missing (10 + . = . ) |
Syntax | sum(a, b, c) | a + b + c |
Use Case | Summing variables where some may be missing | Strict arithmetic (missing = invalid) |
Performance | Slightly slower (function call) | Faster (direct operation) |
When to Use the SUM() Function and ‘+’ Operator in SAS?
- Use SUM() when you want to ignore missing values (e.g., calculating totals where some data is missing).
  - Example: total = sum(sales1, sales2, sales3);
- Use + when missing values should make the result missing (e.g., strict calculations where all inputs must be valid).
  - Example: net_pay = salary + bonus; (if bonus is missing, net_pay should also be missing).
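Both recommendations can be tried on a small inline dataset. The dataset name `pay` and the data values below are invented for illustration; `sum(of sales1-sales3)` is the variable-list shorthand for `sum(sales1, sales2, sales3)`.

```sas
/* Illustrative sketch: dataset name and data values are made up */
data pay;
    input salary bonus sales1 sales2 sales3;
    total   = sum(of sales1-sales3);  /* ignores missing sales figures */
    net_pay = salary + bonus;         /* missing when salary or bonus is missing */
    datalines;
5000 500 100 . 300
4000 . 200 250 .
;
run;
```

In the first record `total` is 400 and `net_pay` is 5500; in the second `total` is 450 and `net_pay` is missing because `bonus` is missing.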
What is the difference between the INPUT and INFILE statements?
In SAS, both the INPUT and INFILE statements are used to read data, but they serve different purposes and are often used together. Here’s a breakdown of their differences:
INFILE Statement
The INFILE statement in SAS specifies the source file from which data is read. It:
- Defines the external file (e.g., .txt, .csv, .dat) to be read.
- Can include options that control how data is read (e.g., delimiters, missing values, encoding).
The general Syntax of the INFILE Statement in SAS is:
INFILE "file-path" <options>;
The Key Options of the INFILE Statement are:
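- DLM=',' (specifies the delimiter, e.g., a comma for CSV files)
- DSD (removes quotes around values, keeps delimiters that appear inside quoted values, and treats two consecutive delimiters as a missing value)
- FIRSTOBS=2 (starts reading at line 2, e.g., to skip a header row)
- MISSOVER (sets the remaining variables to missing instead of moving to the next line when a record runs short)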
## INFILE Statement Example
DATA sample;
INFILE "/path/to/data.csv" DLM=',' DSD FIRSTOBS=2;
INPUT name $ age salary;
RUN;
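To see what these options do without an external file, the sketch below reads inline data with INFILE DATALINES. The dataset name `sample_inline` and the records are invented for illustration; the `:$20.` modifier is added so that names longer than 8 characters are not truncated.

```sas
/* Illustrative sketch: dataset name and records are made up */
data sample_inline;
    infile datalines dlm=',' dsd missover;
    input name :$20. age salary;
    datalines;
"Smith, John",34,52000
Alice,,48000
Bob,29
;
run;
```

With DSD, the quoted comma in the first record stays inside `name`, and the two consecutive commas in the second record give a missing `age`. With MISSOVER, the short third record gets a missing `salary` instead of pulling a value from the next line.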
INPUT Statement
The INPUT statement defines how SAS reads raw data (variable names, types, and informats). It:
- Maps raw data to SAS variables (numeric or character).
- Specifies the layout of the data (column positions, delimiters, or informats).
The general Syntax of the INPUT Statement is
INPUT variable1 $ variable2 variable3 ...;
The types of Input Styles are:
- List Input (space/comma-delimited):
INPUT name $ age salary;
- Column Input (fixed columns):
INPUT name $ 1-10 age 11-13 salary 14-20;
- Formatted Input (informats give each field's width and type):
INPUT name $10. age 2. salary 8.2;
## INPUT Statement Example
DATA sample;
INFILE "/path/to/data.txt";
INPUT name $ age salary;
RUN;
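Column and formatted input can be compared on the same fixed-width records. This is a minimal sketch with made-up dataset names and data, laid out so that name occupies columns 1-10, age columns 11-13, and salary columns 14-20. The formatted version uses `7.` rather than `8.2` because a w.d informat divides values that contain no decimal point by 10 to the power d.

```sas
/* Illustrative sketch: dataset names and records are made up */

/* Column input: each variable is read from an explicit column range */
data column_style;
    input name $ 1-10 age 11-13 salary 14-20;
    datalines;
John Smith 34  52000
Alice      29  48000
;
run;

/* Formatted input: each informat gives the width of the next field */
data formatted_style;
    input name $10. age 3. salary 7.;
    datalines;
John Smith 34  52000
Alice      29  48000
;
run;
```

Both steps should produce the same two observations, including the embedded blank in "John Smith", which default list input would treat as a delimiter.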