SAS Functions and Procedures

Discover key differences between SAS functions and procedures, when to use SUM() vs. ‘+’ operator, and INPUT vs. INFILE statements in SAS Software. Learn with clear examples and practical use cases for efficient data analysis. Perfect for SAS beginners and professionals!

SAS Functions and Procedures Questions And Answers

What is the difference between SAS Functions and Procedures?

SAS functions and procedures (PROCs) serve different purposes and operate in distinct ways. Their key differences are broken down below.

SAS Functions

SAS functions perform computations or transformations on individual values, usually within a DATA step. A function (i) operates on single values or variables, (ii) returns a single result for each call, and (iii) is typically used in an assignment statement or expression.

## SAS Functions Example

data example;
    x = SUM(10, 20, 30);  /* Returns 60 */
    y = UPCASE('hello');  /* Returns 'HELLO' */
    z = SUBSTR('SAS Programming', 1, 3);  /* Returns 'SAS' */
run;

The following are some important types of SAS Functions:

  • Numeric Functions (e.g., SUM(), MEAN(), ROUND())
  • Character Functions (e.g., UPCASE(), SUBSTR(), TRIM())
  • Date/Time Functions (e.g., TODAY(), MDY(), INTCK())
  • Random Number Functions (e.g., RANNOR(), RANUNI())

SAS Procedures (PROCs)

SAS procedures, or PROCs, perform data manipulation, analysis, or reporting on entire datasets. A PROC (i) operates on a whole dataset rather than on single values, (ii) generates tables, reports, graphs, or statistical analyses, and (iii) executes in a PROC step, not a DATA step.

## SAS Procedures (PROCs) Examples

proc means data=sashelp.class;  /* Computes summary statistics */
    var age height weight;
run;

proc sort data=sashelp.class out=work.class_sorted;  /* Sorts into a new dataset; SASHELP is normally read-only */
    by descending age;
run;

proc freq data=sashelp.class;  /* Generates frequency tables */
    tables sex age;
run;

The types of SAS Procedures are:

  • Data Management PROCs (e.g., PROC SORT, PROC TRANSPOSE)
  • Statistical PROCs (e.g., PROC MEANS, PROC REG, PROC ANOVA)
  • Reporting PROCs (e.g., PROC PRINT, PROC TABULATE, PROC REPORT)
  • Graphical PROCs (e.g., PROC SGPLOT, PROC GCHART)

What are the key differences between SAS Functions and SAS Procedures?

The following are the key differences between SAS Functions and SAS Procedures:

| Feature | SAS Functions | SAS Procedures (PROCs) |
|---|---|---|
| Operation | Work on single values/variables | Work on entire datasets |
| Execution | Used in DATA steps | Used in PROC steps |
| Output | Returns a single value | Generates reports, tables, or datasets |
| Examples | SUM(), UPCASE(), SUBSTR() | PROC MEANS, PROC SORT, PROC FREQ |
| Usage Context | Calculations within a variable | Dataset processing & analysis |

Describe when to use SAS Functions or SAS PROCs

  • Use Functions when you need to transform or compute values within a DATA step.
  • Use Procedures when you need to analyze, summarize, or manipulate entire datasets.
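
As a minimal sketch of how the two fit together, the DATA step below uses functions to derive one value per row, and a PROC then summarizes the whole dataset. The dataset name work.class_bmi and the variable bmi are illustrative names, not part of SAS. The sketch assumes the usual SASHELP.CLASS layout, with height in inches and weight in pounds.

```sas
/* Function: computes one value per observation inside a DATA step */
data work.class_bmi;                /* hypothetical output dataset name */
    set sashelp.class;
    bmi = round(703 * weight / height**2, 0.1);  /* BMI from pounds and inches */
run;

/* Procedure: summarizes the entire dataset in a PROC step */
proc means data=work.class_bmi mean min max;
    var bmi;
run;
```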

What is the Difference Between the “Sum” Function and using the “+” Operator in SAS?

In SAS, both the SUM function and the + operator perform addition, but they differ in syntax and in how they handle missing values. The differences are broken down below.

SUM Function (SUM())

The SUM Function is used to add values while ignoring missing values (.). The general syntax of the SUM Function in SAS is

sum(var1, var2, var3, ...)

If at least one argument is non-missing, SUM() returns the sum of the non-missing values. If all arguments are missing, the result is missing (.). The SUM() function is best for:

  • Summing multiple variables where some may have missing values.
  • Avoiding unintended missing results due to missing data.
## SAS SUM() Function Example 

data example;
    a = 10;
    b = .;  /* missing */
    c = 30;
    sum_result = sum(a, b, c);  /* 10 + 30 = 40 (ignores missing) */
run;

+ Operator

The ‘+’ operator performs arithmetic addition but propagates missing values. The general syntax of the ‘+’ operator in SAS is

var1 + var2 + var3

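The behaviour of ‘+’ is:

  • If any operand is missing, the result is missing (.), and SAS writes a note to the log saying that missing values were generated.
  • A non-missing result is produced only when all operands are non-missing.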

The use of ‘+’ operator is best for:

  • Cases where missing values should make the result missing (e.g., strict calculations).
## + Operator Example

data example;
    a = 10;
    b = .;  /* missing */
    c = 30;
    plus_result = a + b + c;  /* 10 + . + 30 = . (missing) */
run;

What are the Key Differences between the SUM() Function and the ‘+’ Operator in SAS?

| Feature | SUM Function (SUM()) | + Operator |
|---|---|---|
| Handling Missing Values | Ignores missing values (sum(10, .) = 10) | Returns missing if any value is missing (10 + . = .) |
| Syntax | sum(a, b, c) | a + b + c |
| Use Case | Summing variables where some may be missing | Strict arithmetic (missing = invalid) |
| Performance | Slightly slower (function call) | Faster (direct operation) |

When to Use the SUM() Function and ‘+’ Operator in SAS?

  • Use SUM() when:
    • You want to ignore missing values (e.g., calculating totals where some data is missing).
    • Example: total = sum(sales1, sales2, sales3);
  • Use + when:
    • Missing values should make the result missing (e.g., strict calculations where all inputs must be valid).
    • Example: net_pay = salary + bonus; (if bonus is missing, net_pay should also be missing).
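
The sketch below illustrates three related behaviours: SUM() returns missing when every argument is missing, adding a literal 0 to the argument list guarantees a numeric total, and the OF keyword passes a variable list. The dataset and variable names are made up for illustration.

```sas
data work.sum_demo;                  /* hypothetical dataset name */
    sales1 = .;
    sales2 = .;
    sales3 = .;
    all_missing = sum(sales1, sales2, sales3);     /* missing: every argument is missing */
    zero_floor  = sum(0, sales1, sales2, sales3);  /* 0: the literal 0 is a non-missing argument */
    of_list     = sum(of sales1-sales3);           /* same as listing sales1, sales2, sales3 */
    plus_total  = sales1 + sales2 + sales3;        /* missing */
run;
```

Whether a total of 0 or a missing value is the better result for all-missing input depends on the analysis, so the choice between the first two forms is a judgment call.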

What is the difference between the INPUT and INFILE statements?

In SAS, both the INPUT and INFILE statements are used to read data, but they serve different purposes and are often used together. Here’s a breakdown of their differences:

INFILE Statement

The INFILE statement specifies the source from which raw data is read. It:

  • Identifies the external file (e.g., .txt, .csv, .dat) to be read.
  • Accepts options that control how the data is read (e.g., delimiters, missing values, encoding).

The general Syntax of the INFILE Statement in SAS is:

INFILE "file-path" <options>;

The Key Options of the INFILE Statement are:

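  • DLM=',' (specifies the delimiter, e.g., a comma for CSV files)
  • DSD (removes quotation marks around values, keeps delimiters that appear inside quoted values, and treats two consecutive delimiters as a missing value)
  • FIRSTOBS=2 (starts reading at line 2, which skips a header row)
  • MISSOVER (when a line ends before all variables are read, sets the remaining variables to missing instead of moving to the next line)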
## INFILE Statement Example
    DATA sample;
      INFILE "/path/to/data.csv" DLM=',' DSD FIRSTOBS=2;
      INPUT name $ age salary;
    RUN;
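
To see DSD and MISSOVER at work without an external file, the following sketch reads in-stream data through INFILE DATALINES. The dataset name and the sample records are invented for illustration.

```sas
data work.dsd_demo;                       /* hypothetical dataset name */
    infile datalines dsd missover;        /* with DSD, the default delimiter is a comma */
    input name :$20. age salary;          /* :$20. allows names longer than 8 characters */
    datalines;
"Smith, Ann",34,52000
Lee,,48000
Kim,29
;
run;

proc print data=work.dsd_demo;
run;
```

In this sketch the quoted comma stays inside the first name, the empty field on the second line becomes a missing age, and the short third line gets a missing salary rather than pulling values from the next line.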

INPUT Statement

The INPUT statement defines how SAS reads raw data (variable names, types, and formats). It:

  • Maps raw data to SAS variables (numeric or character).
  • Specifies the layout of the data (column positions, delimiters, or formats).

The general Syntax of the INPUT Statement is

INPUT variable1 $ variable2 variable3 ...;

The types of Input Styles are:

  • List Input (values separated by blanks, or by another delimiter set with DLM= on the INFILE statement): INPUT name $ age salary;
  • Column Input (fixed columns): INPUT name $ 1-10 age 11-13 salary 14-20;
  • Formatted Input (specific formats): INPUT name $10. age 2. salary 8.2;
## INPUT Statement Example

    DATA sample;
      INFILE "/path/to/data.txt";
      INPUT name $ age salary;
    RUN;
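
The column and formatted styles can be compared on the same fixed-width records. The sketch below uses in-stream data; the dataset names and sample values are illustrative only. Because the values are also separated by blanks, list input (INPUT name $ age salary;) would read these particular lines as well.

```sas
/* Column input: each variable is read from fixed column positions */
data work.by_column;                      /* hypothetical dataset name */
    input name $ 1-10 age 11-13 salary 14-20;
    datalines;
Alice      34  52000
Bob        29  48000
;
run;

/* Formatted input: informats set the width of each field (10, 3, and 7 columns) */
data work.by_informat;                    /* hypothetical dataset name */
    input name $10. age 3. salary 7.;
    datalines;
Alice      34  52000
Bob        29  48000
;
run;
```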

PROC in SAS

This comprehensive Q&A-style guide about PROC in SAS Software breaks down fundamental SAS PROCs used in statistical analysis and data management. Learn:
  • What PROCs do and their key functions.
  • Differences between PROC MEANS & SUMMARY.
  • When to use PROC MIXED for mixed-effects models.
  • CANCORR vs CORR for multivariate vs bivariate analysis.
  • Sample PROC MIXED code with required statements.
  • How PROC PRINT & CONTENTS help inspect data.

Ideal for students learning SAS Programming and statisticians performing advanced analyses. Includes ready-to-use code snippets and easy comparisons!

Q&A PROC in SAS Software

Explain the functions of PROC in SAS.

PROC (Procedure) is a fundamental component of SAS programming that performs specific data analysis, reporting, or data management tasks. Each PROC is a pre-built routine designed to handle different statistical, graphical, or data processing operations. The key functions of PROC in SAS are:

  • Data Analysis & Statistics: PROCs perform statistical computations, including:
    • Descriptive Statistics (PROC MEANS, PROC SUMMARY, PROC UNIVARIATE)
    • Hypothesis Testing (PROC TTEST, PROC ANOVA, PROC GLM)
    • Regression & Modeling (PROC REG, PROC LOGISTIC, PROC MIXED)
    • Multivariate Analysis (PROC FACTOR, PROC PRINCOMP, PROC DISCRIM)
  • Data Management & Manipulation
    • Sorting (PROC SORT)
    • Transposing Data (PROC TRANSPOSE)
    • Merging & Combining Datasets (PROC SQL, PROC APPEND)
  • Reporting & Output Generation
    • Printing Data (PROC PRINT)
    • Creating Summary Reports (PROC TABULATE, PROC REPORT)
    • Generating Graphs (PROC SGPLOT, PROC GCHART)
  • Quality Control & Data Exploration
    • Checking Data Structure (PROC CONTENTS)
    • Identifying Missing Data (PROC FREQ with MISSING option)
    • Sampling Data (PROC SURVEYSELECT)
  • Advanced Analytics & Machine Learning
    • Cluster Analysis (PROC CLUSTER)
    • Time Series Forecasting (PROC ARIMA)
    • Text Mining (PROC TEXTMINE)

    PROCs are the backbone of SAS programming, enabling data analysis, manipulation, and reporting with minimal coding. Choosing the right PROC depends on the task—whether it’s statistical modeling, data cleaning, or generating business reports.

    Explain the Difference Between PROC MEANS and PROC SUMMARY.

    Both PROC MEANS and PROC SUMMARY in SAS compute descriptive statistics (e.g., mean, sum, min, max), but they differ in default behavior and output:

    • Default Output
      PROC MEANS: Automatically prints results in the output window.
      PROC SUMMARY: Does not print by default; requires the PRINT option.
    • Dataset Creation
      Both can store results in a dataset using OUT=.
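    • Default Variables and Statistics
      PROC MEANS: Without a VAR statement, analyzes all numeric variables and reports N, mean, standard deviation, minimum, and maximum.
      PROC SUMMARY: Without a VAR statement, produces only a count of observations; with a VAR statement, it computes the same default statistics as PROC MEANS.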
    • Usage Context
      Use PROC MEANS for quick interactive analysis.
      Use PROC SUMMARY for programmatic, non-printed summaries.

    PROC MEANS is more convenient for direct, interactive analysis, while PROC SUMMARY suits automated workflows that store results in a dataset instead of printing them.

    With a BY statement, either procedure computes statistics separately for each BY group, which requires the input data to be sorted (or indexed) by the BY variables.

    With a CLASS statement, either procedure produces statistics for all subgroups in a single run, with no sorting required. In an OUT= dataset, the _TYPE_ variable identifies each combination of class variables, including the overall total.
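
    The difference in default behaviour can be sketched with SASHELP.CLASS. The output dataset name work.class_stats is a made-up name for illustration.

    ```sas
    /* PROC MEANS prints its results by default */
    proc means data=sashelp.class;
        class sex;
        var height weight;
    run;

    /* PROC SUMMARY prints nothing by default; results go to an OUT= dataset */
    proc summary data=sashelp.class;
        class sex;
        var height weight;
        output out=work.class_stats mean= std= / autoname;  /* hypothetical dataset name */
    run;

    /* _TYPE_ = 0 is the overall row; _TYPE_ = 1 holds one row per value of sex */
    proc print data=work.class_stats;
    run;
    ```

    Adding the PRINT option to the PROC SUMMARY statement would make it display results in the same way as PROC MEANS.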

    What is the PROC MIXED Procedure in SAS STAT used for?

    PROC MIXED in SAS/STAT fits a variety of mixed models. A mixed model can accommodate different sources of variation in the data, allows different variances for different groups, and takes into account the correlation structure of repeated measurements.

    PROC MIXED is essential for analyzing data with correlated observations or hierarchical structures. Its flexibility in modeling random effects and covariance makes it a cornerstone of advanced statistical analysis in SAS.

    PROC MIXED is a powerful SAS procedure for fitting linear mixed-effects models, which account for both fixed and random effects in data. It is widely used for analyzing hierarchical, longitudinal, or clustered data where observations are correlated (e.g., repeated measures, multilevel data).

    What is the Difference Between CANCORR and CORR Procedures in SAS STAT?

    Both procedures analyze relationships between variables, but they serve distinct purposes:

    1. PROC CORR (Correlation Analysis): Computes simple pairwise correlations (e.g., Pearson, Spearman). Use it to examine associations among two or more variables when there is no distinction between dependent and independent variables. The output includes the correlation matrix, p-values, and descriptive statistics. The code below tests how height, weight, and age are linearly related.

      PROC CORR DATA=my_data;
      VAR height weight age;
      RUN;
    2. PROC CANCORR (Canonical Correlation Analysis): Analyzes multivariate relationships between two sets of variables. It finds linear combinations (canonical variables) of each set that maximize the correlation between the sets.
      It is also useful for dimension reduction (e.g., linking psychological traits to behavioral measures). The output includes canonical correlations, coefficients, and redundancy analysis.

      PROC CANCORR DATA=my_data;
      VAR set1_var1 set1_var2; /* First variable set */
      WITH set2_var1 set2_var2; /* Second variable set */
      RUN;

    Key Differences Summary

    | Feature | PROC CORR | PROC CANCORR |
    |---|---|---|
    | Analysis Type | Bivariate correlations | Multivariate (set-to-set) |
    | Variables | Single list (no grouping) | Two distinct sets (VAR & WITH) |
    | Output Focus | Pairwise coefficients (e.g., r) | Canonical correlations (ρ) |
    | Complexity | Simple, descriptive | Advanced, inferential |

    Write a sample program using the PROC MIXED procedure, including all the required statements

    proc mixed data=sashelp.iris plots=all;
        class species;                           /* declares species as categorical */
        model petallength = species / solution;  /* MODEL is the only required statement */
    run;
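
    A mixed model normally also includes a random effect. The sketch below adds a random intercept per subject using a small, invented repeated-measures dataset; the dataset name, variable names, and values are all hypothetical and serve only to show the statement layout.

    ```sas
    data work.visits;                          /* hypothetical repeated-measures data */
        input subject treatment $ week response;
        datalines;
    1 A 0 10.2
    1 A 1 11.0
    1 A 2 12.1
    2 A 0 9.8
    2 A 1 10.9
    2 A 2 11.7
    3 B 0 10.5
    3 B 1 12.4
    3 B 2 14.0
    4 B 0 9.9
    4 B 1 11.6
    4 B 2 13.5
    ;
    run;

    proc mixed data=work.visits;
        class subject treatment;
        model response = treatment week / solution;  /* fixed effects */
        random intercept / subject=subject;          /* random intercept for each subject */
    run;
    ```

    With so few subjects the variance estimates are not meaningful; a real analysis would need far more data and a covariance structure chosen for the study design.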

    Describe what PROC PRINT and PROC CONTENTS are used for.

    PROC CONTENTS displays information about a SAS dataset, while PROC PRINT displays the data itself, which lets you confirm that the data was read into the dataset correctly.

    1. PROC CONTENTS: Displays metadata about a SAS dataset (structure, variables, attributes). Its key uses are:
    • Check variable names, types (numeric/character), lengths, and formats.
    • Identify dataset properties (e.g., number of observations, creation date).
    • Debug data import/export issues (e.g., mismatched formats).

    The general syntax of PROC CONTENTS is

    PROC CONTENTS DATA=your_data;  
    RUN;
    2. PROC PRINT: Displays the observations of a SAS dataset in the output window. Its key uses are:
    • View actual observations and values.
    • Verify data integrity (e.g., missing values, unexpected codes).
    • Quick preview before analysis.

    The general Syntax of PROC PRINT is

    PROC PRINT DATA=your_data (OBS=10);  /* Prints first 10 rows */
       VAR var1 var2;                    /* Optional: limit columns */
    RUN;

    Functions in SAS

    The post is about Functions in SAS Software. Functions in SAS software are predefined routines that perform specific computations or transformations on data. They can be categorized into several types based on their functionality.

    Introduction to Functions in SAS Software

    SAS functions are predefined operations that perform specific computations on data, categorized by their purpose. Numeric functions handle mathematical calculations like rounding, summing, and logarithms. Character functions manipulate text data through substring extraction, case conversion, and concatenation. Date and time functions manage SAS date, time, and datetime values, enabling operations like extracting year/month/day or shifting dates by intervals.

    Statistical functions compute summary metrics such as mean, median, and standard deviation. Financial functions support business calculations like net present value and loan payments. Random number functions generate values from statistical distributions for simulations. Bitwise functions perform low-level binary operations. Array functions assist in managing array dimensions and bounds. Special functions include utilities for data type conversion and lagged value retrieval. Finally, file and I/O functions check file existence and manage input/output operations. Together, these functions streamline data processing, analysis, and reporting in SAS.

    Here are the main types of functions in SAS Software:

    Numeric Functions

    Perform mathematical operations on numeric values. These functions are also called arithmetic functions.

    | Function | Short Description |
    |---|---|
    | SUM() | Sum of arguments |
    | MEAN() | Arithmetic mean |
    | MIN() / MAX() | Minimum/Maximum value |
    | ROUND() | Rounds a number |
    | INT() | Returns integer part of a number |
    | ABS() | Absolute value of the argument |
    | SQRT() | Square root |
    | LOG() / LOG10() | Natural logarithm / base-10 logarithm |
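
    A short DATA step shows these numeric functions in use. The dataset and variable names are arbitrary choices for illustration.

    ```sas
    data work.numeric_demo;             /* hypothetical dataset name */
        total   = sum(4, 9, .);         /* 13: the missing argument is ignored */
        average = mean(4, 9, .);        /* 6.5: mean of the non-missing arguments */
        rounded = round(3.14159, 0.01); /* 3.14 */
        whole   = int(-7.8);            /* -7: truncates toward zero */
        root    = sqrt(abs(-16));       /* 4 */
        ln_e    = log(exp(1));          /* 1: LOG is the natural logarithm */
        log_k   = log10(1000);          /* 3 */
    run;
    ```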

    Random Number Functions in SAS

    These functions generate random numbers.

    | Random Number Function | Short Description |
    |---|---|
    | RANUNI() | Generates random numbers from a Uniform distribution |
    | RANNOR() | Generates random numbers from a Normal distribution |
    | RANBIN() | Generates random numbers from a Binomial distribution |
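
    The sketch below draws five values from each distribution. The seed 12345 is an arbitrary choice; a positive seed makes the sequence reproducible from run to run. The dataset and variable names are illustrative.

    ```sas
    data work.random_demo;                  /* hypothetical dataset name */
        do i = 1 to 5;
            u = ranuni(12345);              /* uniform on (0, 1) */
            z = rannor(12345);              /* standard normal: mean 0, standard deviation 1 */
            b = ranbin(12345, 10, 0.3);     /* binomial with n = 10 trials and p = 0.3 */
            output;
        end;
    run;
    ```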

    Financial Functions

    The following are important and useful financial calculations.

    | Financial Function | Short Description |
    |---|---|
    | IRR() | Internal rate of return |
    | NPV() | Returns Net Present Value |
    | PMT() | Loan payment calculation |

    Character Functions in SAS

    Manipulate and analyze text (string) data. These functions can also be classified as character-handling functions.

    | Character Function | Short Description |
    |---|---|
    | SUBSTR() | Extracts a substring from an argument |
    | SCAN() | Extracts a specified word from a string |
    | TRIM() / STRIP() | TRIM removes trailing blanks; STRIP removes both leading and trailing blanks |
    | UPCASE() / LOWCASE() | Converts to uppercase/lowercase |
    | CATX() | Concatenates strings with a delimiter |
    | INDEX() | Finds the position of a substring within a string |
    | COMPRESS() | Removes specific characters from a string |
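
    The following sketch applies the character functions to a few literal strings. All names and sample values are invented for illustration.

    ```sas
    data work.char_demo;                                /* hypothetical dataset name */
        length full $ 30;
        fname = '  Ada ';
        lname = 'Lovelace';
        full    = catx(' ', fname, lname);              /* 'Ada Lovelace': CATX strips blanks before joining */
        initial = substr(lname, 1, 1);                  /* 'L' */
        word2   = scan('SAS makes sense', 2);           /* 'makes' */
        pos     = index('SAS Programming', 'Pro');      /* 5 */
        digits  = compress('(555) 010-9999', '()- ');   /* '5550109999' */
        shout   = upcase(strip(fname));                 /* 'ADA' */
    run;
    ```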

    Statistical Functions

    The following are some important functions for the computation of descriptive statistical measures.

    | Descriptive Function | Short Description |
    |---|---|
    | MEAN(), MEDIAN(), MODE() | Return measures of central tendency (mean, median, and mode) of the arguments |
    | STD() | Returns standard deviation |
    | VAR() | Returns the variance |
    | N() | Returns the count of non-missing values |
    | NMISS() | Returns the count of missing values |
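
    These functions work across their arguments within a single observation, not down a column; statistics over all rows of a variable are the job of a procedure such as PROC MEANS. A minimal sketch, with made-up variable names q1 to q4:

    ```sas
    data work.stat_demo;                /* hypothetical dataset name */
        q1 = 7; q2 = .; q3 = 9; q4 = 7;
        avg      = mean(of q1-q4);      /* (7 + 9 + 7) / 3, about 7.67 */
        middle   = median(of q1-q4);    /* 7 */
        spread   = std(of q1-q4);       /* standard deviation of the non-missing values */
        answered = n(of q1-q4);         /* 3 non-missing values */
        skipped  = nmiss(of q1-q4);     /* 1 missing value */
    run;
    ```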

    Date and Time Functions in SAS

    These functions handle SAS date, time, and datetime values.

    | Function | Short Description |
    |---|---|
    | TODAY() / DATE() | Returns the current date |
    | MDY() | Creates a date from month, day, year |
    | YEAR() / MONTH() / DAY() | Extracts year/month/day |
    | INTCK() | Counts the interval boundaries (e.g., months) between two dates |
    | INTNX() | Increments a date by intervals |
    | DATEPART() | Extracts the date from datetime |
    | TIMEPART() | Extracts time from datetime |
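
    The sketch below combines several of the date and time functions. It also uses DHMS(), which is not listed in the table, to build a datetime value for DATEPART() and TIMEPART() to take apart. Dataset and variable names are illustrative.

    ```sas
    data work.date_demo;                                    /* hypothetical dataset name */
        start  = mdy(1, 31, 2024);                          /* 31 January 2024 */
        finish = mdy(3, 1, 2024);                           /* 1 March 2024 */
        months_crossed = intck('month', start, finish);     /* 2: two month boundaries are crossed */
        next_month_end = intnx('month', start, 1, 'end');   /* 29 February 2024 */
        yr    = year(start);                                /* 2024 */
        stamp = dhms(start, 14, 30, 0);                     /* datetime for 31JAN2024 at 14:30:00 */
        d     = datepart(stamp);                            /* the date 31JAN2024 */
        t     = timepart(stamp);                            /* the time 14:30:00 */
        format start finish next_month_end d date9. stamp datetime19. t time8.;
    run;
    ```

    INTCK returns 2 here even though the dates are only 30 days apart, because it counts boundaries crossed and not elapsed whole months.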

    Bitwise Functions

    The following functions perform bit-level operations.

    | Function | Short Description |
    |---|---|
    | BAND() | Bitwise AND |
    | BOR() | Bitwise OR |
    | BNOT() | Bitwise NOT |
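
    A small worked example, with the binary forms shown in comments. The names are arbitrary.

    ```sas
    data work.bit_demo;       /* hypothetical dataset name */
        a = 12;               /* binary 1100 */
        b = 10;               /* binary 1010 */
        and_ab = band(a, b);  /* 8  (binary 1000) */
        or_ab  = bor(a, b);   /* 14 (binary 1110) */
        not_a  = bnot(a);     /* bitwise complement of a */
    run;
    ```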

    Array Functions

    The following functions work with arrays.

    | Function | Short Description |
    |---|---|
    | DIM() | Returns the size of an array |
    | HBOUND() / LBOUND() | Returns upper/lower bounds of an array |
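
    The sketch below loops over one array with DIM() and reads the bounds of a second array that is indexed by year. All array and variable names are illustrative.

    ```sas
    data work.array_demo;                           /* hypothetical dataset name */
        array score{3} s1-s3 (80, 95, 70);          /* three elements with initial values */
        array yr{2020:2022} y2020-y2022 (1, 2, 3);  /* subscripts run from 2020 to 2022 */
        best = 0;
        do i = 1 to dim(score);                     /* dim(score) is 3 */
            if score{i} > best then best = score{i};
        end;
        first_index = lbound(yr);                   /* 2020 */
        last_index  = hbound(yr);                   /* 2022 */
        drop i;
    run;
    ```

    Using DIM(), LBOUND(), and HBOUND() in loop bounds keeps the loop correct if the array definition changes later.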

    Special Functions

    Miscellaneous operations. These functions may be classified as conversion functions, too.

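    | Function | Short Description |
    |---|---|
    | INPUT() | Converts character to numeric/date |
    | PUT() | Converts value to formatted text |
    | LAG() / DIF() | LAG returns the value from a previous row; DIF returns the difference between the current value and that lagged value |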
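
    The conversion and lag functions can be sketched on two invented records. The dataset name, variable names, and values are hypothetical.

    ```sas
    data work.convert_demo;                         /* hypothetical dataset name */
        input amount_txt $ date_txt :$10.;
        amount    = input(amount_txt, comma10.);    /* character to numeric */
        visit     = input(date_txt, yymmdd10.);     /* character to SAS date */
        visit_txt = put(visit, date9.);             /* numeric to character, e.g. 05JAN2024 */
        previous  = lag(amount);                    /* value from the prior row; missing on row 1 */
        change    = dif(amount);                    /* amount minus lag(amount) */
        format visit date9.;
        datalines;
    1,200 2024-01-05
    1,450 2024-02-05
    ;
    run;
    ```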

    File and I/O Functions

    These functions handle file operations.

    | Function | Short Description |
    |---|---|
    | FILEEXIST() | Checks if an external file exists, given its physical name |
    | FEXIST() | Checks if the external file associated with a fileref exists |
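
    Both checks can be sketched as follows. The path /path/to/data.csv is a placeholder and the fileref name rawdata is made up for illustration; each function returns 1 when the file exists and 0 when it does not.

    ```sas
    filename rawdata "/path/to/data.csv";     /* placeholder path; replace with a real file */

    data _null_;
        by_path    = fileexist("/path/to/data.csv");  /* looks up the physical file name */
        by_fileref = fexist("rawdata");               /* looks up the file behind the fileref */
        put by_path= by_fileref=;                     /* writes both results to the log */
    run;
    ```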

    The SAS functions described above help with data cleaning, transformation, and analysis in SAS programming.
