SAS Commands and Procedures

Each SAS command below is described briefly, followed by its syntax. SAS keywords appear in CAPITAL LETTERS, placeholders for information the programmer (you) must supply appear in lowercase, and optional information is enclosed in <...>.

CHART PROCEDURE Produces vertical and horizontal bar charts, histograms, block charts, pie charts, and star charts.

PROC CHART <option list>;
BY variable-list;
VBAR variable-list / options;
HBAR variable-list / options;
BLOCK variable-list / options;
PIE variable-list / options;
STAR variable-list / options;
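
As a minimal sketch of the syntax above, the following step draws one vertical and one horizontal bar chart. It assumes the SASHELP.CLASS sample data set that ships with SAS; the DISCRETE, SUMVAR=, and TYPE= options are chosen only for illustration.

```sas
/* Assumption: SASHELP.CLASS (Name, Sex, Age, Height, Weight) is available. */
proc chart data=sashelp.class;
   vbar age / discrete;                  /* one bar per distinct age */
   hbar sex / sumvar=height type=mean;   /* mean height for each sex */
run;
```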

CORRELATION PROCEDURE Computes Pearson correlation coefficients and nonparametric measures of association.

PROC CORR <option list>;
BY variable-1 <variable-2> ... <variable-n>;
VAR variable-1 variable-2 <variable-3> ... <variable-n>;
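
A short illustration, again assuming the SASHELP.CLASS sample data set: the PEARSON option requests the Pearson coefficients, and SPEARMAN adds one of the nonparametric measures.

```sas
/* Assumption: SASHELP.CLASS is available. */
proc corr data=sashelp.class pearson spearman;
   var height weight age;   /* correlations among these three variables */
run;
```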

DATA COMMAND Begins a data step to create a SAS data set.

DATA data-set-name;

FILENAME COMMAND Associates a SAS file reference with an external file.

FILENAME fileref 'external filename';

INFILE COMMAND Identifies an external file to read with an INPUT statement.

INFILE fileref;

INPUT COMMAND Describes the arrangement of values in an input record and assigns input values to corresponding SAS variables.

INPUT variable-1 <variable-2> ... <variable-n>;
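
The DATA, FILENAME, INFILE, and INPUT commands are usually used together. The sketch below shows one way they might be combined; the file name scores.txt, the fileref scores, and the variable names are hypothetical and stand in for your own.

```sas
/* Hypothetical file: each line holds a name followed by two numeric scores. */
filename scores 'scores.txt';

data work.scores;
   infile scores;               /* read from the fileref defined above */
   input name $ test1 test2;    /* $ marks name as a character variable */
run;
```

The trailing $ after a variable name tells INPUT to read that variable as character data; variables without it are read as numeric.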

GPLOT PROCEDURE Plots the values of two or more variables on a set of coordinate axes (X and Y).

PROC GPLOT <options>;
PLOT plot-requests / options;
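
A plot request has the form vertical-variable*horizontal-variable. The following sketch assumes SAS/GRAPH is licensed and uses the SASHELP.CLASS sample data set; the SYMBOL1 statement is an optional addition that draws points without connecting lines.

```sas
/* Assumption: SAS/GRAPH is installed and SASHELP.CLASS is available. */
symbol1 value=dot interpol=none;   /* plot points only */

proc gplot data=sashelp.class;
   plot weight*height;             /* weight on the Y axis, height on the X axis */
run;
quit;
```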

PRINT PROCEDURE Prints the observations in a SAS data set.

PROC PRINT <option list>;
VAR variable-list;
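
For example, assuming the SASHELP.CLASS sample data set, the step below prints three variables for the first five observations. The OBS= data set option and the NOOBS procedure option are illustrative choices, not requirements.

```sas
/* Assumption: SASHELP.CLASS is available. */
proc print data=sashelp.class(obs=5) noobs;   /* first 5 rows, no Obs column */
   var name age height;
run;
```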

RANK PROCEDURE Computes ranks for one or more numerical variables and outputs the ranks to a new SAS data set.

PROC RANK <options>;
BY variable-1 <variable-2> ... <variable-n>;
VAR data-set-variables;
RANKS new-variables;

PROC RANK Options

NORMAL=BLOM
computes normal scores for assessing normality
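
A minimal sketch of computing normal scores, assuming the SASHELP.CLASS sample data set. The output data set name work.class_ranked and the new variable name height_nscore are hypothetical.

```sas
/* Assumption: SASHELP.CLASS is available. */
proc rank data=sashelp.class out=work.class_ranked normal=blom;
   var height;              /* variable to rank */
   ranks height_nscore;     /* new variable that receives the normal scores */
run;
```

Without NORMAL=BLOM, the new variable would hold ordinary ranks instead of normal scores.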

REGRESSION PROCEDURE Provides a general-purpose procedure for regression.

PROC REG <options>;
MODEL dependent-variable(s) = explanatory-variable(s) </ options>;

PROC REG Options

OUTEST=data-set-name
outputs a data set that contains parameter estimates and other model fit statistics

Model Statement Options

ALPHA=
sets significance value for confidence and prediction intervals and tests

CLB
computes confidence limits for the parameter estimates

CLI
computes confidence limits for an individual predicted value

CLM
computes confidence limits for the expected value of the dependent variable

I
displays inverse of sums of squares and crossproducts

NOINT
fits a model without an intercept

P
computes predicted values

PRESS
outputs the PRESS statistic to the OUTEST= data set.

PCORR1
displays squared partial correlation coefficients using Type I sums of squares

PCORR2
displays squared partial correlation coefficients using Type II sums of squares

R
produces analysis of residuals

SELECTION=
specifies model selection method (FORWARD, BACKWARD, STEPWISE, MAXR, MINR, RSQUARE, ADJRSQ, or CP)

SLE=
sets criterion for entry into model

SLS=
sets criterion for staying in model

SS1
displays the sequential sum of squares

SS2
displays the partial sum of squares

XPX
displays sums-of-squares and crossproducts matrix
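
To show how MODEL statement options are written after the slash, here is a hedged sketch using the SASHELP.CLASS sample data set. The data set name work.est and the threshold values are arbitrary illustrations.

```sas
/* Assumption: SASHELP.CLASS is available. */
proc reg data=sashelp.class outest=work.est;
   /* 90% confidence limits for estimates, individual predictions, and means */
   model weight = height age / clb cli clm alpha=0.10;
   /* stepwise selection with explicit entry and stay thresholds */
   model weight = height age / selection=stepwise sle=0.15 sls=0.15;
run;
quit;
```

PROC REG is interactive, so QUIT ends the procedure after RUN.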

BY variable(s);
ID variable(s);
OUTPUT <OUT=data-set-name> keyword-1=name-1 ... <keyword-n=name-n>;

Keywords for Output

COOKD = name
Cook's D influence statistic

DFFITS = name
standard influence of observation on predicted value

H = name
leverage

LCL = name
lower bound of a confidence interval for an individual prediction

LCLM = name
lower bound of a confidence interval for the expected value (mean) of the dependent variable

PREDICTED (or P) = name
predicted values

PRESS = name
ith residual divided by (1-h), where h is the leverage of the ith observation; this equals the residual for the ith observation when the model is refit without it

RESIDUAL (or R) = name
residuals, calculated as observed minus predicted

RSTUDENT = name
a studentized residual with the current observation deleted

STDI = name
standard error of the individual predicted value

STDP = name
standard error of the mean predicted value

STDR = name
standard error of the residual

STUDENT = name
studentized residuals, residuals divided by their standard errors

UCL = name
upper bound of a confidence interval for an individual prediction

UCLM = name
upper bound of a confidence interval for the expected value (mean) of the dependent variable
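
Each keyword in the OUTPUT statement is paired with a name of your choosing for the new variable. The sketch below assumes the SASHELP.CLASS sample data set; work.diag and every name to the right of an equals sign are hypothetical.

```sas
/* Assumption: SASHELP.CLASS is available. */
proc reg data=sashelp.class;
   model weight = height;
   id name;                                   /* label observations by name */
   output out=work.diag
          p=pred r=resid                      /* predicted values and residuals */
          student=stud rstudent=rstud         /* studentized residuals */
          h=lev cookd=cooksd                  /* leverage and Cook's D */
          lclm=lo_mean uclm=hi_mean;          /* confidence limits for the mean */
run;
quit;

proc print data=work.diag;                    /* inspect the saved diagnostics */
run;
```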

PLOT <y-variable*x-variable>;
RESTRICT equation-1 <, equation-2> ... <, equation-n>;

SORT PROCEDURE Sorts observations in a SAS data set by one or more variables.

PROC SORT <option list>;
BY variable-list;
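
For instance, assuming the SASHELP.CLASS sample data set, the step below sorts by sex and then by height from tallest to shortest. The OUT= data set name is hypothetical; without OUT=, PROC SORT replaces the input data set.

```sas
/* Assumption: SASHELP.CLASS is available. */
proc sort data=sashelp.class out=work.class_sorted;
   by sex descending height;   /* DESCENDING applies to the variable that follows it */
run;
```

Sorting first is also what makes a BY statement usable in the other procedures, since BY-group processing expects the data to be ordered by the BY variables.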

UNIVARIATE PROCEDURE Produces simple descriptive statistics.

PROC UNIVARIATE <option list>;
VAR variable-list;
BY variable-list;
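
A brief sketch with a BY statement, assuming the SASHELP.CLASS sample data set. The data are sorted first because BY-group processing expects sorted input; the NORMAL option, which adds tests for normality, is an illustrative extra.

```sas
/* Assumption: SASHELP.CLASS is available. */
proc sort data=sashelp.class out=work.class_by_sex;
   by sex;
run;

proc univariate data=work.class_by_sex normal;
   var height weight;   /* descriptive statistics for each variable */
   by sex;              /* repeated separately for each sex */
run;
```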
