SAS Commands and Procedures


In the documentation below, each SAS command is briefly described, followed by its syntax.  SAS keywords always appear in BOLD CAPITAL LETTERS, information that you (the programmer) must supply appears in italics, and optional information is enclosed in <...>.

ANOVA PROCEDURE    Performs analysis of variance for balanced data from a wide variety of experimental designs.

PROC ANOVA <option list>;
  CLASS variables;
  MODEL dependent variable(s) = effects / options;
  BY variables;
  MEANS effects / options;
  REPEATED factor specification / options;
  TEST <H = effects> E = effect;
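
As a minimal sketch of how these statements fit together, the program below runs a one-way analysis of variance on a small balanced data set. The data set name (yields), the variable names (trt, yield), and the data values are all made up for illustration, and the DATALINES statement and the @@ line-hold specifier used to enter the data are not described in this handout.

```sas
/* Hypothetical balanced layout: 3 treatments, 4 replicates each */
DATA yields;
  INPUT trt $ yield @@;      /* @@ reads several observations per line */
  DATALINES;
A 20 A 22 A 19 A 21
B 25 B 27 B 26 B 24
C 30 C 29 C 31 C 28
;
RUN;

PROC ANOVA DATA=yields;
  CLASS trt;                 /* trt is the classification variable */
  MODEL yield = trt;         /* one-way model */
  MEANS trt / TUKEY;         /* treatment means with Tukey comparisons */
RUN;
QUIT;
```

PROC ANOVA is interactive, so the QUIT statement is used here to end it.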

CHART PROCEDURE    Produces vertical and horizontal bar charts, histograms, block charts, pie charts, and star charts.

PROC CHART <option list>;
  BY variable-list;
  VBAR variable-list / options;
  HBAR variable-list / options;
  BLOCK variable-list / options;
  PIE variable-list / options;
  STAR variable-list / options;
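
For instance, the following sketch draws two charts from SASHELP.CLASS, a sample data set that ships with most SAS installations (assumed here to be available). The options DISCRETE, SUMVAR=, and TYPE= are only a small selection of what each chart statement accepts.

```sas
PROC CHART DATA=sashelp.class;
  VBAR age / DISCRETE;                  /* one vertical bar per distinct age */
  HBAR sex / SUMVAR=height TYPE=MEAN;   /* mean height for each sex */
RUN;
```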

CORRELATION PROCEDURE    Computes Pearson correlation coefficients and nonparametric measures of association.

PROC CORR <option list>;
  BY variable-1 <variable-2> ... <variable-n>;
  VAR variable-1 variable-2 <variable-3> ... <variable-n>;
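
A short illustration, assuming the SASHELP.CLASS sample data set is available: the PEARSON and SPEARMAN options request both the Pearson correlations and one of the nonparametric measures.

```sas
PROC CORR DATA=sashelp.class PEARSON SPEARMAN;
  VAR height weight age;     /* all pairwise correlations among these */
RUN;
```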

DATA COMMAND    Begins a data step to create a SAS data set.

DATA data-set-name;

FILENAME COMMAND    Associates a SAS file reference with an external file.

FILENAME fileref 'external filename';

INFILE COMMAND    Identifies an external file to read with an INPUT statement.

INFILE fileref;

INPUT COMMAND    Describes the arrangement of values in an input record and assigns input values to corresponding SAS variables.

INPUT variable-1 <variable-2> ... <variable-n>;
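
The four commands above are normally used together. The sketch below assumes a hypothetical text file named scores.dat whose lines each hold three numbers separated by blanks; the fileref (scores), the data set name (exam), and the variable names are likewise invented for the example.

```sas
FILENAME scores 'scores.dat';   /* hypothetical file; substitute your own path */

DATA exam;                      /* begin the data step and name the data set */
  INFILE scores;                /* read records from the file behind the fileref */
  INPUT id test1 test2;         /* three blank-separated numeric values per record */
RUN;
```

The RUN statement, which is not listed in this handout, marks the end of the step.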

GLM PROCEDURE    Uses the method of least squares to fit general linear models.

PROC GLM <option list>;
  CLASS variables;
  MODEL dependent variable(s) = independent variables / options;
  BY variables;
  MEANS effects / options;
  OUTPUT OUT = SAS data-set keyword = names;
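
One way these statements might be combined is sketched below, again assuming the SASHELP.CLASS sample data set. Unlike PROC ANOVA, PROC GLM does not require balanced data, and the model may mix classification variables with continuous ones. The output data set name (glmout) and the new variable names (pred, resid) are arbitrary choices.

```sas
PROC GLM DATA=sashelp.class;
  CLASS sex;                          /* classification variable */
  MODEL weight = sex height;          /* class effect plus a continuous covariate */
  MEANS sex;                          /* unadjusted means for each sex */
  OUTPUT OUT=glmout P=pred R=resid;   /* save predicted values and residuals */
RUN;
QUIT;
```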

GPLOT PROCEDURE    Plots the values of two or more variables on a set of coordinate axes (X and Y).

PROC GPLOT <options>;
  PLOT plot-requests / options;
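
A plot request has the form y-variable*x-variable. The sketch below assumes that SAS/GRAPH is licensed at your site and that the SASHELP.CLASS sample data set is available.

```sas
PROC GPLOT DATA=sashelp.class;
  PLOT weight*height;        /* weight on the Y axis, height on the X axis */
RUN;
QUIT;
```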

PRINT PROCEDURE    Prints the observations in a SAS data set.

PROC PRINT <option list>;
  VAR variable-list;
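
For example, assuming the SASHELP.CLASS sample data set is available, the following prints only three of its variables. Without a VAR statement, all variables are printed.

```sas
PROC PRINT DATA=sashelp.class;
  VAR name age height;       /* print these variables, in this order */
RUN;
```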

RANK PROCEDURE    Computes ranks for one or more numerical variables and outputs the ranks to a new SAS data set.

PROC RANK <options>;

PROC RANK Statement Options
NORMAL=BLOM    computes normal scores for assessing normality

  BY variable-1 <variable-2> ... <variable-n>;
  VAR data-set-variables;
  RANKS new-variables;
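
The sketch below computes normal scores for one variable, assuming the SASHELP.CLASS sample data set is available. The output data set name (ranked) and the new variable name (height_nscore) are made up for the example.

```sas
PROC RANK DATA=sashelp.class OUT=ranked NORMAL=BLOM;
  VAR height;                /* variable to be ranked */
  RANKS height_nscore;       /* new variable that receives the normal scores */
RUN;
```

Plotting height against height_nscore then gives a normal probability plot; points falling close to a straight line suggest approximate normality.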

REGRESSION PROCEDURE    Provides a general-purpose procedure for regression.

PROC REG <options>;

PROC REG Statement Options
OUTEST=data-set-name    outputs a data set that contains parameter estimates and other model fit statistics

  MODEL dependent variable(s) = explanatory variable(s) </options>;

Model Statement Options
ALPHA=        sets significance value for confidence and prediction intervals and tests
CLB           computes confidence limits for the parameter estimates
CLI           computes confidence limits for an individual predicted value
CLM           computes confidence limits for the expected value of the dependent variable
I             displays inverse of sums of squares and crossproducts
NOINT         fits a model without an intercept
P             computes predicted values
PRESS         outputs the PRESS statistic to the OUTEST= data set
PCORR1        displays squared partial correlation coefficients using Type I sums of squares
PCORR2        displays squared partial correlation coefficients using Type II sums of squares
R             produces analysis of residuals
SELECTION=    specifies model selection method (FORWARD, BACKWARD, STEPWISE, MAXR, MINR, RSQUARE, ADJRSQ, or CP)
SLE=          sets criterion for entry into model
SLS=          sets criterion for staying in model
SS1           displays the sequential sum of squares
SS2           displays the partial sum of squares
XPX           displays sums-of-squares and crossproducts matrix

  BY variable(s);
  ID variable(s);
  OUTPUT <OUT=SAS-data-set> keyword-1=name-1 ... <keyword-n=name-n>;

Keywords for Output
COOKD = name              Cook's D influence statistic
DFFITS = name             standard influence of observation on predicted value
H = name                  leverage
LCL = name                lower bound of a confidence interval for an individual prediction
LCLM = name               lower bound of a confidence interval for the expected value (mean) of the dependent variable
PREDICTED (or P) = name   predicted values
PRESS = name              ith residual divided by (1-h), where h is the leverage and the model has been refit without the ith observation
RESIDUAL (or R) = name    residuals, calculated as observed minus predicted
RSTUDENT = name           a studentized residual with the current observation deleted
STDI = name               standard error of the individual predicted value
STDP = name               standard error of the mean predicted value
STDR = name               standard error of the residual
STUDENT = name            studentized residuals, residuals divided by their standard errors
UCL = name                upper bound of a confidence interval for an individual prediction
UCLM = name               upper bound of a confidence interval for the expected value (mean) of the dependent variable

  PLOT <y-variable*x-variable>;
  RESTRICT equation-1 <equation-2> ... <equation-n>;
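
To show how the MODEL options and OUTPUT keywords are used in practice, here is a minimal sketch that assumes the SASHELP.CLASS sample data set is available. The output data set name (regout) and the names given to the new variables (pred, resid, stud, cd, lev) are arbitrary.

```sas
PROC REG DATA=sashelp.class;
  ID name;                                       /* label observations in the output */
  MODEL weight = height age / CLB CLI CLM R ALPHA=0.05;
  OUTPUT OUT=regout P=pred R=resid               /* predicted values and residuals */
         STUDENT=stud COOKD=cd H=lev;            /* diagnostics for each observation */
RUN;
QUIT;
```

Each keyword=name pair in the OUTPUT statement adds one variable to regout, so the diagnostics can be printed or plotted afterward with other procedures.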

SORT PROCEDURE    Sorts observations in a SAS data set by one or more variables.

PROC SORT <option list>;
  BY variable-list;
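
A brief sketch, assuming the SASHELP.CLASS sample data set is available. The OUT= option writes the sorted observations to a new data set (here given the made-up name class_sorted) rather than replacing the input, and DESCENDING reverses the order for the variable that follows it.

```sas
PROC SORT DATA=sashelp.class OUT=class_sorted;
  BY sex DESCENDING height;   /* by sex, then tallest first within each sex */
RUN;
```

A data set generally must be sorted by the BY variables before those variables are used in the BY statement of another procedure.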

TTEST PROCEDURE    Performs t tests.

PROC TTEST <options>;
  CLASS variable;
  BY variables;
  VAR variables;
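
As an illustration, the sketch below compares mean height between the two levels of sex, assuming the SASHELP.CLASS sample data set is available. The CLASS variable must have exactly two levels for a two-sample test.

```sas
PROC TTEST DATA=sashelp.class;
  CLASS sex;                 /* grouping variable with two levels */
  VAR height;                /* variable whose group means are compared */
RUN;
```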

UNIVARIATE PROCEDURE    Produces simple descriptive statistics.

PROC UNIVARIATE <option list>;
  VAR variable-list;
  BY variable-list;
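
For example, assuming the SASHELP.CLASS sample data set is available, the following requests descriptive statistics for two variables. The NORMAL option, one of the entries that can appear in the option list, adds tests for normality.

```sas
PROC UNIVARIATE DATA=sashelp.class NORMAL;
  VAR height weight;         /* analyze each variable separately */
RUN;
```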