Introduction to R (see R-start.doc)
Be careful -- R is case sensitive.
Reading data (Creating a dataframe)
- mydata=read.csv(file=file.choose())
- mydata=read.table(file=file.choose()) #use to read in the txt files for the textbook exercises
- mydata=read.csv("P:/data/math/hartlaub/dataanalysis/Day1.csv") #reads a csv file from the Data Analysis folder into RStudio; the path must be enclosed in quotes
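The difference between read.csv and read.table can be sketched with a small made-up file. The variable names and values below are hypothetical, and temporary files stand in for the course data so the example runs anywhere.

```r
# Write a small comma-separated file (hypothetical data) to a temporary location
tmp_csv <- tempfile(fileext = ".csv")
writeLines(c("height,weight", "60,115", "65,140", "70,165"), tmp_csv)

# read.csv expects a header row and comma separators by default
mydata <- read.csv(file = tmp_csv)

# Write the same data as a whitespace-separated txt file
tmp_txt <- tempfile(fileext = ".txt")
writeLines(c("height weight", "60 115", "65 140", "70 165"), tmp_txt)

# read.table expects no header row by default, so header = TRUE is needed
# when the first line of the file holds the variable names
mydata2 <- read.table(file = tmp_txt, header = TRUE)

head(mydata)
head(mydata2)
```

If a txt file is read without header=TRUE, R names the columns V1, V2, ... and treats the first line as data.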
Commands for dataframes
- mydata #shows the entire data set
- head(mydata) #shows the first 6 rows
- tail(mydata) #shows the last 6 rows
- str(mydata) #shows the variable names and types
- names(mydata) #shows the variable names
- rename(V1,Variable1, dataFrame=mydata) #renames V1 to Variable1; requires the epicalc package (install it, then load it with library(epicalc))
- ls() #shows a list of objects that are available
- attach(mydata) #attaches the dataframe to the R search path, so variables can be referenced by name (x instead of mydata$x)
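A minimal sketch of the same tasks in base R, in case epicalc is not available or attach() is not wanted. The dataframe below is made up for illustration.

```r
# Hypothetical dataframe with the default names R gives unnamed columns
mydata <- data.frame(V1 = c(2, 4, 6, 8), V2 = c(1, 3, 5, 7))

str(mydata)    # variable names and types
names(mydata)  # variable names only

# Rename V1 to Variable1 using base R only
names(mydata)[names(mydata) == "V1"] <- "Variable1"
names(mydata)

# Two ways to reach a variable without attach()
mean(mydata$Variable1)         # dataframe$variable
with(mydata, mean(Variable1))  # evaluate an expression inside the dataframe
```

Using mydata$Variable1 or with() avoids confusion when two dataframes contain variables with the same name.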
Descriptive Statistics
- mean(x) #computes the mean of the variable x
- median(x) #computes the median of the variable x
- sd(x) #computes the standard deviation of the variable x
- IQR(x) #computes the IQR of the variable x
- summary(x) #computes the 5-number summary and the mean of the variable x
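- t.test(x) #performs a one-sample t test (null value mu=0 unless mu is specified)
- t.test(x,y) #performs a two-sample t test (Welch test, unequal variances, by default)
- t.test(x, y, paired=TRUE) #performs a paired t test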
- cor(x,y) #computes the correlation coefficient
- cor(mydata) #computes a correlation matrix (all columns of mydata must be numeric)
- cor.test(x,y) #tests whether the correlation rho is 0 and gives a confidence interval for rho
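The commands above can be tried on two small made-up vectors. The values of x and y below are hypothetical, and mu = 5 is an arbitrary null value chosen for illustration.

```r
# Hypothetical measurements on the same 8 subjects
x <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 5.3, 6.1)
y <- c(5.6, 5.0, 6.8, 6.1, 6.3, 5.9, 5.4, 6.6)

summary(x)  # minimum, quartiles, median, mean, maximum
sd(x)
IQR(x)

one    <- t.test(x, mu = 5)            # one-sample test of H0: mu = 5
two    <- t.test(x, y)                 # two-sample (Welch) test
paired <- t.test(x, y, paired = TRUE)  # paired test on the differences x - y

# The results are lists; pieces can be pulled out by name
two$p.value
paired$conf.int

ct <- cor.test(x, y)
ct$estimate  # sample correlation, same value as cor(x, y)
ct$conf.int  # confidence interval for rho
```

The paired test gives the same t statistic as a one-sample test on the differences, t.test(x - y).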
Graphical Displays
- hist(x) #creates a histogram for the variable x
- boxplot(x) #creates a boxplot for the variable x
- boxplot(y~x) #creates side-by-side boxplots of y, one for each level of x
- stem(x) #creates a stem plot for the variable x
- plot(y~x) #creates a scatterplot of y versus x
- plot(mydata) #provides a scatterplot matrix
- abline(lm(y~x)) #adds regression line to plot
- lines(lowess(x,y)) #adds a lowess smooth of y on x to the plot
- qplot(x, y) #creates a quick plot; requires the ggplot2 package (install it, then load it with library(ggplot2))
- ci.plot(regmodel) #creates a scatterplot with fitted line, confidence bands, and prediction bands; regmodel is a fitted model such as regmodel=lm(y~x); requires the HH package
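A minimal sketch of a scatterplot with a regression line and a lowess smooth, using base R graphics only. The data are made up, and regmodel is simply the name used here for the fitted model.

```r
# Hypothetical data with a roughly linear relationship
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1)

# Fit the regression once and store it so it can be reused
regmodel <- lm(y ~ x)

plot(y ~ x)                   # scatterplot of y versus x
abline(regmodel)              # add the fitted regression line
lines(lowess(x, y), lty = 2)  # add a dashed lowess smooth

coef(regmodel)  # intercept and slope of the fitted line
```

A stored model like regmodel is the kind of object that ci.plot() from the HH package expects, assuming that package is installed and loaded.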