1/18/2023
Preview of what we will be doing this semester with R!
Types of Variables
- A quantitative variable is a variable that takes
on numerical values for which arithmetic makes sense.
- A categorical or qualitative variable is a variable
that records which category a person, place, or thing falls into.
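As a small illustration of the two variable types, here is a sketch using a made-up mini data set. The column names mimic the class survey, but the values are invented and the data frame name `survey` is hypothetical.

```r
# Made-up mini data set: names mimic the class survey, values are invented.
survey <- data.frame(
  Height     = c(64, 70, 67, 72),                      # quantitative (inches)
  PulseRate  = c(68, 75, 80, 62),                      # quantitative (beats per minute)
  Handedness = c("Right", "Left", "Right", "Right"),   # categorical
  stringsAsFactors = TRUE
)

# Arithmetic makes sense for a quantitative variable.
mean_height <- mean(survey$Height)

# For a categorical variable we count how many cases fall in each category.
hand_counts <- table(survey$Handedness)

print(is.numeric(survey$Height))     # is Height stored as a number?
print(is.factor(survey$Handedness))  # is Handedness stored as categories?
print(mean_height)
print(hand_counts)
```

Averaging Height is meaningful; averaging Handedness is not, which is why we tabulate it instead.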
Questions about data
- What variables are being measured?
- Are these variables appropriate for answering the
question(s) of interest?
- What are the units of measurement?
- How are the data recorded?
Graphical Displays
- Why are graphing techniques useful?
- Examine the overall shape of a distribution (symmetric
or skewed?)
- Look for deviations from the overall shape (unusual
observations, gaps, etc.)
- Locate the center of the distribution
Histograms
- Divide the range of the data into classes of equal
width.
- Count the number of observations in each class.
- Compute the relative frequency or percent for each
class.
- Erect bars over each class interval.
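The four steps above can be sketched by hand in base R. The heights below are invented for illustration, and the class width of 4 is an arbitrary choice, not a rule.

```r
# Invented heights (inches), for illustration only.
heights <- c(61, 63, 64, 66, 66, 67, 68, 69, 70, 72, 73, 75)

# Step 1: divide the range into classes of equal width (here, width 4).
breaks <- seq(60, 76, by = 4)

# Step 2: count the observations in each class.
# right = FALSE makes the classes [60,64), [64,68), [68,72), [72,76).
classes <- cut(heights, breaks = breaks, right = FALSE)
counts  <- table(classes)

# Step 3: compute the relative frequency for each class.
rel_freq <- counts / length(heights)

# Step 4: erect bars over each class interval (space = 0 makes the bars touch).
barplot(rel_freq, space = 0,
        xlab = "Height (inches)", ylab = "Relative frequency")

print(counts)
print(round(rel_freq, 3))
```

In practice a single histogram command does all of this at once; the sketch only shows what happens behind the scenes.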
Getting started with R (I will demo the commands below, which you can find in the file ClassDataS23.R)
- Launch RStudio
- Load our data from the class survey (ClassDataS23.csv)
- Method 1 - ClassDataS23=read.csv(file=file.choose())
- Method 2 (need the path) - ClassDataS23<-read.csv("G:/My Drive/Stat106-ElementsofStatistics--S2023/!Class-csv-Rscripts/ClassDataS23.csv")
- Some basic R
- str(ClassDataS23)
- head(ClassDataS23)
- tail(ClassDataS23)
- names(ClassDataS23)
- library(mosaic) -- a package that we will be using throughout the semester.
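If you want to try these commands without the class file, here is a minimal sketch that writes a tiny CSV to a temporary file and reads it back. The file contents and the name `demo` are made up; only the functions (read.csv, str, head, tail, names) come from the list above.

```r
# Write a tiny, invented CSV to a temporary file so the example is self-contained.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Year,Height,Age",
             "First,64,18",
             "Sophomore,70,19",
             "Junior,67,20"), tmp)

# Read it the same way as Method 2, using a path.
demo <- read.csv(tmp)

str(demo)       # structure: the type of each variable
head(demo, 2)   # first two rows
tail(demo, 1)   # last row
names(demo)     # variable names
```

Running str() first is a quick way to check which variables R treats as quantitative and which as categorical.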
Graphical Displays with R (Use ClassDataS23.csv)
- Histograms - histogram(~Height, data=ClassDataS23)
- Boxplots - boxplot(ClassDataS23$Age), boxplot(Age~Year, data=ClassDataS23)
- Scatterplots - plot(Height~FootLength, data=ClassDataS23)
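The same three kinds of displays can be tried on a small invented data set. This sketch uses base R's hist() in place of mosaic's histogram() so that it runs without loading any package; the data frame `demo` and its values are made up.

```r
# Invented data: names mimic the class survey, values are made up.
demo <- data.frame(
  Height     = c(62, 64, 66, 67, 68, 70, 71, 73),
  FootLength = c(22, 23, 24, 25, 25, 26, 27, 28),
  Age        = c(18, 18, 19, 19, 20, 20, 21, 22),
  Year       = c("First", "First", "Sophomore", "Sophomore",
                 "Junior", "Junior", "Senior", "Senior")
)

# Histogram of one quantitative variable (base R version).
h <- hist(demo$Height, main = "Heights", xlab = "Height")

# Side-by-side boxplots: a quantitative variable split by a categorical one.
b <- boxplot(Age ~ Year, data = demo, main = "Age by class year")

# Scatterplot of two quantitative variables (response ~ explanatory).
plot(Height ~ FootLength, data = demo, main = "Height vs. foot length")
```

The formula Age ~ Year reads as "Age broken down by Year", the same pattern used in the boxplot command above.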
Suggested In Class Exercises (Time to explore R on your own or with a partner.)
- Construct appropriate graphs to visually summarize the information collected on the class data survey for the following variables:
- Year
- Height
- Handedness
- Age
- HometownSize
- PulseRate
- TextMessages
- Varsity
- CatsDogs
- ExerciseMinutes
- FootLength
- Use appropriate graphical displays and descriptive statistics to compare cat lovers and dog lovers on different variables in the data set we collected during the first class.
- Use appropriate graphical displays and descriptive statistics to compare the class years on different variables in the data set we collected during the first class.
- Is the shape of the distribution for guesses of the length of the black string the same as the distribution of guesses for the length of the white string?
- The actual length of the white string is 46". Is the overall distribution of guesses centered at this value?
- The actual length of the black string is 48". Is the overall distribution of guesses centered at the appropriate value?
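One possible way to approach the string questions is sketched below. The guesses are invented, and the vector names `white_guesses` and `black_guesses` are hypothetical; substitute the corresponding columns of the class data set.

```r
# Invented guesses (inches); replace with the columns from the class data.
white_guesses <- c(38, 40, 42, 43, 44, 45, 46, 48, 50)
black_guesses <- c(40, 44, 45, 46, 47, 48, 49, 50, 54, 60)

# Actual lengths, as given in the exercise.
actual_white <- 46
actual_black <- 48

# Two measures of center for each set of guesses.
white_center <- c(mean = mean(white_guesses), median = median(white_guesses))
black_center <- c(mean = mean(black_guesses), median = median(black_guesses))
print(white_center)
print(black_center)

# Histogram of the white-string guesses with the actual length marked.
hist(white_guesses, main = "White string guesses", xlab = "Guess (inches)")
abline(v = actual_white, lty = 2)

# Side-by-side boxplots to compare the shapes of the two distributions.
boxplot(list(White = white_guesses, Black = black_guesses),
        ylab = "Guess (inches)")
```

If the mean and median sit well below or above the dashed line, the guesses are not centered at the actual length.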
Please read Section 1.3 for class on Friday.