8/30/2023
Chapter 0
- Introduction to a randomization/permutation test
- Open Day1-F2023.csv (data from the first day of a data analysis course). The variables are:
- Restingpulse
- Activepulse - pulse after 1 minute of exercise
- Varsity - yes=varsity athlete, no=not a varsity athlete
- DistanceHome - distance traveled from home to campus
- Is the active pulse different for athletes and nonathletes?
- To answer this question we will consider a completely different type of inference, based on randomization.
- Simple demonstration of the idea with cards
- Open StatKey and click Test for Difference in Means
- Upload the file Day1-F2023.csv
- Generate 1 sample
- Generate 1000 samples
- Make Inference
- Compare inference based on this simulation technique with the inference from t.test (see Day1.R)
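The StatKey steps above (generate one shuffled sample, generate 1000, then make an inference) can also be sketched directly in R. This is only a rough illustration, not the contents of Day1.R: the data frame below is made-up stand-in data so the code runs on its own, and the names `mean_diff`, `observed`, `shuffled`, and `n_shuffles` are hypothetical helpers chosen for this sketch. With the real file you would presumably load the data with `read.csv` instead.

```r
# Made-up stand-in data so the sketch runs by itself.
# In class, replace this with something like: mydata <- read.csv("Day1-F2023.csv")
set.seed(206)
mydata <- data.frame(
  Varsity     = rep(c("yes", "no"), times = c(8, 12)),
  Activepulse = round(c(rnorm(8, mean = 85, sd = 10),
                        rnorm(12, mean = 95, sd = 10)))
)

# Difference in mean active pulse: athletes minus nonathletes
mean_diff <- function(y, g) mean(y[g == "yes"]) - mean(y[g == "no"])
observed <- mean_diff(mydata$Activepulse, mydata$Varsity)

# Shuffle the Varsity labels many times. Each shuffle gives one difference
# that could arise if athlete status were unrelated to active pulse.
n_shuffles <- 1000
shuffled <- replicate(
  n_shuffles,
  mean_diff(mydata$Activepulse, sample(mydata$Varsity))
)

# Two-sided p-value: share of shuffled differences at least as extreme
# as the observed one
p_perm <- mean(abs(shuffled) >= abs(observed))

# The same comparison with the usual two-sample t-test
p_t <- t.test(Activepulse ~ Varsity, data = mydata)$p.value

cat("Observed difference:", observed, "\n")
cat("Randomization p-value:", p_perm, "\n")
cat("t-test p-value:", p_t, "\n")
```

The two p-values will usually be close but not identical, and the randomization p-value changes slightly from run to run unless the seed is fixed.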
Continue Introduction to R
- Importing scripts - open Day1.R
- Subsetting in R - try the subset command
- athlete<-subset(mydata, Varsity=='yes')
- Try to access and save a file to your HW folder on Google Drive: Stat206-DataAnalysis-F2023\yourname
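As a small illustration of the subset command mentioned above, here is a sketch that runs on a tiny made-up data frame with the same column names as the class file. The values are invented for the example, and `nonathlete`, `athlete2`, and `athlete_pulse` are hypothetical names not taken from Day1.R.

```r
# Tiny made-up data frame standing in for the class data
mydata <- data.frame(
  Restingpulse = c(62, 70, 58, 75, 66, 80),
  Activepulse  = c(88, 101, 84, 110, 93, 115),
  Varsity      = c("yes", "no", "yes", "no", "yes", "no")
)

# Keep only the rows for varsity athletes (same command as in the notes)
athlete <- subset(mydata, Varsity == "yes")

# Keep only the rows for nonathletes
nonathlete <- subset(mydata, Varsity == "no")

# The same athlete rows using bracket indexing instead of subset()
athlete2 <- mydata[mydata$Varsity == "yes", ]

# subset() can also keep selected columns through its select argument
athlete_pulse <- subset(mydata, Varsity == "yes",
                        select = c(Restingpulse, Activepulse))

# Quick checks on the result
nrow(athlete)
mean(athlete$Restingpulse)
```

Note the double equals sign inside `subset`: a single `=` there would be read as an argument assignment rather than a comparison.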
Suggested Class Exercises
- Compare the resting pulses of athletes and nonathletes
- Nicky Forsyth
- Drake Lewis
- Compare the active pulses of athletes and nonathletes
- Kate Lengel
- Alex Rameriz
- Compare the distance traveled to Gambier for athletes and nonathletes
- Drake Lewis
- Jason Harmer
- Create a new variable, say Diff, to measure the increase in pulse after one minute of exercise. Do the heart rates for nonathletes increase more than the heart rates for athletes after one minute of exercise?
- Henry Rodrigues
- Kate Lengel
- Exercise 0.26 on p. 17
- Elts Maricq or Nick Nelson
- Henry Rodrigues
- Exercise 0.23 with a randomization/permutation test
- Elts Maricq or Nick Nelson
- Benny Abeysekera
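For the exercise above that asks for a new variable Diff, one possible way to create it in R is sketched below. The data frame is made up so the code runs on its own, and the sketch covers only creating the column and a first look at group means. It does not carry out the inference the exercise asks for.

```r
# Tiny made-up data frame standing in for the class data
mydata <- data.frame(
  Restingpulse = c(62, 70, 58, 75, 66, 80),
  Activepulse  = c(88, 101, 84, 110, 93, 115),
  Varsity      = c("yes", "no", "yes", "no", "yes", "no")
)

# New column: increase in pulse after one minute of exercise
mydata$Diff <- mydata$Activepulse - mydata$Restingpulse

# Mean increase for each group (no = nonathlete, yes = athlete)
group_means <- tapply(mydata$Diff, mydata$Varsity, mean)
print(group_means)
```

Once Diff exists, it can be compared across the Varsity groups in the same way as the pulse variables, either in StatKey or with t.test.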
Chapter 1
- Open Stat206-DataAnalysis-F2023\2ePowerPoint\Sec1.1R.pptx on our Google Drive folder
- Review simple linear regression
Please read Sections 1.2 and 1.3 for class on Monday. We will have our first problem session on Friday for the suggested class exercises above.