Statistical Computing with R (Stat 226)
Brad Hartlaub
Fall 2020
- You may find the spuRs package helpful for some exercises and case studies.
MSSC Tutor - Steven Lucas
- Steven will be available on Sunday, Tuesday, and Thursday evenings from 8:00 until 9:00 to help you with the material for this course.
R links
Daily Agendas
- August 31
- September 2
- September 4 - our first problem session
- September 7
- September 9
- September 11 - Problem Session
- September 14
- September 16
- September 18 - Problem Session
- September 21 - Problem Session and lab time in breakout rooms for Probability Project
- September 23 - Small Group Project #1 Presentations
- Shanti Silver and Ryan Schultz
- Michael Morgan
- Josh Koretz, Sam Canseco, and Daniel Wu
- September 25 - Small Group Project #1 Presentations
- Isabella Femia and Josh Katz
- Rebecca Lawson
- Claire Murray and Kaitlyn Griffith
- Olivia Dion and Fiona Dunn
- September 28 - Small Group Project #1 Presentations
- Sarah Pazen and Andrew Nguyen
- Ken Wu
- Becca Elbert
- Amir Brivanlou
- September 30 - Small Group Project #1 Presentations
- Meg Ellingwood
- A. Shaikh
- Bella Creel
- October 2 - Please watch BootstrappingInto.mp4, BootstapSampling.mp4, and BootstrapCIs.mp4 before class.
- October 5 - Problem session for HW #4 (Chapters 5 and 17). Please read Chapter 18 for class on Wednesday.
- October 7 - Complete problem session for Chapter 17. Please complete your reading of Chapter 18 for class on Friday.
- October 9 - Methods for deriving estimators and review of CIs
- October 12 - Small Group Project #2 Breakout discussions
- October 14 - Small Group Project #2 Presentations
- Meg Ellingwood
- Shanti Silver and Ryan Schultz
- October 16 - Small Group Project #2 Presentations
- Fiona Dunn and Olivia Dion
- Amir Brivanlou
- Claire Murray and Kaitlyn Griffith
- Sam Canseco and Michael Morgan
- October 19 - Small Group Project #2 Presentations
- Bella Creel and Andrew Nguyen
- Ken Wu
- Becca Elbert
- Josh Koretz and Daniel Wu
- October 21 - Small Group Project #2 Presentations
- A. Shaikh and Sarah Pazen
- Isabella Femia and Josh Katz
- Rebecca Lawson
- October 23 - Problem Session
- October 26
- October 28
- October 30 - Problem Session
- November 2
- November 4
- November 6 - classes are cancelled (see email message from President Decatur)
- November 9 - Problem Session
- November 11
- November 13
- November 16
- November 18
- Claire Murray and Kaitlyn Griffith - Networks (Chapter 20 of MDSR)
- Daniel Wu - Regression trees
- November 20
- Isabella Femia and Josh Katz - Markov Chains in MLB
- Sam Canseco, Josh Koretz, and Michael Morgan - Monte Carlo power study for tests of normality
- November 23
- Andrew Nguyen and Ken Wu - Monte Carlo power study for t-test, permutation test, and bootstrapping when comparing two means
- Rebecca Lawson - Markov Chains in simulating text
- Fiona Dunn and Olivia Dion - Working with Geospatial Data (Chapter 17 in MDSR)
- November 30
- Bella Creel - Text as data (Chapter 19 of MDSR)
- Shanti Silver and Ryan Schultz - Data Wrangling (Chapters 4, 5, and 6 in MDSR)
- Breakout room discussions about final project proposals
- December 2
- A. Shaikh and Sarah Pazen - Predictive Modeling (Chapter 10 in MDSR)
- Claire Murray - PS on glimpse function
- Ken Wu - PS on lubridate function
- December 4
- Amir Brivanlou - Markov Chains in SIR model
- Bella Creel - PS presentations related to replicate and ggplot2, especially colorbrewer options
- Olivia Dion - PS presentations with mapping functions
- Kaitlyn Griffith - PS presentations
- December 7
- Samuel Canseco - PS presentations on stamenmap and ggmap
- Becca Elbert
- Individual conversations about final projects
- December 9 - Individual conversations about overall percentage and final projects
- December 13 - Your final papers (10 - 15 pages) are due before 4:30 PM
Homework Assignments
- Your solutions must be submitted electronically to your Google Drive folder. You may use any software that you want, but please submit a PDF file with your written solutions. For example, the name of the file for the first homework assignment should be HW1-yourname.PDF.
- Activity #1 - due on Wednesday, September 2
- HW #1 - due on Wednesday, September 9
- HW #2 - due on Monday, September 14
- Activity #2 - due on Wednesday, September 16
- HW #3 - due on Wednesday, September 23
- Activity #3 - due on Wednesday, September 23
- Small Group Project #1 - presentations will begin on Wednesday, September 23 - PPT or PDF is due on the day of your presentation
- Create an R script or markdown file that applies at least three probability distributions to practical problems of interest to you. You may work by yourself or with a partner on this project. Ideally, each problem will have both a theoretical and a simulated solution. However, if the theoretical solution is unknown (or unknown to you), then you can stick with simulation. You must include at least one continuous distribution and at least one discrete distribution in your presentation. Presentations to the class should last from 7 to 10 minutes.
- HW #4 - due on Wednesday, October 7
- Small Group Project #2 - presentations will begin on Monday, October 12 - files (PPT or PDF and your R code) are due on the day of your presentation
- Create an R script or markdown file that compares at least two estimators for an estimation problem of interest to you. You may work by yourself or with a partner on this project. At least one of your estimators should be based on bootstrapped samples. In addition to point estimates, you should also make appropriate use of confidence intervals. The duration of your presentation should be 10 to 12 minutes. The question and answer session after each presentation will be 2 to 3 minutes long.
- Activity #4 - due on Friday, October 23
- HW #5 - due on Monday, October 26
- Activity #5 - due on Monday, November 2
- HW #6 - due on Monday, November 2
- Activity #6 - due on Monday, November 9 - Ken's solutions on RPubs
- HW #7 - due on Wednesday, November 11
- Small Group Project #3 - presentations will begin on Monday, November 16 - your files (PPT or PDF and your R code) are due on the day of your presentation
- Option 1 - A Monte Carlo power study of at least two competing test procedures
- Option 2 - Applications of Markov Chains (see sample papers in Markov Chain Articles)
- Option 3 - Introduce new material from Modern Data Science with R
- You may work by yourself or with a partner on this project. Your presentation should be approximately 15 minutes, and the question and answer session after each presentation will be no longer than 5 minutes.
- Activity #7 - due on Monday, November 23
- Activity #8 - due on Monday, November 30
- Summary of your proposed simulation for your final project - see Chapters 23 and 24 and Final Project Samples for examples of reasonable projects - due on Monday, November 30
- Activity #9 - optional (replaces lowest score on previous 8 activities) - due on Friday, December 5
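
For Small Group Project #2, the comparison of two estimators with bootstrapped samples might be sketched roughly as follows. This is only an illustration under stated assumptions, not a template from the course: the simulated data, the sample size, the number of resamples, and the choice of the sample mean and sample median as the competing estimators are all hypothetical. It uses percentile intervals, which are one of several bootstrap confidence interval methods.

```r
# Sketch only: the data, sample size, and estimators below are hypothetical
# choices for illustration and are not taken from the course materials.
set.seed(226)                 # fixed seed so the run is reproducible

# Simulated heavy-tailed sample, symmetric about a true center of 5.
x <- 5 + rt(50, df = 3)
B <- 2000                     # number of bootstrap resamples

# Resample the data with replacement and recompute each estimator.
boot_stats <- replicate(B, {
  xb <- sample(x, replace = TRUE)
  c(mean = mean(xb), median = median(xb))
})

# Point estimates from the original sample.
estimates <- c(mean = mean(x), median = median(x))

# Bootstrap standard errors and 95% percentile confidence intervals.
boot_se <- apply(boot_stats, 1, sd)
boot_ci <- apply(boot_stats, 1, quantile, probs = c(0.025, 0.975))

print(estimates)
print(boot_se)
print(boot_ci)
```

Comparing the two bootstrap standard errors and the widths of the two intervals gives one way to argue which estimator is more precise for the data at hand.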
Interesting Links