Data Analysis (Stat 206)
Brad Hartlaub
Spring 2020
R links
Math Science Skills Center
- Bak Baitan will be available to help you on Tuesday from 9:00 until 10:00 pm and Thursday from 8:00 until 10:00 pm.
Daily Agendas
- January 13
- January 15
- January 17
- January 20
- January 22
- January 24
- January 27
- January 29
- January 31 - Quiz #1
- February 3
- February 5
- February 7
- February 10
- February 12
- February 14 - Quiz #2
- February 17
- February 19
- February 21 - Lab time for Unit A projects and questions on class activities
- February 24 - Unit A Project Poster Session
- Predicting NHL whole-season performance - Meg Ellingwood
- Survey of STEM-graduate salary data - Alex Felleson and Scott Shaffer
- Predicting winning margins in presidential elections - Jake Ritz and Zach Grumbach
- The effect of pitching on wins - Grace Finn and Michael Gleason
- Predicting winning Olympic times using weight and BMI - Abby McCarty, Elise Hockanson, Hans Schwarz
- February 26 - Unit A Project Poster Session
- Exploring the gender wage gap with ACS Census data - Isabella Femia and Ruth Cohen
- Predicting HR and SLG - Drew Grier and Ben Czech
- Predicting Ohio transportation spending per capita - Hayden Toftner
- Predicting growth rates of real GDP per capita - Mo Elhabashy, Ken Wu, and Jorge Dumenigo
- Predicting NFL win percentage based on average annual values of players - Jimmy Lane and Michael Morgan
- What factors best predict hate crimes? - Rachel Contri
- February 28 - Exam #1 on Unit A (Covers Chapters 0-4, open book, open notes, access to R, access to course web page, and I write the code.)
- March 1 - March 22 - Extended spring break because of Coronavirus
- March 23
- March 25
- March 27
- March 30
- April 1
- April 3
- April 6
- April 8
- April 10 - Questions on plans for logistic regression and time series (Units C and D) - Class time for Unit B Projects in Breakout Rooms
- April 13
- Ben Czech and Drew Grier - Linking MLB payroll to team performance
- Mike Gleason, Grace Finn, and Jack Seasholtz - MLB team performance by monetary variables
- Meg Ellingwood - Columbus Blue Jackets performance over the course of the season
- April 15
- Mayo Amorello, James Lane, and Michael Morgan - Cars 4: The race for knowledge
- Jake Ritz and Zach Grumbach - Modeling anger expression
- April 17
- Ruth Cohen, Isabella Femia, and Scott Shaffer - An analysis of mortality rate for COVID-19
- Hayden Toftner - ANOVA and ANCOVA models for GSS
- April 20
- Abby McCarty, Elise Hockanson, Hans Schwarz - The ULTIMATE chocolate chip cookie
- Alex Felleson and Rachel Contri - Cigarette smoking habits
- Jorge Dumenigo, Mo Elhabashy, and Ken Wu - Cardiovascular Disease
- April 22 - Group discussions on Unit C/D projects. I will set up breakout rooms with Zoom.
- April 24
- Isabella Femia, Ruth Cohen, and Scott Shaffer - Recidivism rates for felons in NY
- April 27
- Ben Czech and Drew Grier - Logistic regression models for MLB
- Jake Ritz and Zach Grumbach - Predicting voting intent
- Meg Ellingwood - Game outcomes in the NHL
- April 29
- Hans Schwarz and Jimmy Lane - COVID-19 time series analysis
- Mayo Amorello and Michael Morgan - Economic Indicators
- Abby McCarty and Elise Hockanson - Gender Wage Gap: A time series analysis
- May 1
- Alex Felleson and Rachel Contri - UK car accident trends
- Hayden Toftner - Phone usage time series models
- Grace Finn, Mike Gleason, and Jack Seasholtz - Restaurant Sales
- May 4 - individual meetings about final project and overall class standing
- May 6 - individual meetings about final project
- Jorge Dumenigo and Ken Wu - Time series models for stocks
- Mo Elhabashy - Time series models
- May 8 - individual meetings about final project
- May 15 before 11:30 AM - Final projects are due.
Homework Assignments
- Your solutions must be submitted electronically to your Google Drive folder. You must submit a PDF of your solutions. For example, the name of the file for the first homework assignment should be HW1-yourname.PDF
- HW #1 - due on Wednesday, January 22
- HW #2 - due on Wednesday, January 29
- HW #3 - due on Wednesday, February 5
- HW #4 - due on Wednesday, February 12
- Unit A project proposal - due on Wednesday, February 19 - Your poster should compare and contrast at least three possible models for a response variable of interest. After you describe the problem of interest and your data, you should use the choose, fit, assess, and use steps to make appropriate inferences and comparisons. Your analysis should include at least one formal inference (hypothesis test) and appropriate confidence intervals or prediction intervals. If you have any questions, please don't hesitate to contact me.
- HW #5 - due on Wednesday, February 26
- HW #6 - due on Wednesday, April 1
- Unit B project proposal - due on Monday, April 6 - Your group presentation should compare and contrast ANOVA or ANCOVA models we have discussed in class, applied to a real data set or to data from an experiment of interest to you. After you describe the problem of interest and your data, you should use the choose, fit, assess, and use steps to make appropriate inferences and comparisons. Your analysis should include formal inferences (hypothesis tests) and appropriate multiple comparisons procedures. If you have any questions, please don't hesitate to contact me. Please send me your proposals before 9:00 (Eastern time) and include the names of the people in your group and a preferred date for your presentation between April 8 and April 20.
- HW #7 - due on Wednesday, April 8
- HW #8 - due on Wednesday, April 15
- Unit C or D project proposal - due on or before Monday, April 20 - Your group presentation should apply logistic regression models or time series models to a real data set of interest to you. After you describe the problem of interest and your data, you should use the choose, fit, assess, and use steps to make appropriate inferences and predictions. Your analysis should include formal inferences (hypothesis tests and confidence intervals). I would be happy to help as you move forward with this last group project of the semester. Please send me your proposals before 9:00 (Eastern time) and include the names of the people in your group and a preferred date for your presentation between April 22 and May 1.
Sample Exams
- See p:\data\math\hartlaub\dataanalysis
- The files have also been copied to our Google Drive folder.
Interesting Links
Data Sources