Data Analysis (Stat 206)
Brad Hartlaub
Spring 2021
R links
Math Science Skills Center
- Paige Bullock will be available to help you on Sunday and Thursday from 7:00 until 8:30 PM. Please see her email for detailed information about the Zoom link.
Daily Agendas
- February 1
- February 3
- February 5
- February 8
- February 10
- February 12
- February 15
- February 17
- February 19 - Quiz #1
- February 22
- February 24
- February 26
- March 1
- March 3
- March 5 - Quiz #2
- March 8
- March 10
- March 12 - Unit A Project Presentations
- Andy Kelleher - Prediction problems with crude oil spills
- Irina Beshentseva and Andrew Nguyen - Predicting voter turnout in Ohio
- Sebastian Brylka and Tomas Munoz Reyes - Predicting soccer goals
- March 15 - Unit A Project Presentations
- Minh Pham and Dev Akre - Predicting MSRP of hybrid vehicles
- Colin Bowling, Takashi Kanazawa, and Cem Tener - Predicting monthly rent prices in Manhattan
- March 17 - Unit A Project Presentations
- Tara Ford and Mary Hester - Predicting hitting percentage for women's volleyball players
- Will Mohrmann - Predicting points per game in European soccer leagues
- Lucas Lu - Predicting the number of ratings for Apple iOS mobile applications
- March 19 - Unit A Project Presentations
- Maggie Bradley and Natalie Wilson - Predicting the price of best-selling books
- James Butler - Predicting finishing position in the Premier League
- March 22 - Unit A Project Presentations
- Luke Kim - Predicting inequality using the Gini coefficient
- Junaid Yeasir Fahim and Harshal Rukhaiyar - Predicting life expectancy
- Jose Nino and Abdul Hafeez - Predicting household income in the U.S.
- March 24
- March 26
- March 29 - Exam #1
- April 2
- April 5
- April 7
- April 9
- April 12 - Quiz #3 on Chapter 5 - you may use one 8.5"x11" sheet of notes, our course web page, and our RMarkdown files during the quiz
- April 14 - Start Unit C - Logistic Regression
- April 16 - Unit B Project Presentations
- Sebastian Brylka and Will Mohrmann - Comparing NBA positions
- Minh Pham and Tomas Munoz Reyes - Storage temperature and time effects on Satsuma
- April 19 - Quiz #4 on Chapter 6
- April 21 - Unit B Project Presentations
- Andy Kelleher - Olive Oil
- Colin Bowling and Takashi Kanazawa - Price of electricity
- Maggie Bradley and Natalie Wilson - Gym crowdedness at UC Berkeley
- April 23 - Unit B Project Presentations
- Irina Beshentseva and Andrew Nguyen - Psychodemographic user profiles
- Luke Kim and James Butler - Household income
- April 26 - Unit B Project Presentations
- Abdul Hafeez and Harshal Rukhaiyar - Comparing the price of cars
- Dev Akre - Comparing airfare for U.S. carriers
- Lucas Lu - Video game sales in different regions
- April 28
- Jose Nino - Excess death rates in the U.S.
- Mary Hester and Tara Ford
- April 30
- May 3 - Exam #2
- May 5 - Lab time for final projects / individual meetings
- May 7 - Lab time for final projects / individual meetings
- May 10 - Individual meetings to discuss overall percentage
- May 17 - Final Projects (including data files, R code, and a PDF version of your paper) must be uploaded to your Google Drive folder by 1:30 PM
Homework Assignments
- Your solutions must be submitted electronically to your Google Drive folder. You must submit a PDF of your solutions. For example, the name of the file for the first homework assignment should be HW1-yourname.PDF
- HW #1 - due on Wednesday, February 10
- HW #2 - due on Wednesday, February 17
- HW #3 - due on Friday, February 26
- HW #4 - due on Friday, March 5
- Unit A Project Proposal - due on Monday, March 8 - In your Unit A project, you should compare and contrast at least three possible models for a response variable of interest. After you describe the problem of interest and your data, you should use the choose, fit, assess, and use steps to make appropriate inferences and comparisons. Your analysis should include at least one formal inference (hypothesis test) and appropriate confidence intervals or prediction intervals. If you have any questions, please don't hesitate to contact me. Your proposal should be sent to me via email and include a short statement about your problem of interest, a description of your data, and a proposed date for your 10-12 minute presentation to the class. Presentations will begin on Friday, March 12.
- HW #5 - due on Monday, March 22
- HW #6 - due on Monday, April 5 (deadline extended to Wednesday, April 7)
- Unit B Project Proposal - due on Friday, April 9 - Your group presentation should apply ANOVA or ANCOVA models we have discussed in class to a real data set or to data from an experiment of interest to you, and compare and contrast the models you consider. After you describe the problem of interest and your data, you should use the choose, fit, assess, and use steps to make appropriate inferences and comparisons. Your analysis should include formal inferences (hypothesis tests) and appropriate multiple comparisons procedures. If you have any questions, please don't hesitate to contact me. Please send me your proposals via email and include the names of the people in your group and a preferred date for your presentation between April 12 and April 23.
- HW #7 - due on Monday, April 12 (extended to Wednesday, April 14)
- HW #8 - due on Wednesday, April 21
- HW #9 - suggested exercises (will not be collected)
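The choose, fit, assess, and use steps named in the project descriptions above can be sketched roughly as follows for a single simple linear model. This is only a minimal illustration in Python using the standard library (the course itself works in R). The data values and all variable names are made up for illustration and do not come from the course materials, and the critical value is hardcoded from a t table rather than computed.

```python
from math import sqrt

# Hypothetical data (made up for illustration): x = predictor, y = response.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
n = len(x)

# CHOOSE: a simple linear model, y = b0 + b1 * x + error.

# FIT: least-squares estimates of the intercept and slope.
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# ASSESS: R^2, residual standard error, and the t statistic for H0: slope = 0.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - y_bar) ** 2 for yi in y)
r2 = 1 - sse / sst
s = sqrt(sse / (n - 2))
t_slope = b1 / (s / sqrt(sxx))

# USE: 95% prediction interval for a new observation at x_new.
t_crit = 2.776  # 97.5th percentile of t with n - 2 = 4 df, from a t table
x_new = 7.0
y_hat = b0 + b1 * x_new
margin = t_crit * s * sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)
lower, upper = y_hat - margin, y_hat + margin

print(f"slope = {b1:.3f}, R^2 = {r2:.4f}, t = {t_slope:.1f}")
print(f"95% PI at x = {x_new}: ({lower:.2f}, {upper:.2f})")
```

A project that compares several candidate models would repeat the fit and assess steps for each one and weigh the results side by side before moving on to the use step.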
Sample Exams
- See the "!Exam and Quiz Samples" folder in our Google Drive folder.
Interesting Links
Data Sources