- January 16
- January 23
- January 30
- February 6
- February 13 - work with a partner on time series models
- February 20
- February 27 - Midterm project updates (approximately 10 min. per person)
- Please post your article, data set, R code, and supporting materials in a folder under Google Drive\!SportsAnalytics-S2018\!Midterm Projects
- NCAA selection projects are due before the show begins on March 11
- March 20
- Q&A Session with Dennis Lock, Director of Analytics, Miami Dolphins (Meet in Bailey House Conference Room)
You may use R scripts or R notebooks for your analysis, but your position paper should be in the form of a one-page executive summary. You should appropriately cite any sources that you use for data or to support your rationale. The file names of all articles and position papers should start with your name, followed by the title. For example, hartlaub-CTE.pdf or hartlaub-PP1.pdf.
- #1 Due on Tuesday, January 23
- In a tweet, President Trump claimed that "NFL attendance and ratings are WAY DOWN." For your first position paper, collect and analyze data to address the claim by others that the decreases are simply due to chance. Is the decrease simply chance variation, or is it statistically significant?
- Post at least one article on CTE, concussions, or head injuries in sports for all of us to read for class next Tuesday. Your pdf version of the article should be copied into a folder on Google Drive.
- Sunday Swoon?
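One possible way to frame the "chance variation" question in position paper #1 is a permutation test on per-game attendance from two seasons. The sketch below is only an illustration: the attendance values are simulated placeholders (not real NFL figures), and the variable names and the choice of a permutation test are assumptions rather than a required method.

```r
# Placeholder data: simulated per-game attendance for two seasons.
# These are NOT real NFL figures; replace them with the data you collect.
set.seed(2018)
attendance_2016 <- rnorm(256, mean = 69000, sd = 6000)
attendance_2017 <- rnorm(256, mean = 68000, sd = 6000)

# Observed change in mean attendance (negative means a decrease).
observed_diff <- mean(attendance_2017) - mean(attendance_2016)

# Permutation test: shuffle the season labels many times to see how
# large a drop chance alone would produce.
pooled <- c(attendance_2016, attendance_2017)
n_2017 <- length(attendance_2017)
perm_diffs <- replicate(5000, {
  idx <- sample(length(pooled), n_2017)
  mean(pooled[idx]) - mean(pooled[-idx])
})

# One-sided p-value: share of shuffles with a drop at least as large
# as the one observed.
p_value <- mean(perm_diffs <= observed_diff)

# Cross-check with a one-sided Welch two-sample t-test.
t_check <- t.test(attendance_2017, attendance_2016, alternative = "less")
```

A small p-value would suggest the drop is larger than label shuffling typically produces. Whether per-game attendance is the right unit of analysis is a judgment call that your executive summary should defend.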
- #2 Due on Tuesday, January 30
- Who is the best? Trying to decide the best team or athlete leads to interesting debates and often controversies. However, there are interesting methods for comparing teams and players from different time periods. For example, FiveThirtyEight published an article titled "Vegas Has The Best Expansion Team in the History of Pro Sports, and It's Not Close" that clearly illustrated how z-scores can be helpful. Make a data-driven decision regarding the best swimmer: option 1 is Janet Evans versus Katie Ledecky and option 2 is Michael Phelps versus Mark Spitz.
- Post at least one article on how the Elo rating system has been used to rate players, teams, or leagues into the Google Drive folder Elo. Your pdf version of the article should be posted before the end of the day on Sunday, January 28.
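For position paper #2, the z-score idea can be sketched as follows: standardize each athlete's result against the field he or she actually competed against, then compare the standardized values across eras. The function name `era_z` and the times below are hypothetical placeholders, not real swimming results.

```r
# z-score of one athlete's result relative to the field they competed against.
era_z <- function(athlete_value, field_values) {
  (athlete_value - mean(field_values)) / sd(field_values)
}

# Made-up final times in seconds for two eras (placeholders, not real results).
field_era_a <- c(245.2, 247.9, 248.3, 249.0, 249.6, 250.1, 250.8, 251.4)
field_era_b <- c(236.5, 241.3, 241.9, 242.2, 242.8, 243.0, 243.5, 244.1)

# Standardize each winner against her or his own field.
z_a <- era_z(min(field_era_a), field_era_a)
z_b <- era_z(min(field_era_b), field_era_b)

# In swimming a lower time is better, so the more negative z-score
# marks the winner who was further ahead of the competition.
c(era_a = z_a, era_b = z_b)
```

Which field to standardize against (finalists, all entrants, the top times of the year) is an assumption that can change the answer, so state your choice in the summary.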
- #3 Due on Tuesday, February 6
- Has the gender gap narrowed over the last two decades? This week we will dive into the controversial area of gender differences in performance and pay. As the Olympics are about to begin, estimate the improvement in times for male versus female skaters, skiers, bobsledders, etc. Use your model to predict the winning time for at least one event in the 2018 Winter Olympics. Is the rate of improvement the same for men and women? Expanding beyond Olympic competition, are there sports where you believe that women will outperform men? Collect appropriate data to make data-driven decisions for sports of your choice. Provide your R code and 1-page executive summary in the folder !Position Paper#3.
- Post at least one article on wage, performance, or rating differences for athletes or coaches into the Google Drive folder. Your pdf version of the article should be posted before the end of the day on Sunday, February 4 so that we can all read the articles before class.
- What if men and women skied against each other in the Olympics?
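For position paper #3, one way to ask whether the rate of improvement is the same for men and women is a linear model with a year-by-sex interaction. The sketch below uses simulated winning times (placeholders, not real Olympic results), and the column names `year`, `sex`, and `time` are assumptions.

```r
set.seed(1)
years <- seq(1998, 2014, by = 4)  # Winter Games over the last two decades
results <- data.frame(
  year = rep(years, times = 2),
  sex  = factor(rep(c("F", "M"), each = length(years)))
)

# Simulated winning times in seconds (placeholders, NOT real Olympic results).
results$time <- ifelse(results$sex == "F", 120, 110) -
  0.15 * (results$year - 1998) + rnorm(nrow(results), sd = 0.4)

# year * sex expands to year + sex + year:sex, so each sex gets its own slope.
fit <- lm(time ~ year * sex, data = results)

# The year:sexM row estimates how much the men's slope differs from the women's.
coef_table <- summary(fit)$coefficients

# Predicted winning times for 2018, with 95% prediction intervals.
new_games <- data.frame(
  year = 2018,
  sex  = factor(c("F", "M"), levels = levels(results$sex))
)
pred_2018 <- predict(fit, newdata = new_games, interval = "prediction")
```

With only a handful of Games per event the intervals will be wide, and a straight line is itself an assumption worth checking with a residual plot.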
- #4 Due on Tuesday, February 13
- This week we begin looking at performance over time. In particular, I would like you to see if you can find evidence to support or refute the claim that player and team performance tends to follow a quadratic trend over time. For example, some experts believe that hitters will improve until they hit peak performance (hopefully near the end of a long career) and then gradually fall back to where they started. When considering particular players, you need to make sure that the career lengths are long enough to provide reasonable estimates of a career profile. Are the quadratic models better in some sports than others? Your analysis should include player profiles from at least two different sports. Post your R code and a 1-page executive summary to !Position Papers>Position Paper #4.
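For position paper #4, a quadratic career profile can be fit with `lm()` and compared against a straight line. The sketch below is a minimal illustration on a simulated career for a hypothetical player. The values are placeholders, and the column names `age` and `stat` are assumptions.

```r
set.seed(4)
# Simulated season-by-season performance for one hypothetical player
# (placeholder values, not real statistics).
career <- data.frame(age = 21:38)
career$stat <- 0.300 - 0.0008 * (career$age - 29)^2 +
  rnorm(nrow(career), sd = 0.008)

linear_fit    <- lm(stat ~ age, data = career)
quadratic_fit <- lm(stat ~ age + I(age^2), data = career)

# Nested-model F-test: does the squared term improve the fit?
model_comparison <- anova(linear_fit, quadratic_fit)

# For a fitted curve b0 + b1 * age + b2 * age^2 with b2 < 0,
# the peak occurs at age = -b1 / (2 * b2).
b <- coef(quadratic_fit)
peak_age <- unname(-b["age"] / (2 * b["I(age^2)"]))
```

Comparing adjusted R-squared or the F-test across players and sports gives one way to judge where the quadratic shape holds up. A symmetric parabola also forces the decline to mirror the rise, which is an assumption you may want to question.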
- #5 Due on Tuesday, February 20
- Can ARIMA time series models be used to explain the variability in player or team performance from game to game and to forecast performance? For example, look at the points scored by your favorite NBA player over time. Create and comment on ACF plots. Does an autoregressive model provide a good fit? Does a moving average model provide a good fit? Is the player's performance stationary over time or does it drift over time? Now consider other performance statistics and other players, perhaps in different sports. Find a player and performance statistic where an autoregressive model provides a good fit and use the model to forecast performance. Find a different player and performance statistic where a moving average model provides a good fit and use your model to make forecasts. Ideally you will pick settings where you can "check your model" by seeing if the actual value falls into your prediction interval. (For this assignment only, you may switch over to Minitab if you wish. I'm curious to hear your thoughts about comparing and contrasting Minitab and R for time series.)
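For position paper #5, the workflow of ACF, model fit, forecast, and interval check can be sketched with base R's `stats` functions. The series below is simulated for a hypothetical player (placeholder data, not real box scores), and holding out the final game is just one way to "check your model."

```r
set.seed(5)
# Simulated points per game for a hypothetical player: AR(1) noise
# around 25 points (placeholder data, not real box scores).
points <- 25 + as.numeric(arima.sim(model = list(ar = 0.5), n = 83, sd = 5))

train  <- points[1:82]   # games used to fit the models
actual <- points[83]     # held-out game used to check the forecast

# Sample autocorrelations; call acf(train) instead to draw the plot.
acf_values <- acf(train, plot = FALSE)

# Candidate models: AR(1) and MA(1), both with a mean term.
ar_fit <- arima(train, order = c(1, 0, 0))
ma_fit <- arima(train, order = c(0, 0, 1))
aic_values <- c(AR1 = AIC(ar_fit), MA1 = AIC(ma_fit))  # lower is better

# One-step-ahead forecast with an approximate 95% prediction interval.
fc <- predict(ar_fit, n.ahead = 1)
interval <- as.numeric(fc$pred) + c(-1.96, 1.96) * as.numeric(fc$se)
covered  <- actual >= interval[1] && actual <= interval[2]
```

If the ACF decays very slowly or the series drifts, the middle entry of `order` (the differencing term) can be set to 1 to model the changes from game to game instead of the raw values.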