Statistical Computing Activity #8
- Use PageRank, discussed in Chapter 20 of MDSR and presented by Claire and Kaitlyn, on a data set of your choice. One possibility is to rank the teams in your favorite sport by win-loss record and scores, as Claire and Kaitlyn did for A10 teams in the NCAA during 1996. (ESPN.com, NFL.com, MLB.com, NHL.com, NBA.com and many other sites have team summaries that may be helpful.) Note: you do NOT have to use a sports example. This idea can be used to rank stocks, airports, state performance, etc. In addition to providing the PageRank of each unit, use R to graph the network and vary the thickness or transparency of each edge according to an appropriate characteristic.
- Apply the regression tree technique that Daniel discussed to a data set of interest to you. Your response variable should be quantitative, and you should have at least three potential predictor variables in your data frame. (See Chapter 8 of ISL (An Introduction to Statistical Learning) or https://en.wikipedia.org/wiki/Decision_tree_learning)
- Challenge (optional for extra credit) - Use the methods described in Section 20.2 to build a Hollywood network for movies of interest to you. You may use appropriate filters (decade, genre, etc.) to limit your database. Have fun!
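
As a possible starting point for the PageRank item above, here is a minimal sketch in base R. It is not the method from MDSR or from the class presentation. The `games` data frame, the team names, the choice of point margin as the edge weight, and the helper `pagerank()` are all hypothetical and made up for illustration. Each game is treated as a directed edge from the loser to the winner, and the ranks are computed by plain power iteration with a damping factor of 0.85 (an assumed value).

```r
# Hypothetical game results (illustration only): one row per game.
games <- data.frame(
  winner = c("A", "A", "B", "C", "A"),
  loser  = c("B", "C", "C", "A", "D"),
  margin = c(7, 3, 10, 1, 14),
  stringsAsFactors = FALSE
)

teams <- sort(unique(c(games$winner, games$loser)))
n <- length(teams)

# Weighted adjacency matrix: W[i, j] is the total margin by which
# team j beat team i, so each edge points from loser to winner.
W <- matrix(0, n, n, dimnames = list(teams, teams))
for (k in seq_len(nrow(games))) {
  l <- games$loser[k]
  w <- games$winner[k]
  W[l, w] <- W[l, w] + games$margin[k]
}

# Hypothetical helper: PageRank by power iteration.
pagerank <- function(W, damping = 0.85, tol = 1e-10, max_iter = 1000) {
  n <- nrow(W)
  out <- rowSums(W)
  # Row-normalize the weights into transition probabilities.
  P <- W / ifelse(out > 0, out, 1)
  # A unit with no outgoing edges (e.g. an undefeated team) jumps uniformly.
  P[out == 0, ] <- 1 / n
  r <- rep(1 / n, n)
  for (i in seq_len(max_iter)) {
    r_new <- as.vector((1 - damping) / n + damping * (r %*% P))
    if (sum(abs(r_new - r)) < tol) break
    r <- r_new
  }
  setNames(r_new, rownames(W))
}

pr <- pagerank(W)
print(sort(pr, decreasing = TRUE))
```

The same weights stored in `W` could drive the edge thickness in the network plot. For example, the igraph package provides `page_rank()` for the ranking and an `edge.width` argument when plotting a graph.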