# 9/26/2022

## Chapter 7 - Iteration - see Iteration.R

• Our goal is to automate iterative operations
• For example, which player had the most hits for each team in MLB?
• Which pitcher had the most strikeouts for each team in MLB?
• Which student has the highest GPA for each team/club at Kenyon?
• Vectorized operations
• Many people use for() loops to iterate calculations.
• However, R is highly optimized for vector operations, and loops do not take advantage of this optimization.
• Try to avoid writing for() loops whenever possible with R.
• Many functions, e.g., exp(), will be applied to every element of a vector by default.
• Note that summary functions, like mean(), return only a single value.
• Not every function is vectorized, so check how a function behaves and use the vectorized form whenever one exists!
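
A minimal sketch of the points above, using a small made-up vector `x` (not taken from Iteration.R). It compares a for() loop with the vectorized call to exp(), and shows that mean() collapses the vector to a single value.

```r
# Illustrative input vector (hypothetical, not from the course script)
x <- c(0, 1, 2)

# Loop version: fill a preallocated vector one element at a time
out_loop <- numeric(length(x))
for (i in seq_along(x)) {
  out_loop[i] <- exp(x[i])
}

# Vectorized version: exp() is applied to every element by default
out_vec <- exp(x)

# A summary function returns a single value, not one value per element
avg <- mean(x)
```

Both versions give the same result, but the vectorized call is shorter and lets R do the iteration internally.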
• Using across() with dplyr functions can be extremely helpful
• The function across() applies operations programmatically.
• Using across() helps us avoid the "magic numbers" and hard-coded index references that are often used in loops.
• The across() function provides an easy way to perform an operation on a set of variables without having to type or copy-and-paste the name of each variable.
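
A short sketch of across(), assuming the built-in `mtcars` data set as a stand-in for the course data. The columns chosen here are only for illustration.

```r
library(dplyr)

# Average a chosen set of columns without repeating mean() for each one
means_some <- mtcars %>%
  summarize(across(c(mpg, hp, wt), mean))

# Select columns by a property instead of by name or position
means_all <- mtcars %>%
  summarize(across(where(is.numeric), mean))
```

Selecting with `where(is.numeric)` avoids referring to columns by position, so the code keeps working if columns are added or reordered.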
• The map() family of functions
• Use map() to apply a function to each item in a list or vector or the columns of a data frame.
• map() is the main function from the purrr package.
• The map() function will always return a list.
• The map_dbl() function forces the output to be a vector of type double.
• The map_int() function forces the output to be a vector of integers.
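
The difference between the variants can be sketched on a small made-up list (the list `x` is hypothetical, chosen only to make the return types easy to see).

```r
library(purrr)

# Illustrative input: a named list of integer vectors
x <- list(a = 1:3, b = 4:6)

as_list <- map(x, mean)        # always a list, one element per input
as_dbl  <- map_dbl(x, mean)    # a double vector
as_int  <- map_int(x, length)  # an integer vector

# map() also iterates over the columns of a data frame
col_means <- map_dbl(mtcars, mean)
```

The typed variants raise an error if the function returns something of the wrong type or length, which catches mistakes early.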
• Iteration over a one dimensional vector
• Iterating a known function
• Example - the map_int() function can be used with nchar() to find the length of all names for the Angels in MLB - see Iterate.R
• The nchar() function can also be used directly because it is vectorized
• Iterating an arbitrary function
• You can apply any function, including user defined functions that you create yourself.
• Example - Top-5 seasons for MLB teams - see Iterate.R
• map_dfr() provides another way to collect the results: it row-binds the output of each call into a single data frame.
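
A sketch of both cases, assuming a hand-typed vector of Angels franchise names rather than the data used in the course script. The helper `describe_name()` is hypothetical and only stands in for "any user-defined function that returns a data frame".

```r
library(purrr)

# Hand-typed names for illustration (the course script pulls these from data)
team_names <- c(
  "Los Angeles Angels",
  "California Angels",
  "Anaheim Angels",
  "Los Angeles Angels of Anaheim"
)

# Iterating a known function: one integer per name
lengths_map <- map_int(team_names, nchar)

# nchar() is vectorized, so map_int() is not strictly needed here
lengths_vec <- nchar(team_names)

# Iterating an arbitrary function (hypothetical helper):
# returns a one-row data frame describing a single name
describe_name <- function(name) {
  data.frame(
    name    = name,
    n_chars = nchar(name),
    n_words = length(strsplit(name, " ")[[1]])
  )
}

# map_dfr() row-binds the one-row results into one data frame
# (it needs dplyr installed; newer purrr versions mark it as superseded)
name_table <- map_dfr(team_names, describe_name)
```

When the function is already vectorized, calling it directly is simpler; map() earns its keep when the function only handles one item at a time.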
• Iteration over subgroups
• The group_modify() function in dplyr allows you to apply an arbitrary function that returns a data frame to the groups of a data frame.
• You can use the group_by() function to define a grouping.
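
A minimal sketch of iteration over subgroups, again assuming the built-in `mtcars` data in place of the MLB data. The helper `top_by_mpg()` is hypothetical. group_modify() calls it once per group, passing the group's rows and a one-row data frame holding the group key.

```r
library(dplyr)

# Hypothetical helper: keep the n rows with the highest mpg in one group
# df  = the rows of the current group
# key = one-row data frame identifying the group (unused here)
top_by_mpg <- function(df, key, n = 2) {
  df %>%
    arrange(desc(mpg)) %>%
    head(n)
}

# Define the grouping, then apply the helper to each group
top_cars <- mtcars %>%
  group_by(cyl) %>%
  group_modify(top_by_mpg)
```

The same pattern would answer the opening questions, for example grouping by team and returning the player with the most hits in each group.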
• Simulation
• Using distributions to understand randomness
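
One way this might look in code, as a sketch under assumed settings (1000 repetitions, samples of size 25 from a standard normal, and an arbitrary seed). None of these choices come from the course script.

```r
library(purrr)

set.seed(2022)  # arbitrary seed so the simulation is reproducible

# Repeat the same random experiment many times:
# draw 25 standard normal values and record their mean
sample_means <- map_dbl(1:1000, ~ mean(rnorm(25)))

# The spread of the simulated means shows how much a sample mean
# varies by chance; theory says it should be close to 1 / sqrt(25)
spread <- sd(sample_means)
```

Plotting `sample_means` as a histogram is a natural next step for seeing the shape of the sampling distribution.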