I have been fascinated by players’ career batting and pitching trajectories over the years, and Chapter 8 of our book talks about plotting and modeling these trajectories. This post describes a useful function to plot the career trajectory of a hitting rate for any player in MLB history.
The data frame Batting in the Lahman package contains the season batting data. There is some setup to get the data in a useful format.
- The Batting data frame has separate hitting statistics for each team for a player in a given season. I use the summarise function in the new dplyr package to collapse over the stint variable. (By the way, dplyr is much faster than plyr for this type of operation.)
- Using the merge function, I add last name, first name, first year, last year, and birth year variables to the data frame.
- A new plate appearances variable is defined. Before I do this, I convert missing values of SF and SH to zero.
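The three setup steps above can be sketched roughly as follows. This is a minimal illustration and not the actual setup code: it uses small made-up data frames (batting and players) in place of the Lahman tables, borrows Lahman-style column names, and assumes the usual definition PA = AB + BB + HBP + SF + SH. It also zeroes the missing SF and SH values before summing over stints, so that a missing value in one stint does not wipe out the season total.

```r
library(dplyr)

# Toy stand-in for the Batting table: one row per player, season, and stint.
# All values are made up for illustration.
batting <- data.frame(
  playerID = c("aaa01", "aaa01", "aaa01", "bbb01"),
  yearID   = c(1950, 1950, 1951, 1950),
  stint    = c(1, 2, 1, 1),
  AB       = c(200, 150, 500, 400),
  HR       = c(5, 4, 20, 10),
  BB       = c(20, 10, 60, 30),
  HBP      = c(1, 0, 3, 2),
  SF       = c(NA, NA, NA, 4),
  SH       = c(2, NA, 1, 0),
  stringsAsFactors = FALSE
)

# Toy stand-in for the player information table (hypothetical players).
players <- data.frame(
  playerID  = c("aaa01", "bbb01"),
  nameFirst = c("Al", "Bo"),
  nameLast  = c("Alpha", "Beta"),
  birthYear = c(1925, 1928),
  stringsAsFactors = FALSE
)

# Treat missing SF and SH as zero.
batting$SF[is.na(batting$SF)] <- 0
batting$SH[is.na(batting$SH)] <- 0

# Collapse over stint: one row per player and season.
season <- batting %>%
  group_by(playerID, yearID) %>%
  summarise(AB = sum(AB), HR = sum(HR), BB = sum(BB),
            HBP = sum(HBP), SF = sum(SF), SH = sum(SH),
            .groups = "drop") %>%
  as.data.frame()

# First and last season for each player.
career_span <- season %>%
  group_by(playerID) %>%
  summarise(firstYear = min(yearID), lastYear = max(yearID),
            .groups = "drop") %>%
  as.data.frame()

# Add names, birth year, and career span with merge.
season <- merge(season, players, by = "playerID")
season <- merge(season, career_span, by = "playerID")

# Plate appearances (assumed definition).
season$PA <- with(season, AB + BB + HBP + SF + SH)
```

With the real data, the toy data frames would be replaced by the corresponding Lahman tables.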
Now I’m ready to write a plot.trajectory function. There will be four inputs:
- The name of the batter in quotes. (One can choose any batter in the Lahman database.)
- The numerator of the rate statistic we want to graph. For example, if we want to plot home run rates, this numerator would be “HR”.
- The denominator of the rate statistic (typically “AB” or “PA”).
- In cases where there are multiple players with the same name, like Ken Griffey or Tony Gwynn, the input num gives the number of the player that you are interested in. (For example, if you want to plot the career trajectory of Junior Griffey, use num = 2.)
I use the ggplot2 package to construct the plot and use a loess smoother to show the general pattern of the career trajectory.
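To make the idea concrete, here is a rough sketch of how a function with these four inputs might be put together. It is not the actual plot.trajectory code: the name plot_trajectory_sketch, the extra data argument, the choice to plot the rate against age, and the rule that same-named players are numbered by first season are all assumptions made for illustration. The toy career at the end is made up.

```r
library(ggplot2)

# Hypothetical sketch: plot a rate statistic (num_var / den_var) by age
# for one player, with a loess smooth over the season values.
plot_trajectory_sketch <- function(data, name, num_var, den_var, num = 1) {
  # Rows whose full name matches the requested player.
  match_rows <- paste(data$nameFirst, data$nameLast) == name
  if (!any(match_rows)) stop("No player named ", name)

  # Order same-named players by first season (an assumption).
  first_year <- tapply(data$yearID[match_rows],
                       data$playerID[match_rows], min)
  ids <- names(sort(first_year))
  if (num > length(ids)) {
    stop("Only ", length(ids), " player(s) named ", name)
  }

  d <- data[data$playerID == ids[num], ]
  d$Age  <- d$yearID - d$birthYear
  d$Rate <- d[[num_var]] / d[[den_var]]

  ggplot(d, aes(x = Age, y = Rate)) +
    geom_point() +
    geom_smooth(method = "loess", formula = y ~ x, se = FALSE) +
    labs(title = name, y = paste(num_var, "/", den_var))
}

# Made-up 16-season career that rises and then declines.
ages <- 23:38
toy <- data.frame(
  playerID  = "toypl01",
  nameFirst = "Toy",
  nameLast  = "Player",
  birthYear = 1950,
  yearID    = 1950 + ages,
  AB        = 500,
  HR        = round(35 - 0.3 * (ages - 30)^2),
  stringsAsFactors = FALSE
)

p <- plot_trajectory_sketch(toy, "Toy Player", "HR", "AB")
# print(p) draws the points and the smoothed trajectory.
```

Passing the column names as strings keeps the call close to the style used in this post, where the numerator and denominator are given in quotes.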
Here is the code. First install the required packages, including devtools and ggplot2. Then you can read in the setup code and the function by typing:
Let me illustrate using this function to plot some trajectories. Mike Schmidt is one of my baseball heroes. I can graph his home run trajectory by typing:
plot.trajectory("Mike Schmidt", "HR", "AB")
Suppose instead we look at Schmidt’s strikeout trajectory.
plot.trajectory("Mike Schmidt", "SO", "AB")
Ron Hunt is well-known as the “hit by pitch king”. How does Hunt’s HBP rate change over his career?
plot.trajectory("Ron Hunt", "HBP", "PA")
Anyway, it is fun to explore these trajectories for your favorite players. I also converted this function to a Shiny application that you can see by typing: