# Plotting Career Trajectories

I have long been fascinated by players’ career batting and pitching trajectories, and Chapter 8 of our book discusses plotting and modeling them. This post describes a function that plots the career trajectory of a hitting rate for any player in MLB history.

The data frame ` Batting ` in the ` Lahman ` package contains the season batting data. There is some setup to get the data in a useful format.

• The ` Batting ` data frame has a separate row of hitting statistics for each team a player appeared with in a given season. I use the ` summarise ` function in the new ` dplyr ` package to collapse over the ` stint ` variable, so each player has one row per season. (By the way, ` dplyr ` is much faster than ` plyr ` for this type of operation.)
• Using the ` merge ` function, I add last name, first name, first year, last year, and birthyear variables to the ` Batting ` data frame.
• A new plate appearances variable is defined. Before I do this, I convert missing values of SF and SH to zero so that they do not make the sum missing.
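Here is a minimal sketch of these three setup steps. It is not the code from my gist: the rows below are made up (they only mimic the shape of the ` Batting ` data frame and a player table), the names `season_totals` and `players` are placeholders, and I am assuming that plate appearances are computed as AB + BB + HBP + SF + SH and that age is simply season minus birth year.

```r
library(dplyr)

# Made-up rows in the shape of Lahman's Batting table (not real statistics).
# The player has two stints in 2001, as if he changed teams mid-season.
batting <- data.frame(
  playerID = c("smithjo01", "smithjo01", "smithjo01"),
  yearID   = c(2001, 2001, 2002),
  stint    = c(1, 2, 1),
  AB  = c(200, 150, 500),
  HR  = c(10, 5, 30),
  BB  = c(20, 15, 60),
  HBP = c(2, 1, 4),
  SF  = c(NA, 1, 3),
  SH  = c(0, NA, 1),
  stringsAsFactors = FALSE
)

# Made-up player table holding the name and birth-year fields to attach
players <- data.frame(
  playerID  = "smithjo01",
  nameFirst = "John",
  nameLast  = "Smith",
  birthYear = 1975,
  stringsAsFactors = FALSE
)

season_totals <- batting %>%
  # Treat missing SF and SH as zero before summing
  mutate(SF = ifelse(is.na(SF), 0, SF),
         SH = ifelse(is.na(SH), 0, SH)) %>%
  # Collapse over stint: one row per player and season
  group_by(playerID, yearID) %>%
  summarise(AB = sum(AB), HR = sum(HR), BB = sum(BB),
            HBP = sum(HBP), SF = sum(SF), SH = sum(SH),
            .groups = "drop") %>%
  # Assumed definition of plate appearances
  mutate(PA = AB + BB + HBP + SF + SH)

# Attach names and birth year with merge(), then compute a simple age
season_totals <- merge(season_totals, players, by = "playerID")
season_totals$Age <- season_totals$yearID - season_totals$birthYear

print(season_totals)
```

With the real data, `batting` would be the ` Batting ` data frame from ` Lahman ` and `players` would be its table of player names and birth years.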

Now I’m ready to write a ` plot.trajectory ` function. There will be four inputs:

• The name of the batter in quotes. (One can choose any batter in the Lahman database.)
• The numerator of the rate statistic we want to graph. For example, if we want to plot home run rates, then this numerator would be “HR”.
• The denominator of the rate statistic (typically “AB” or “PA”).
• In cases where there are multiple players with the same name like Ken Griffey or Tony Gwynn, the input ` num ` gives the number of the player that you are interested in. (For example, if you want to plot the career trajectory of Junior Griffey, use num = 2.)
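To make the role of ` num ` concrete, here is one way such a lookup could work. This is only a sketch: `find_player` is a hypothetical helper, not part of my function, the player rows are typed by hand for illustration, and I am assuming that players who share a name are ordered by birth year, oldest first.

```r
# Hand-typed rows for illustration; a real lookup would use Lahman's player table
players <- data.frame(
  playerID  = c("griffke02", "griffke01", "schmimi01"),
  nameFirst = c("Ken", "Ken", "Mike"),
  nameLast  = c("Griffey", "Griffey", "Schmidt"),
  birthYear = c(1969, 1950, 1949),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the playerID of the num-th player with this name.
# Assumption: same-named players are ordered by birth year, oldest first.
find_player <- function(players, name, num = 1) {
  full_name <- paste(players$nameFirst, players$nameLast)
  matches <- players[full_name == name, ]
  matches <- matches[order(matches$birthYear), ]
  if (nrow(matches) == 0) stop("No player named ", name)
  if (num > nrow(matches)) stop("Only ", nrow(matches), " player(s) named ", name)
  matches$playerID[num]
}

senior <- find_player(players, "Ken Griffey")           # first match: the father
junior <- find_player(players, "Ken Griffey", num = 2)  # second match: the son
print(c(senior, junior))
```

Under this ordering the default `num = 1` picks the older player, which matches the advice to use num = 2 for Junior Griffey.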

I use the ` ggplot2 ` package to construct the plot and use a loess smoother to show the general pattern of the career trajectory.
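The plotting step might look roughly like the sketch below. It is a simplified stand-in rather than the function itself: `plot_rate` is a hypothetical name, the seasons are made-up numbers for an imaginary player, and I am assuming the rate is plotted against age.

```r
library(ggplot2)

# Made-up seasons for one imaginary player (not real statistics)
seasons <- data.frame(
  Age = 22:36,
  HR  = c(5, 12, 18, 24, 28, 33, 36, 38, 35, 31, 27, 22, 17, 11, 6),
  AB  = c(300, 450, 520, 560, 580, 590, 600, 595, 580, 560, 540, 500, 450, 380, 250)
)

# Hypothetical helper: plot numerator / denominator against age
# and overlay a loess smooth to show the general career pattern
plot_rate <- function(d, numerator, denominator) {
  d$Rate <- d[[numerator]] / d[[denominator]]
  ggplot(d, aes(x = Age, y = Rate)) +
    geom_point() +
    geom_smooth(method = "loess", formula = y ~ x, se = FALSE) +
    labs(y = paste(numerator, "/", denominator))
}

p <- plot_rate(seasons, "HR", "AB")
print(p)
```

Passing the column names as strings is what lets the same function graph any rate, such as "SO" over "AB" or "HBP" over "PA".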

Here is the code. First install the packages ` Lahman `, ` dplyr `, ` devtools `, and ` ggplot2 `. Then you can read in the setup code and the function by typing:

```
library(devtools)
source_gist(9043429)
```

Let me illustrate using this function to plot some trajectories. Mike Schmidt is one of my baseball heroes, and I can graph his home run trajectory by typing:

```
plot.trajectory("Mike Schmidt", "HR", "AB")
```

Clearly, Schmidt peaked in home run hitting about age 30 (that’s when the Phillies won their first World Series).

Suppose instead we look at Schmidt’s strikeout trajectory.

```
plot.trajectory("Mike Schmidt", "SO", "AB")
```

I believe Schmidt shortened his swing later in his career, which led to a decrease in his strikeout rate.

Ron Hunt is well-known as the “hit by pitch king”. How does Hunt’s HBP rate change over his career?

```
plot.trajectory("Ron Hunt", "HBP", "PA")
```

I believe there is some ability aspect of getting hit by a pitch (it isn’t just luck driven), and Hunt seemed to peak in this ability around age 30.

Anyway, it is fun to explore these trajectories for your favorite players. I also converted this function to a Shiny application that you can see by typing:

```
library(shiny)
shiny::runGist('9053425')
```