Ryan Howard’s Platoon Effects

Last week, I compared the home run career trajectories of Ryan Howard with 10 similar players. But of course, there is more to offensive production than hitting home runs, and the Phillies are concerned about Howard’s ability (actually, his lack of ability) to hit left-handed pitching. From looking at Retrosheet play-by-play data, what can we learn about Howard’s platoon effects over his career?

A good measure of hitting performance in a single plate appearance (PA) is the run value: the change in run potential (the expected runs in the remainder of the inning) from before the PA to after it, plus the number of runs scored on the play. We define a season platoon effect as (MEAN RUN VALUE AGAINST RIGHT-HANDED PITCHERS) - (MEAN RUN VALUE AGAINST LEFT-HANDED PITCHERS).
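
To make these two definitions concrete, here is a minimal sketch in R. The function names run_value and platoon_effect, and all of the numbers, are illustrative assumptions; they are not taken from Retrosheet or from the functions used in this post.

```r
# Run value of one plate appearance:
# (run potential after) - (run potential before) + (runs scored on the play).
run_value <- function(re_before, re_after, runs_scored) {
  re_after - re_before + runs_scored
}

# Season platoon effect: mean run value vs. righties minus mean run value
# vs. lefties. `values` holds run values, `hand` holds "R"/"L" pitcher codes.
platoon_effect <- function(values, hand) {
  mean(values[hand == "R"]) - mean(values[hand == "L"])
}

# Hypothetical example: a solo home run with the bases empty leaves the
# base-out state unchanged, so the run value is just the run that scored.
run_value(re_before = 0.5, re_after = 0.5, runs_scored = 1)   # returns 1

# Hypothetical run values for four PAs, two against each pitcher hand.
platoon_effect(values = c(0.25, 0.75, -0.25, 0.25),
               hand = c("R", "R", "L", "L"))                  # returns 0.5
```

A positive platoon effect means the hitter does better against right-handed pitching, which is the expected direction for a left-handed batter like Howard.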

Here’s the procedure for computing these platoon splits:

  1. We collect the relevant Retrosheet play-by-play data for the seasons 2005 through 2013. The R functions parse.retrosheet2.pbp.R and compute.runs.expectancy.R read in the files from Retrosheet.org and compute the run values for all plays. (I described the use of these functions in a previous post.)
  2. These play-by-play files are large, so I just collect the batting plays for Howard for these nine seasons. I save the plays in a csv file called Ryan.csv that I store as a public Google spreadsheet here. (I’d encourage you to look at this file if you are not familiar with Retrosheet play-by-play files. There are many interesting variables collected.)
  3. If you want to run my R code, just download this csv file (make sure to download the file in csv format) to the current working directory of R. We will assume this data file has been read into R as the data frame Ryan.
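
Step 3 might look something like the sketch below. The helper name read_ryan is hypothetical; the only things assumed about the file are that it is called Ryan.csv and that it contains the Season and PIT_HAND_CD columns used in the code that follows.

```r
# Read Howard's batting plays from a csv file (hypothetical helper).
# By default it looks for Ryan.csv in the current working directory.
read_ryan <- function(path = "Ryan.csv") {
  if (!file.exists(path)) {
    stop("Cannot find ", path, " (working directory: ", getwd(), ")")
  }
  read.csv(path, stringsAsFactors = FALSE)
}

# Only attempt the read if the file has actually been downloaded.
if (file.exists("Ryan.csv")) {
  Ryan <- read_ryan()
  # Quick sanity check: number of plays by season and pitcher hand.
  print(table(Ryan$Season, Ryan$PIT_HAND_CD))
}
```

If the table shows only "L" and "R" columns and one row per season from 2005 through 2013, the file was probably read correctly.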

Using the summarize function in the dplyr package, we compute the mean run value and the number of plate appearances for each pitcher hand in each season; the output is stored in the data frame Runs.Season.Pitcher. Likewise, we compute the same summary by season alone and store it in the data frame Runs.Season.

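library(dplyr)  # RUNS.VALUE and Mean.Runs are assumed names; PA matches the PA.x/PA.y output below
Runs.Season.Pitcher <- summarize(group_by(Ryan, Season, PIT_HAND_CD),
                                 Mean.Runs = mean(RUNS.VALUE), PA = n())
Runs.Season <- summarize(group_by(Ryan, Season),
                         Mean.Runs = mean(RUNS.VALUE), PA = n())
Runs.Season$PIT_HAND_CD <- "Overall"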

Using the ggplot2 package, we plot the mean run values against left-handed and right-handed pitchers for each season. The middle curve is the overall mean run value. As expected, we see a steady decrease in Howard’s run value. It is interesting that Howard did well against lefties in his breakout year (2006). Howard did poorly against lefties in 2012, but he also did not do well against right-handed pitching.
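
A plot of this kind could be drawn roughly as follows. This is only a sketch: the data frames Toy.Pitcher and Toy.Season are hypothetical stand-ins for Runs.Season.Pitcher and Runs.Season, the column name Mean.Runs is an assumption, and the values are invented for illustration rather than taken from Howard's record.

```r
library(ggplot2)

# Toy stand-ins for the two summary tables; the Mean.Runs values are
# invented for illustration and are NOT Howard's actual numbers.
Toy.Pitcher <- data.frame(
  Season      = rep(c(2011, 2012, 2013), each = 2),
  PIT_HAND_CD = rep(c("L", "R"), times = 3),
  Mean.Runs   = c(-0.02, 0.05, -0.06, 0.01, -0.05, 0.03)
)
Toy.Season <- data.frame(
  Season      = c(2011, 2012, 2013),
  PIT_HAND_CD = "Overall",
  Mean.Runs   = c(0.03, -0.02, 0.01)
)

# Stack the two summaries so each pitcher-hand category gets its own line.
Runs.All <- rbind(Toy.Pitcher, Toy.Season)

# One point and one line per category, colored by pitcher hand.
p <- ggplot(Runs.All, aes(x = Season, y = Mean.Runs, color = PIT_HAND_CD)) +
  geom_point() +
  geom_line() +
  labs(y = "Mean run value", color = "Pitcher hand")
print(p)
```

With the real summaries, which are dplyr tibbles, dplyr's bind_rows may be a safer way to stack the two tables than rbind.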

To focus on the platoon effects, we first merge horizontally the data frames for right and left-handed pitching. Then we use ggplot2 to graph the difference in mean run values against season.

Merged.Runs <- merge(subset(Runs.Season.Pitcher, PIT_HAND_CD=="R"),
            subset(Runs.Season.Pitcher, PIT_HAND_CD=="L"),
            by="Season")  # .x columns are vs. righties, .y columns vs. lefties
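
Computing and graphing the difference could then be sketched as below. Toy.Merged is a hypothetical stand-in for Merged.Runs with invented values, and the column names Mean.Runs.x and Mean.Runs.y assume that the mean run value column is called Mean.Runs before the merge.

```r
library(ggplot2)

# Toy stand-in for the merged table (made-up values, not Howard's numbers).
# merge() adds the suffix .x to columns from the first table (righties)
# and .y to columns from the second table (lefties).
Toy.Merged <- data.frame(
  Season      = c(2011, 2012, 2013),
  Mean.Runs.x = c(0.05, 0.01, 0.03),
  Mean.Runs.y = c(-0.02, -0.06, -0.05)
)

# Platoon effect = mean run value vs. righties minus mean run value vs. lefties.
Toy.Merged$Platoon <- with(Toy.Merged, Mean.Runs.x - Mean.Runs.y)

# Plot the platoon effect by season, with a dashed reference line at zero.
p <- ggplot(Toy.Merged, aes(x = Season, y = Platoon)) +
  geom_point() +
  geom_line() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(y = "Platoon effect (R minus L)")
print(p)
```

Points above the dashed line correspond to seasons in which the hitter did better against right-handed pitching.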

Although the size of the platoon effect varies considerably from season to season, it stays fairly constant around 0.1, and I would expect it to remain about 0.1 in 2014. Actually, I think the Phillies should hope that Howard’s performance against righties improves next season.

Okay, we’ve demonstrated that Howard has had a large platoon effect over his career. Will this impact how the Phillies play him this season? Let’s look at past seasons and compute the fraction of PA’s that Howard has hit against right-handers.

Merged.Runs$Fraction.Right <- with(Merged.Runs,
                                   round(PA.x / (PA.x + PA.y), 2))
Merged.Runs[, c("Season", "PA.x", "PA.y", "Fraction.Right")]
  Season PA.x PA.y Fraction.Right
1   2005  285   63           0.82
2   2006  479  225           0.68
3   2007  402  246           0.62
4   2008  435  265           0.62
5   2009  451  252           0.64
6   2010  404  216           0.65
7   2011  459  185           0.71
8   2012  186  106           0.64
9   2013  230   87           0.73

In the seasons 2007 through 2010, about 62-65% of Howard’s PA’s were against righties, which indicates he was a full-time player. In 2005, 2011, and 2013, this percentage was noticeably higher (71-82%), suggesting that the Phillies were resting him against left-handers. It will be interesting to see how the new Phillies manager plays Howard in the 2014 season.

One interesting thing about Ryan Howard is that teams typically employ a fielding shift where they have three infielders play between 1st and 2nd bases. I don’t know if data on fielder location is publicly available, but it would be interesting to explore the effectiveness of this defensive strategy.

By the way, if you have downloaded the Ryan.csv play-by-play data and the dataset rests in the current working directory, then you can replicate this R work by typing


3 responses

  1. Great post!! How can I see the ggplot2 code?


  2. Marti, sorry I didn’t mention it. Here’s the listing of the code:

  3. Jim, Thank you so much. Great post and job!!!
