Last week, I compared the home run career trajectories of Ryan Howard with 10 similar players. But of course, there is more to offensive production than hitting home runs, and the Phillies are concerned about Howard’s ability (actually, his lack of ability) to hit left-handed pitching. From looking at Retrosheet play-by-play data, what can we learn about Howard’s platoon effects over his career?

A good measure of hitting performance in a single plate appearance (PA) is run value, defined as the difference in run potential (expected runs in the remainder of the inning) after and before the PA, plus the number of runs scored on the play. We define a season platoon effect as (MEAN RUN VALUE AGAINST RIGHT-HANDED PITCHERS) – (MEAN RUN VALUE AGAINST LEFT-HANDED PITCHERS).
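To make these two definitions concrete, here is a minimal sketch in R. The function names `run_value` and `platoon_effect` and all of the numbers are illustrative assumptions; none of them come from the Retrosheet files.

```r
# Run value of one plate appearance:
#   (run potential after the PA) - (run potential before the PA)
#   + runs scored on the play.
run_value <- function(re_before, re_after, runs_scored) {
  re_after - re_before + runs_scored
}

# Hypothetical example: a solo home run with the bases empty leaves the
# base-out state unchanged, so the run value is just the one run scored.
# The run potential of 0.5 is a made-up number for illustration.
run_value(re_before = 0.5, re_after = 0.5, runs_scored = 1)

# Season platoon effect: mean run value against right-handers minus
# mean run value against left-handers.
platoon_effect <- function(run_values, pitcher_hand) {
  mean(run_values[pitcher_hand == "R"]) - mean(run_values[pitcher_hand == "L"])
}

# Toy data (made up): four PAs against righties, two against lefties.
rv   <- c(0.25, -0.25, 0.5, 0.5, -0.25, 0.25)
hand <- c("R", "R", "R", "R", "L", "L")
platoon_effect(rv, hand)
```

A positive platoon effect means the hitter did better, on average, against right-handed pitching.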

Here’s the procedure for computing these platoon splits:

- We collect the relevant Retrosheet play-by-play data for the seasons 2005 through 2013. The R functions `parse.retrosheet2.pbp.R` and `compute.runs.expectancy.R` read in the files from Retrosheet.org and compute the run values for all plays. (I described the use of these functions in a previous post.)
- These play-by-play files are large, so I just collect the batting plays for Howard for these nine seasons. I save the plays in a csv file called `Ryan.csv` that I store as a public Google spreadsheet here. (I’d encourage you to look at this file if you are not familiar with Retrosheet play-by-play files. There are many interesting variables collected.)
- If you want to run my R code, just download this csv file (make sure to download the file in csv format) to the current working directory of R. We will assume this data file is read into R into the data frame `Ryan`.
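The last step above might look something like the following sketch. The helper `load_plays` is a hypothetical name, not part of the original code, and the check assumes the csv file contains the three columns used later in this post (`Season`, `PIT_HAND_CD`, and `RUNS.VALUE`).

```r
# Read Howard's plays from a csv file and check that the columns used
# later in this post are present. (load_plays is a hypothetical helper.)
load_plays <- function(path = "Ryan.csv") {
  plays <- read.csv(path, stringsAsFactors = FALSE)
  needed <- c("Season", "PIT_HAND_CD", "RUNS.VALUE")
  missing <- setdiff(needed, names(plays))
  if (length(missing) > 0) {
    stop("Missing columns: ", paste(missing, collapse = ", "))
  }
  plays
}

# Only attempt the read if the file has actually been downloaded
# to the current working directory.
if (file.exists("Ryan.csv")) {
  Ryan <- load_plays("Ryan.csv")
  str(Ryan[, c("Season", "PIT_HAND_CD", "RUNS.VALUE")])
}
```

If the column check fails, the most likely cause is that the spreadsheet was downloaded in a format other than csv.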

Using the `summarize` function in the `dplyr` package, we compute the mean run value and the number of plate appearances for each pitcher hand in each season; the output is stored in the data frame `Runs.Season.Pitcher`. Likewise, we compute the same summary of the run values by season alone, and that is stored in the data frame `Runs.Season`.

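    library(dplyr)

    # Mean run value and number of PAs for each season and pitcher hand
    Runs.Season.Pitcher <- summarize(group_by(Ryan, Season, PIT_HAND_CD),
                                     Mean = mean(RUNS.VALUE),
                                     PA = length(RUNS.VALUE))

    # The same summary for each season, ignoring pitcher hand
    Runs.Season <- summarize(group_by(Ryan, Season),
                             Mean = mean(RUNS.VALUE),
                             PA = length(RUNS.VALUE))
    Runs.Season$PIT_HAND_CD <- "Overall"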

Using the `ggplot2` package, we plot the mean run values against left-handed and right-handed pitchers for each season. The middle curve shows the overall mean run values. As expected, we see a steady decrease in Howard’s run value. It is interesting that Howard did well against lefties in his breakout year (2006). Howard did poorly against lefties in 2012, but he also did not do well against right-handed pitching.
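The plotting call itself is not shown here, so the following is only a sketch of one way such a plot could be drawn, not necessarily the code behind the figure. The two data frames are toy stand-ins with made-up numbers, and the name `All.Runs` is an assumption introduced for illustration.

```r
# Toy stand-ins for Runs.Season.Pitcher and Runs.Season
# (made-up numbers, not Howard's actual values).
Runs.Season.Pitcher <- data.frame(
  Season      = rep(c(2011, 2012, 2013), each = 2),
  PIT_HAND_CD = rep(c("L", "R"), times = 3),
  Mean        = c(-0.02, 0.07, -0.05, 0.01, -0.04, 0.05),
  PA          = c(200, 400, 100, 200, 100, 200)
)
Runs.Season <- data.frame(
  Season      = c(2011, 2012, 2013),
  PIT_HAND_CD = "Overall",
  Mean        = c(0.04, -0.01, 0.02),
  PA          = c(600, 300, 300)
)

# Stack the by-hand and overall summaries so that each value of
# PIT_HAND_CD becomes one line on the plot.
All.Runs <- rbind(Runs.Season.Pitcher, Runs.Season)

# Draw the plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(All.Runs, aes(x = Season, y = Mean, color = PIT_HAND_CD)) +
    geom_point() +
    geom_line() +
    geom_hline(yintercept = 0, linetype = "dashed") +
    ylab("Mean run value")
  print(p)
}
```

With the summaries produced by `dplyr`, it may be safest to convert each one with `as.data.frame()` before stacking them.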

To focus on the platoon effects, we first merge horizontally the data frames for right and left-handed pitching. Then we use `ggplot2` to graph the difference in mean run values against season.

    Merged.Runs <- merge(subset(Runs.Season.Pitcher, PIT_HAND_CD == "R"),
                         subset(Runs.Season.Pitcher, PIT_HAND_CD == "L"),
                         by = "Season")
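Here is a small sketch of what the merge produces and how the difference can be computed from it. The input numbers are made up, and the column name `Platoon` is an assumption introduced for illustration; the plotting call is one plausible way to graph the result, not necessarily the one used for the figure.

```r
# Toy by-hand summary (made-up numbers, not Howard's actual values).
Runs.Season.Pitcher <- data.frame(
  Season      = rep(c(2011, 2012, 2013), each = 2),
  PIT_HAND_CD = rep(c("L", "R"), times = 3),
  Mean        = c(-0.02, 0.07, -0.05, 0.01, -0.04, 0.05),
  PA          = c(200, 400, 100, 200, 100, 200)
)

# merge() appends ".x" to the columns from the first data frame (righties)
# and ".y" to the columns from the second data frame (lefties).
Merged.Runs <- merge(subset(Runs.Season.Pitcher, PIT_HAND_CD == "R"),
                     subset(Runs.Season.Pitcher, PIT_HAND_CD == "L"),
                     by = "Season")

# Platoon effect: mean run value vs. righties minus mean run value vs. lefties.
Merged.Runs$Platoon <- with(Merged.Runs, Mean.x - Mean.y)
Merged.Runs[, c("Season", "Mean.x", "Mean.y", "Platoon")]

# Graph the platoon effect against season, if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  print(
    ggplot(Merged.Runs, aes(x = Season, y = Platoon)) +
      geom_point() +
      geom_line() +
      ylab("Platoon effect (R minus L)")
  )
}
```

The `.x` and `.y` suffixes are the reason the later code refers to `PA.x` (PAs against righties) and `PA.y` (PAs against lefties).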

Although there is sizable season-to-season variability in the size of the platoon effect, it appears to stay fairly constant at about 0.1, and I would anticipate that it remains about 0.1 in 2014. Actually, I think the Phillies should hope that Howard’s performance against righties improves next season.

Okay, we’ve demonstrated that Howard has had a large platoon effect over his career. Will this impact how the Phillies play him this season? Let’s look at past seasons and compute the fraction of PA’s that Howard has hit against right-handers.

    Merged.Runs$Fraction.Right <- with(Merged.Runs,
                                       round(PA.x / (PA.x + PA.y), 2))
    Merged.Runs[, c("Season", "PA.x", "PA.y", "Fraction.Right")]

      Season PA.x PA.y Fraction.Right
    1   2005  285   63           0.82
    2   2006  479  225           0.68
    3   2007  402  246           0.62
    4   2008  435  265           0.62
    5   2009  451  252           0.64
    6   2010  404  216           0.65
    7   2011  459  185           0.71
    8   2012  186  106           0.64
    9   2013  230   87           0.73

In the seasons 2007 through 2010, about 62-65% of Howard’s PA’s were against righties, which indicates he was a full-time player. In 2005, and again in 2011 and 2013, this percentage was noticeably higher (71-82%), which suggests that the Phillies were resting him against left-handers. It will be interesting to see how the new Phillies manager plays Howard in the 2014 season.

One interesting thing about Ryan Howard is that teams typically employ a fielding shift where they have three infielders play between 1st and 2nd bases. I don’t know if data on fielder location is publicly available, but it would be interesting to explore the effectiveness of this defensive strategy.

By the way, if you have downloaded the `Ryan.csv` play-by-play data and the dataset rests in the current working directory, then you can replicate this R work by typing

    library(devtools)
    source_gist(9389168)

Great post!! How can I see the ggplot2 code?

thanks.

Marti, sorry I didn’t mention it. Here’s the listing of the code:

Jim, Thank you so much. Great post and job!!!