I don’t typically post on consecutive days, but Cliff Lee had a remarkably poor start against the Texas Rangers yesterday and we want to know why. Given the easy availability of data on Lee’s pitches using the pitchRx package, let’s look at the locations of Lee’s pitches, specifically the ones that were put in play.
All of the R code for this quick exploration can be found on the gist site.
Here’s an outline of what I did:
- I scraped all of the PITCHf/x data for yesterday’s game with pitchRx and stored the data in a list.
- Using two applications of the select function from the dplyr package and the inner_join function, I collected the variables of interest from the scraped data.
- Using the subset function, I kept only the pitches that were put in play and created a variable result that is either “OUT” or “HIT”.
- I used the ggplot2 package to graph the locations of the pitches, with separate panels for left-handed and right-handed batters.
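The steps in this outline can be sketched roughly as follows. This is not the code from the gist, just a minimal illustration under some assumptions: the two small data frames are made-up stand-ins for the at-bat and pitch tables in the scraped list, the column names (num, gameday_link, pitcher_name, stand, px, pz, des) are what I believe pitchRx returns, "Pitcher A" is a placeholder name, and treating every ball in play that is not recorded as "In play, out(s)" as a hit is a simplification.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-ins for the at-bat and pitch tables of one scraped game.
# All values below are invented for illustration only.
atbat <- data.frame(
  num = 1:4,
  gameday_link = "gid_example",
  pitcher_name = c("Pitcher A", "Pitcher A", "Pitcher B", "Pitcher A"),
  stand = c("L", "R", "R", "R"),
  stringsAsFactors = FALSE
)
pitch <- data.frame(
  num = c(1L, 1L, 2L, 3L, 4L, 4L),
  gameday_link = "gid_example",
  px = c(-0.8, 0.6, 0.1, 0.3, -0.2, 0.0),   # horizontal location (ft)
  pz = c(2.1, 2.9, 1.9, 2.5, 3.3, 2.2),     # height above the ground (ft)
  des = c("Ball", "In play, out(s)", "In play, no out",
          "In play, out(s)", "Called Strike", "In play, run(s)"),
  stringsAsFactors = FALSE
)

# Two applications of select(), then an inner join on the shared keys.
ab_vars    <- select(atbat, num, gameday_link, pitcher_name, stand)
pitch_vars <- select(pitch, num, gameday_link, px, pz, des)
joined     <- inner_join(pitch_vars, ab_vars, by = c("num", "gameday_link"))

# Keep one pitcher's balls in play and label each one (simplified rule).
in_play <- subset(joined, pitcher_name == "Pitcher A" & grepl("^In play", des))
in_play$result <- ifelse(in_play$des == "In play, out(s)", "OUT", "HIT")

# Pitch locations, one panel per batter side.
p <- ggplot(in_play, aes(x = px, y = pz, color = result)) +
  geom_point(size = 3) +
  facet_wrap(~ stand) +
  coord_equal() +
  labs(x = "Horizontal location (ft)", y = "Height (ft)")
print(p)
```

With real data, the two stand-in data frames would be replaced by the corresponding elements of the list returned by the scrape function.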
This graph should help Lee and the Phillies understand what happened yesterday.
- Lee was pretty effective against left-handed batters, but the hits corresponded to pitches away in the strike zone.
- Most of the damage was done by right-handed batters. Lee had too many pitches in the middle to low sections of the strike zone. He seemed more effective with high pitches.
Based on this brief exploration, I think Lee’s main problem yesterday was his pitch location. Batters seem to expect Lee to throw strikes and some of these pitches were too much in the middle of the plate.
How did you scrape the data for Cliff Lee?
I tried using the following code after loading the pitchRx package:
dat <- scrape(start="2014-04-01", end="2014-04-02")
And I got the following error message:
If file names don't print right away, please be patient.
Error in function (type, msg, asError = TRUE) :
Could not resolve host: 3; Host not found
However, if I change the year to "2013" I can scrape the database…
Do you have the most recent version of the pitchRx package (version 1.3)? I think there was some problem accessing 2014 data with an older version of the package, but it was fixed for this version. (I just scraped pitches from yesterday’s games.)
Thanks, that cleared it up. I had to install the package manually from source on my Mac since CRAN is still showing version 1.2.