What Happened to Cliff Lee Yesterday?

I don’t typically post on consecutive days, but Cliff Lee had a remarkably poor start against the Texas Rangers yesterday and I wanted to know why. Since data on Lee’s pitches are easy to get with the pitchRx package, let’s look at the locations of his pitches, specifically the ones that were put in play.

All of the R code for this quick exploration can be found on the gist site.

Here’s an outline of what I did:

  1. I scraped all of the PITCHf/x data for yesterday’s game with the pitchRx package and stored the data in a list.
  2. Using two applications of the select function from the dplyr package and the inner_join function, I collected the variables of interest from the pitch and atbat data frames.
  3. Using the subset function, I kept only the pitches that were put in play and created a variable result that is either “OUT” or “HIT”.
  4. I used the ggplot2 package to graph the locations of the pitches, with separate panels for each side of the plate the batter hits from.
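
As a rough illustration of steps 2 through 4, here is a minimal sketch. It is not the code from the gist. The two small data frames are hypothetical stand-ins for the pitch and atbat tables that a scrape would return, with made-up values rather than Lee’s actual pitches, and the column names (num, gameday_link, px, pz, des, pitcher_name, stand, event) are my assumption about the pitchRx output. The list of events counted as hits is also an assumption; everything else put in play is labeled an out.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-ins for the scraped pitch and atbat tables.
# The values are invented for illustration only.
pitch <- data.frame(
  num = c(1, 1, 2, 3, 4),
  gameday_link = "gid_example",
  px = c(-0.4, 0.1, 0.6, -0.2, 0.3),   # horizontal location (feet)
  pz = c(2.1, 2.5, 3.2, 1.9, 2.8),     # vertical location (feet)
  des = c("Ball", "In play, out(s)", "In play, no out",
          "In play, run(s)", "Called Strike"),
  stringsAsFactors = FALSE
)
atbat <- data.frame(
  num = c(1, 2, 3, 4),
  gameday_link = "gid_example",
  pitcher_name = "Cliff Lee",
  stand = c("L", "R", "R", "L"),
  event = c("Groundout", "Single", "Double", "Strikeout"),
  stringsAsFactors = FALSE
)

# Step 2: keep the variables of interest from each table and join them
# on the at-bat number within the game.
pitches <- inner_join(
  select(pitch, num, gameday_link, px, pz, des),
  select(atbat, num, gameday_link, pitcher_name, stand, event),
  by = c("num", "gameday_link")
)

# Step 3: keep Lee's pitches that were put in play and label each one.
in_play <- subset(pitches,
                  pitcher_name == "Cliff Lee" & grepl("^In play", des))
hit_events <- c("Single", "Double", "Triple", "Home Run")  # assumed list
in_play$result <- ifelse(in_play$event %in% hit_events, "HIT", "OUT")

# Step 4: plot pitch locations, one panel per batter side.
p <- ggplot(in_play, aes(x = px, y = pz, color = result)) +
  geom_point(size = 3) +
  facet_wrap(~ stand) +
  coord_equal()
print(p)
```

With real data, the pitch and atbat data frames would come from the list returned by the scrape in step 1 instead of being typed in by hand.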


This graph should help Lee and the Phillies understand what happened yesterday.

  • Lee was pretty effective against left-handed batters, but the hits he did allow to them came on pitches away in the strike zone.
  • Most of the damage was done by right-handed batters. Lee threw too many pitches in the middle to low sections of the strike zone. He seemed more effective with high pitches.

Based on this brief exploration, I think Lee’s main problem yesterday was his pitch location. Batters seem to expect Lee to throw strikes, and some of these pitches were too close to the middle of the plate.


3 responses

  1. How did you scrape the data for Cliff Lee?

    I tried using the following code after loading the pitchRx package:
    dat <- scrape(start="2014-04-01", end="2014-04-02")
    And I got the following error message:

    If file names don't print right away, please be patient.
    Error in function (type, msg, asError = TRUE) :
    Could not resolve host: 3; Host not found

    However, if I change the year to "2013" I can scrape the database…

  2. Do you have the most recent version of the pitchRx package (version 1.3)? I think there was some problem accessing 2014 data with an older version of the package, but it was fixed for this version. (I just scraped pitches from yesterday’s games.)

  3. Thanks, that cleared it up. I had to install the package manually from source on my Mac since CRAN is still showing version 1.2.

