Exploring Pitches in Cole Hamels’ No-Hitter

As I’m a Phillies fan, I was pretty excited about Cole Hamels’ no-hitter against the Cubs on Saturday. This provides a good excuse to demonstrate the ease of using Carson’s pitchRx package to explore the 129 pitches that Cole threw during this game.

Given that Cole had several rough outings in his recent starts, it is somewhat remarkable that he pitched so well on Saturday. What happened in this particular game? It seems that there are two important factors in a pitching performance — the choice of pitches and the locations where these pitches are thrown. So we’ll focus our exploration on the types of pitches and the locations.

We first scrape the data using the pitchRx package. The list dat contains all of the pitch data for the games played that day. I combine the pitch data (including type and location) from the component pitch with the at-bat data from the component atbat. The data frame data contains information for all 129 pitches Hamels threw in this game.

library(pitchRx)
library(dplyr)
library(ggplot2)  # loaded explicitly for ggtitle() and facet_wrap() in the plots
dat <- scrape("2015-07-25", "2015-07-25")
locations <- select(dat$pitch,
                    pitch_type, px, pz, des, num, gameday_link)
names <- select(dat$atbat, pitcher_name, batter_name,
                num, gameday_link, event, stand)
data <- inner_join(locations, filter(names,
                    pitcher_name == "Cole Hamels"),
                   by = c("num", "gameday_link"))
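
To see why the join uses both num and gameday_link, here is a minimal sketch on made-up data. The two tiny data frames and the game ids are hypothetical, and base R's merge stands in for inner_join so that the snippet needs no packages. The at-bat number num restarts in every game, so matching on num alone can pair a pitch with an at-bat from a different game.

```r
# Toy pitch and at-bat tables; all values are invented for illustration.
pitch <- data.frame(
  gameday_link = c("gid_A", "gid_A", "gid_B"),
  num          = c(1, 2, 1),
  pitch_type   = c("FF", "CH", "CU"),
  stringsAsFactors = FALSE
)
atbat <- data.frame(
  gameday_link = c("gid_A", "gid_A", "gid_B"),
  num          = c(1, 2, 1),
  pitcher_name = c("Cole Hamels", "Cole Hamels", "Someone Else"),
  stringsAsFactors = FALSE
)

# Keep one pitcher's at-bats, then match on BOTH keys.
hamels    <- subset(atbat, pitcher_name == "Cole Hamels")
both_keys <- merge(pitch, hamels, by = c("num", "gameday_link"))

# Matching on num alone also picks up the pitch from game B.
num_only  <- merge(pitch, hamels, by = "num")

nrow(both_keys)  # rows when joining on both keys
nrow(num_only)   # rows when joining on num only
```

With both keys only the two game A pitches survive; with num alone the game B pitch is wrongly attached to a Hamels at-bat.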

What types of pitches did Cole throw in this game?

with(data, table(pitch_type))
### pitch_type
### CH CU FC FF FT 
### 29 26  9 39 26 

We see that Cole threw 39 four-seam fastballs (FF), but he also threw 29 changeups (CH), 26 curveballs (CU), 26 two-seam fastballs (FT), and 9 cutters (FC). It seems that Cole may have thrown a wider variety of pitch types than usual.
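
The counts are easier to compare as percentages. Here is a small sketch that re-enters the counts from the table by hand, so it runs without the scraped data. The grouping into fastball variants and off-speed pitches is my own assumption for illustration, not something pitchRx provides.

```r
# Pitch-type counts copied from the table above.
counts <- c(CH = 29, CU = 26, FC = 9, FF = 39, FT = 26)

# Share of each pitch type, as a percentage of all pitches.
shares <- round(100 * prop.table(counts), 1)
shares

# Assumed grouping: fastball variants (FF, FT, FC) versus off-speed (CH, CU).
fastballs <- sum(counts[c("FF", "FT", "FC")])
offspeed  <- sum(counts[c("CH", "CU")])
c(fastballs = fastballs, offspeed = offspeed)
```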

What were the outcomes of these pitches?

with(data, table(des, pitch_type))
###                            pitch_type
### des                         CH CU FC FF FT
###   Ball                       8  8  1 17 11
###   Ball In Dirt               1  0  0  0  0
###   Called Strike              5  7  1  5  6
###   Foul                       2  1  2  8  5
###   Foul Tip                   0  1  0  0  0
###   In play, out(s)            1  5  2  3  3
###   Swinging Strike           10  4  3  6  1
###   Swinging Strike (Blocked)  2  0  0  0  0

We see some interesting things from this table:

  • Changeups accounted for 12 of the 26 swinging strikes (counting the two blocked ones).
  • Half of the called strikes (12 of 24) came on changeups and curveballs.
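
These two observations can be checked directly from the counts. The sketch below retypes the relevant rows of the table into a matrix (so it does not depend on the scraped data) and computes the two shares.

```r
# Rows copied from the outcome table above; columns are pitch types.
outcomes <- rbind(
  "Called Strike"             = c(5, 7, 1, 5, 6),
  "Swinging Strike"           = c(10, 4, 3, 6, 1),
  "Swinging Strike (Blocked)" = c(2, 0, 0, 0, 0)
)
colnames(outcomes) <- c("CH", "CU", "FC", "FF", "FT")

# Combine the two swinging-strike rows, then total by pitch type.
swinging <- colSums(outcomes[grepl("^Swinging", rownames(outcomes)), ])
called   <- outcomes["Called Strike", ]

# Changeup share of all swinging strikes.
swinging["CH"] / sum(swinging)

# Changeup plus curveball share of all called strikes.
sum(called[c("CH", "CU")]) / sum(called)
```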

What were the locations of these pitches?

Here it is easy to use the strikeFX function in the pitchRx package. We add a facet_wrap layer so we get a separate view of the pitch locations for each pitch type.

strikeFX(data, point.alpha=1, layer=facet_wrap(~pitch_type, ncol=3)) +
  ggtitle("Locations of All Pitches")

[Figure: Locations of All Pitches, one panel per pitch type]

We see that most of Cole’s changeups were low and out of the zone and many of his four-seamers were high. Note that his curveballs were all around the strike zone; I’m surprised that Cole threw a no-hitter with so many high breaking pitches.

What were the locations of these pitches where there was a swinging strike?

Using the filter function (from the dplyr package), we limit our exploration to pitches where the outcome (variable des) begins with the text “Swing”.

strikeFX(filter(data, substr(des, 1, 5)=="Swing"), point.alpha=1,
         layer=facet_wrap(~pitch_type, ncol=3))+
         ggtitle("Locations of Swinging Strikes")

[Figure: Locations of Swinging Strikes, one panel per pitch type]

It seemed that most of these swinging strikes were either low changeups or curveballs or high fastballs.
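
As an aside on the filter used for this plot: the substr test keeps outcomes that begin with “Swing”. Here is a small sketch, using the outcome labels from the table earlier, showing that two other base R idioms select the same rows. Note that startsWith needs R 3.3.0 or later.

```r
# The eight outcome labels that appear in this game's data.
des <- c("Ball", "Ball In Dirt", "Called Strike", "Foul", "Foul Tip",
         "In play, out(s)", "Swinging Strike", "Swinging Strike (Blocked)")

by_substr <- substr(des, 1, 5) == "Swing"   # the test used in the post
by_prefix <- startsWith(des, "Swing")       # explicit prefix test
by_regex  <- grepl("^Swing", des)           # anchored regular expression

# All three pick out the same two labels.
des[by_substr]
```

Any of these can go inside filter(); the prefix and regex versions avoid having to count characters by hand.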

Cole’s strike zone?

Did Cole benefit from good umpire calls for strikes? We focus on the pitches that were either a ball or a called strike. One attractive way of summarizing the locations of these pitches is to fit a generalized additive model to this data where the response is binary (the pitch is either a called strike or a ball) and the explanatory variables are the horizontal and vertical locations. Using the strikeFX function again, we display the fit from this model: the smoothed values (that is, the predicted probabilities of a strike) are displayed using a heat map where a lighter color corresponds to a higher probability of a strike.

library(mgcv)
noswing <- subset(data, des %in% c("Ball", "Called Strike"))
noswing$strike <- as.numeric(noswing$des %in% "Called Strike")
m2 <- bam(strike ~ s(px, pz),
        data=noswing, family = binomial(link='logit'))
strikeFX(noswing, model=m2) +
  ggtitle("Cole's Strike Zone")

[Figure: Cole's Strike Zone, heat map of the fitted called-strike probability]

The light blue region, where the probability of a strike is high, closely matches the strike zone, indicating that the umpires are making reasonable calls on Cole’s pitches. The only exception is that Cole is getting some benefit on pitches located low in the strike zone.
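
For readers who want to see what the model is doing without scraping anything, here is a rough sketch on simulated data. Everything in it is an assumption made for illustration: the pitch locations are random, the rectangular zone and the 90%/10% call rates are invented, and I use gam rather than bam since the sample is small. It is not a reproduction of the fit above.

```r
library(mgcv)  # a recommended package that ships with standard R installs

set.seed(1)
n  <- 500
px <- runif(n, -2, 2)      # simulated horizontal location (feet)
pz <- runif(n, 0.5, 4.5)   # simulated vertical location (feet)

# Hypothetical rectangular zone: strike called 90% of the time inside it
# and 10% of the time outside it.
inside <- abs(px) < 0.83 & pz > 1.5 & pz < 3.5
sim <- data.frame(px = px, pz = pz,
                  strike = rbinom(n, 1, ifelse(inside, 0.9, 0.1)))

# Same model form as in the post: a smooth surface in (px, pz), logit link.
fit <- gam(strike ~ s(px, pz), data = sim, family = binomial(link = "logit"))

# Predicted strike probability at the middle of the zone and far outside it.
pts <- data.frame(px = c(0, 1.8), pz = c(2.5, 4.3))
p <- predict(fit, newdata = pts, type = "response")
p
```

The heat map drawn by strikeFX is essentially these predicted probabilities evaluated over a fine grid of (px, pz) values.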

It would be interesting to use PITCHf/x data to explore how Hamels has changed as a pitcher over his MLB career. My understanding is that Cole relied primarily on a fastball and changeup in his early years, and that he now appreciates the benefits of using a wider variety of pitches.


3 responses

  1. Jim,

    with regards to the last graph titled “Cole’s Strike Zone,” is there a way to change the color scale to be multiple colors (say red and green)?

    I’ve done this for larger samples and it’s difficult to visualize where exactly that 50% theoretical threshold is in a single color scheme. I’ve never been able to figure out how to change colors to a non-single color scheme.

  2. Kevin:

    I asked Carson and he says that any of the scale_fill_gradient*() functions can be used to change the color scale.

    Here’s an example:

    library(pitchRx)
    noswing <- subset(pitches, des %in% c("Ball", "Called Strike"))
    noswing$strike <- as.numeric(noswing$des %in% "Called Strike")
    library(mgcv)
    m1 <- bam(strike ~ s(px, pz, by=factor(stand)) +
    factor(stand), data=noswing, family = binomial(link='logit'))
    # geom will automatically be set to 'raster'
    strikeFX(noswing, model=m1, layer=facet_grid(.~stand)) +
    scale_fill_gradientn(colours = rainbow(6))

    Jim

  3. If you want full control over the colors used, one alternative is to build these graphics yourself using filled.contour; I go through how to do that in the post linked below (though it won’t be through pitchRx or ggplot2). It should hopefully be instructive about the models behind the figures:

    https://baseballwithr.wordpress.com/2015/06/30/houston-astros-whiffs-and-exit-velocity/
