Introduction
In our current Statcast world, there is a lot of talk about exit velocities and launch angles on balls put in play. Part of the value of these measures is entertainment: one can be impressed when a home run is announced to have a 110 mph launch speed. From a statistical perspective, these “off-the-bat” measures (maybe we should call them Statcast measures) appear to be better measures of a batter’s ability than outcome measures such as a batting average or a slugging percentage. Here I use some graphs to explore a few situational effects of exit velocity and launch angle, specifically how these Statcast measures vary as a function of the count and zone location. We have a good understanding of how common batting measures such as batting average vary in terms of count and pitch location, and one would think that similar patterns occur for exit velocity and launch angle.
Relationship Between Count and Zone Location
For all balls in play, we record the current count and pitch location (Statcast variables pitch_x and pitch_z). To adjust for the batting side, I define an adjusted pitch_x variable which is equal to pitch_x for right-handed hitters and equal to minus the pitch_x value for lefties. So a positive adjusted pitch_x value corresponds to a pitch away from the batter and a negative value corresponds to an inside pitch.
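A minimal sketch of this adjustment in R. The batting-side column name `stand` (with values "R" and "L") and the toy values are assumptions for illustration, not taken from the actual data.

```r
# Toy balls-in-play data; `stand` is an assumed name for the batting side
bip <- data.frame(
  stand   = c("R", "L", "R", "L"),
  pitch_x = c(0.5, 0.5, -0.3, -0.3)
)

# Flip the horizontal location for left-handed batters so that positive
# values always mean "away" and negative values always mean "inside"
bip$adj_pitch_x <- ifelse(bip$stand == "R", bip$pitch_x, -bip$pitch_x)

print(bip)
```

After this step, pitches to lefties and righties can be pooled on a common inside/outside scale.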
For each of the possible counts (0-0 through 3-2), here are contour plots of the densities of the (adjusted) pitch locations on balls put in play. A couple of interesting takeaways:
- These contours generally are concentrated about a location a little away and a little lower than the center point in the zone.
- At a 0-0 count and batter’s counts (1-0, 2-0, 3-1, 3-2), all of the contours lie within the zone — it appears unusual for a batter to hit a pitch outside of the zone.
- For a fixed number of balls, say one ball, and an increasing number of strikes (from 0 to 1 to 2 strikes), the contours expand and it is more likely for the in-play event to occur for a pitch outside of the zone. The contours for pitch locations for two strikes are especially wide.
- The 0-2, 1-2, 2-2 contours look pretty similar — it seems that batters will swing at a broad range of pitch locations with two strikes and the number of balls is not relevant.
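Density contours like these can be sketched with a two-dimensional kernel density estimate for each count. The example below uses `kde2d()` from the MASS package and base graphics as a stand-in for the ggplot2 plots; the simulated data, the column names, and the grid limits are all assumptions for illustration.

```r
library(MASS)  # kde2d() computes a two-dimensional kernel density estimate

set.seed(1)
# Simulated balls in play; every value here is made up
n <- 600
bip <- data.frame(
  balls       = sample(0:3, n, replace = TRUE),
  strikes     = sample(0:2, n, replace = TRUE),
  adj_pitch_x = rnorm(n, mean = 0.1, sd = 0.6),
  pitch_z     = rnorm(n, mean = 2.3, sd = 0.6)
)
bip$count <- paste(bip$balls, bip$strikes, sep = "-")

# One density estimate per count, all evaluated on a common 40 x 40 grid
lims <- c(-2, 2, 0.5, 4.5)  # assumed plotting window (x range, then z range)
dens <- lapply(split(bip, bip$count), function(d) {
  kde2d(d$adj_pitch_x, d$pitch_z, n = 40, lims = lims)
})

# Draw the contours for a single count
contour(dens[["0-2"]], xlab = "adjusted pitch_x", ylab = "pitch_z",
        main = "0-2 count")
```

Using the same grid for every count makes the panels directly comparable, which is what lets one see the contours widen as strikes are added.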
Count Effects
Next I explore graphically how the mean exit velocity varies by the count. Here I plot these averages as a function of the total balls and strikes count and the label shows the count. This graph resembles a similar plot for mean batting averages. What is interesting is the large effects for more favorable batter counts. Balls at 1-0 counts are hit with an average exit velocity of 88 mph, at a 2-0 count the average speed increases to 90 mph, and at a 3-0 count, the average speed off the bat increases to 94 mph.
Next I thought it would be interesting to construct a similar graph for mean launch angles. The red line corresponds to an average launch angle for all balls put in play — this average is about 11 degrees. For batter’s counts, the launch angle increases — up to an average of over 18 degrees at a 3-0 count. In contrast, the launch angle decreases (think grounders) for pitcher’s counts. At a 0-2 count, the average launch angle is about 9 degrees.
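The count averages described in this section can be sketched in a few lines of base R. The column names (`balls`, `strikes`, `launch_speed`, `launch_angle`) are assumed, and the toy values are invented so that the arithmetic is easy to check.

```r
# Toy balls-in-play data; names are assumed and values are made up
bip <- data.frame(
  balls        = c(0, 0, 1, 1, 3, 3),
  strikes      = c(2, 2, 0, 0, 0, 0),
  launch_speed = c(78, 80, 87, 89, 93, 95),
  launch_angle = c(8, 10, 11, 13, 17, 19)
)

bip$count   <- paste(bip$balls, bip$strikes, sep = "-")
bip$pitches <- bip$balls + bip$strikes  # total balls and strikes

# Mean exit velocity and mean launch angle for each count
means <- aggregate(cbind(launch_speed, launch_angle) ~ count + pitches,
                   data = bip, FUN = mean)
print(means)

# Plot the mean exit velocity against the pitch total, labeling by count
plot(means$pitches, means$launch_speed, type = "n",
     xlab = "Balls + Strikes", ylab = "Mean Exit Velocity (mph)")
text(means$pitches, means$launch_speed, labels = means$count)
```

Note that the data frame should contain only balls in play before the averages are taken, since that is the population these plots describe.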
Pitch Location Effects
One would think that batters prefer to hit balls in the middle of the zone — batted balls in this area would be hit harder (higher exit velocity) and at “good” launch angles (corresponding to line drives). In contrast, batted balls hit at locations outside of the zone would be hit softer at less-desirable launch angles. I binned the pitch location space into 13 x 13 = 169 regions and computed the mean exit velocity for balls hit from each region. Below I construct a plot of the mean launch speed for all regions where a redder color corresponds to a high mean exit velocity. We see …
- The “hot zone” for launch speed corresponds to a diagonal region from low-inside to high-outside.
- In contrast, balls hit high-inside or low-outside tend to have lower launch speeds.
Do we see similar patterns for mean launch angles? Judging by the graph below, yes we do. Balls hit at pitch locations in the middle of the zone tend to be hit (on average) at a launch angle between 15 and 20 degrees, which corresponds to line drives. Balls hit at high-inside pitch locations tend to have higher launch angles, and balls hit in the low-outside pitch locations tend to have low average launch angles (think groundballs).
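One way to sketch the 13 x 13 binning is with `cut()` and `tapply()` in base R. The bin boundaries, column names, and simulated data below are assumptions for illustration; the same steps apply to launch angle by swapping the response variable.

```r
set.seed(2)
# Simulated balls in play; every value here is made up
n <- 2000
bip <- data.frame(
  adj_pitch_x  = runif(n, -1.5, 1.5),
  pitch_z      = runif(n, 1, 4),
  launch_speed = rnorm(n, mean = 87, sd = 10)
)

# 13 equal-width bins in each direction give 13 x 13 = 169 regions
x_breaks <- seq(-1.5, 1.5, length.out = 14)  # assumed horizontal range
z_breaks <- seq(1, 4, length.out = 14)       # assumed vertical range
bip$x_bin <- cut(bip$adj_pitch_x, breaks = x_breaks, include.lowest = TRUE)
bip$z_bin <- cut(bip$pitch_z, breaks = z_breaks, include.lowest = TRUE)

# Mean exit velocity in each region; a region with no data comes back as NA
mean_speed <- tapply(bip$launch_speed, list(bip$x_bin, bip$z_bin), mean)

# Heat map in which a redder cell means a higher mean exit velocity
image(x_breaks, z_breaks, mean_speed, col = rev(heat.colors(20)),
      xlab = "adjusted pitch_x", ylab = "pitch_z")
```

With real data, regions far outside the zone will hold few batted balls, so their means are noisy and may be worth masking below some minimum count.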
Moving Forward and R Work
- From a team perspective, I think analysts would be interested in these count and pitch location Statcast effects, but they would look deeper. Specifically, they would be interested in exploring these effects for individual batters or pitchers.
- I’ve explored situational effects of measures like batting average over the years. Some situations like the home-away effect tend to be biases — they affect all of the players in the same way. The challenge is to find what I call “ability effects”. For example, look at the count effect. Are there particular batters who react to the count in an unusual (good) way? Can we find ability effects from Statcast data?
- From a statistical perspective, looking at Statcast effects for a group of players can be challenging since one may not have much data for some players. One needs to develop some method of combining separate Statcast effects, such as the use of a multilevel model.
- My R work for creating these graphs can be found on my GitHub Gist site. With the ggplot2 package, these graphs are relatively easy to construct given the Statcast data for a single season (these plots are based on 2017 data). Most of my R time was devoted to fine-tuning the graphs, such as choosing an appropriate color scheme.
Very curious: I tried this myself using a different method and got very different results.
I added the count to my table as follows: statcast2017$count<-with(statcast2017,balls+strikes)
I then created a data frame of at-bats with no launch angle:
statcast2017count0)
And added balls and strikes to my data frame:
statcast2017count$bs<-with(statcast2017,paste(balls,strikes, sep=" – "))
I then used aggregate:
aggregate(statcast2017count$launch_speed,list(statcast2017count$bs),mean)
The mean speeds I got were as follows:
1 0 – 0 81.22397
2 0 – 1 80.32742
3 0 – 2 79.25532
4 1 – 0 82.46344
5 1 – 1 81.50906
6 1 – 2 79.96755
7 2 – 0 84.48888
8 2 – 1 82.77934
9 2 – 2 81.09525
10 3 – 0 86.68983
11 3 – 1 84.86170
12 3 – 2 83.27635
These are much lower than the numbers you got. It isn't my dataset's fault either. When I used your GitHub code, I got the right averages. Is this an issue with aggregate?
Thanks so much–your book and website are very helpful to me.
I am pretty sure that aggregate and dplyr/summarize would give the same result. I am not sure what you mean by “at-bats with no launch angle”. I was finding the mean launch speed over all balls in play for each count.
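A small sketch of this point on made-up data: two grouping tools applied to the same rows give the same means, while changing which rows are included changes the means. Here base R's `tapply()` stands in for dplyr's `summarize()`, and the `in_play` column is a hypothetical indicator, not a real Statcast variable.

```r
# Toy pitch-level data; `in_play` is a hypothetical indicator column
pitches <- data.frame(
  count        = c("1-0", "1-0", "1-0", "2-0", "2-0", "2-0"),
  in_play      = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE),
  launch_speed = c(90, 86, 70, 92, 88, 75)
)

# Same rows, two grouping tools: the means agree
by_aggregate <- aggregate(launch_speed ~ count, data = pitches, FUN = mean)
by_tapply    <- tapply(pitches$launch_speed, pitches$count, mean)

# Restricting to balls in play changes the rows, and therefore the means
bip_means <- aggregate(launch_speed ~ count,
                       data = subset(pitches, in_play), FUN = mean)

print(by_aggregate)
print(bip_means)
```

So when two sets of averages disagree, the row filter is usually a better first suspect than the summarizing function.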