Pitch Value Visualizations of Leaders in Statcast Era
In several posts in 2021, I displayed graphs of smoothed pitch values about the zone. Basically, a pitch value is the change in run value when a particular pitch is thrown. Since we take the pitcher’s perspective, we define the pitch value to be the negative of the change in run value. For example, a called strike is an advantage to the pitcher, so the pitch value is positive. A hit will generally give an advantage to the hitter (a disadvantage to the pitcher) and a negative pitch value.
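As a small illustration of this sign convention, here is a minimal Python sketch of the arithmetic. The function name and the run values are hypothetical, chosen only to show the signs; they are not taken from an actual run expectancy table.

```python
def pitch_value(run_value_before, run_value_after):
    """Pitch value from the pitcher's perspective.

    Both arguments are run values from the batting team's perspective
    (the numbers used below are hypothetical).
    """
    return -(run_value_after - run_value_before)


# Hypothetical called strike: the batting team's run value drops,
# so the pitcher is credited with a positive pitch value.
called_strike = pitch_value(run_value_before=0.00, run_value_after=-0.04)

# Hypothetical single: the batting team's run value rises,
# so the pitch value is negative.
single = pitch_value(run_value_before=0.00, run_value_after=0.45)

print(round(called_strike, 2), round(single, 2))
```

A total pitch value for a pitcher and pitch type is then just the sum of these values over all of the pitches of that type.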
In the Visualizations of Pitch Value post, I displayed contours of pitch value for various pitch types such as four-seamers and sliders over the zone. In the Pitch Values, Pitch Types and Counts post, I showed how the pitch value over the zone varied by the count. For example, for a four-seamer, the region about the zone where a pitcher has a positive pitch value expands on two-strike counts. At the end of that post, I expressed the desire to explore pitch value for individual pitchers. My reluctance to do that was due to the lack of data — I didn’t think that one would get useful visualizations of pitch value for a specific pitcher using data for only one season.
Currently, we have eight seasons of Statcast data, from 2015 through 2022, so this post will explore pitch location and pitch value for leading pitchers in this Statcast era. For a particular pitch type, we use a FanGraphs leaderboard to identify the top three pitchers with respect to total pitch value in the 2015-2022 seasons. Then we compare pitch locations and pitch values for these top pitchers for each of the common pitch types.
FanGraphs Pitch Value Leaderboard
The FanGraphs site displays pitcher leaderboards for a large number of measures. We focus on the Statcast Pitch Value leaderboard. By selecting the range of seasons from 2015 through 2022, we see a table displaying the total pitch value for these eight seasons for a variety of pitch types.
In the snapshot of the leaderboard below, we see that Max Scherzer had a total pitch value of 137.2 for his four-seamers (variable wFA) and 125.1 for his sliders (wSL). Justin Verlander had a total pitch value of 33.6 on his curve balls (wCU).
For each of the pitch types four-seamers, curve balls, sliders, changeups, and sinkers, we use this leaderboard to identify the best three pitchers with respect to total pitch value. One can represent a pitcher’s total pitch value as the sum
$$\sum_{\text{loc}} f(\text{loc}) \, g(\text{PV} \mid \text{loc})$$
where $f(\text{loc})$ is the density of the location, $g(\text{PV} \mid \text{loc})$ is the density of the pitch value conditional on the location, and the sum is taken over all pitch locations. So both the likely pitch locations and the pitch values at pitch locations are relevant in obtaining a high total pitch value. We explore both the pitch locations and the regions of high pitch values across the zone for these leading pitchers.
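The split of total pitch value into a location part and a value part can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the pitches are simulated, the zone is cut into square bins, and the average pitch value within each bin stands in for the conditional part. Summing (number of pitches) x (share of pitches in the bin) x (average pitch value in the bin) over the bins recovers the pitcher's total pitch value.

```python
import random
from collections import defaultdict

random.seed(1)

# Simulated pitches (hypothetical data): location in feet and a pitch value.
pitches = []
for _ in range(2000):
    x = random.gauss(0.0, 0.8)      # horizontal location (plate_x)
    z = random.gauss(2.5, 0.8)      # vertical location (plate_z)
    pv = random.gauss(0.02 * z - 0.03 * abs(x), 0.2)  # made-up relationship
    pitches.append((x, z, pv))


def bin_of(x, z, width=0.5):
    """Assign a location to a square bin of the given width (feet)."""
    return (int(x // width), int(z // width))


# Collect pitch values by location bin.
by_bin = defaultdict(list)
for x, z, pv in pitches:
    by_bin[bin_of(x, z)].append(pv)

n = len(pitches)
total_from_bins = 0.0
for values in by_bin.values():
    share = len(values) / n               # location part: share of pitches in the bin
    mean_pv = sum(values) / len(values)   # value part: average pitch value in the bin
    total_from_bins += n * share * mean_pv

# Direct total: the sum of the pitch values over all pitches.
total_direct = sum(pv for _, _, pv in pitches)
print(abs(total_from_bins - total_direct) < 1e-9)
```

Two pitchers can therefore reach a similar total in different ways: one by concentrating pitches in bins where the average value is high, another by having a higher average value in the bins where pitches usually land.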
The Data and Graph Types
I collected all of the Statcast data for the eight seasons 2015 through 2022. For each pitch, we keep the variables pitcher, zone location (plate_x and plate_z), pitch type, batter side and pitch value. There are a total of 5,264,631 rows in this dataset corresponding to the number of pitches thrown in these eight seasons.
For a particular pitcher and pitch type, we display two graphs:
- A density estimate of the pitch locations about the zone. The yellow region corresponds to the area where 50% of the pitches are located and the green region is where 80% of the pitches are located.
- A generalized additive model fit is used to smooth the pitch values about the zone and a filled contour graph is used to display these smoothed pitch values. A coloring scheme is used so that red/orange corresponds to regions where the pitcher is very effective, yellow is neutral, and blue corresponds to regions where the hitter has an advantage.
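To make the 50% and 80% regions in the first graph concrete, here is a rough Python sketch of the idea behind a highest density region. It is a simplification under stated assumptions: the locations are simulated, and a grid of bin counts replaces the smooth bivariate density estimate used for the actual displays. Bins are added from most to least populated until the chosen share of pitches is covered.

```python
import random
from collections import Counter

random.seed(2)

# Simulated pitch locations (hypothetical data, in feet).
locations = [(random.gauss(0.0, 0.7), random.gauss(2.5, 0.7)) for _ in range(5000)]


def hdr_bins(points, prob, width=0.25):
    """Return the fewest grid bins holding at least `prob` of the points.

    Bins are added from most to least populated, which is a discrete
    analogue of a highest density region.
    """
    counts = Counter((int(x // width), int(z // width)) for x, z in points)
    target = prob * len(points)
    chosen, covered = set(), 0
    for cell, count in counts.most_common():
        if covered >= target:
            break
        chosen.add(cell)
        covered += count
    return chosen, covered / len(points)


region_50, cover_50 = hdr_bins(locations, 0.50)
region_80, cover_80 = hdr_bins(locations, 0.80)
print(len(region_50), len(region_80), round(cover_50, 3), round(cover_80, 3))
```

By construction the 50% region is nested inside the 80% region, which matches the way the yellow region sits inside the green region in the location graphs.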
Gerrit Cole, Justin Verlander and Max Scherzer are the top three pitchers with respect to total value on their four-seamers. The left graph below compares the pitch locations of four-seamers thrown by these pitchers; generally it appears that these four-seamers are targeted in the zone. Looking at the right display, we see some interesting differences in the pitch values thrown to left- and right-handed hitters.
- Generally these four-seamers are effective in the top and outside portions of the zone.
- Against right-handed hitters, Cole and Verlander are equally effective on four-seamers thrown in the outside region of the zone.
- Scherzer’s effectiveness against left-handed hitters seems best in the upper region of the zone. Verlander, in contrast, does well in the upper and lower regions against left-handed hitters.
The top three pitchers for total pitch value on curve balls during the Statcast era were Charlie Morton, Corey Kluber and Stephen Strasburg. Looking at the pitch locations, they follow the familiar upper-left to lower-right orientation for right-handed pitchers. Approximately half of these pitches fall in the lower right region outside of the zone. What about the associated pitch values?
- Against left-handed batters, Morton’s highest pitch values are low and outside. In contrast, Kluber’s highest values are outside and high-inside. Strasburg’s are highest middle and low in the zone.
- Against right-handed batters, Morton’s and Kluber’s best pitch values are low and outside. Strasburg’s best values are in the middle of the zone.
Clayton Kershaw, Jacob deGrom and Max Scherzer have the highest total pitch value on their sliders. Many of these pitches, like curve balls, fall outside of the zone. With respect to pitch value …
- Kershaw shows high value throughout the zone against left-handed hitters, low and inside against right-handed hitters.
- deGrom has high value in the middle of the zone against lefties, low and outside to righties.
- Scherzer has high pitch values towards the bottom of the zone against righties. Interestingly, against lefties, Scherzer is strong high in the zone, but weak in the lower-outside region (this is the blue region in the contour plot).
Kyle Hendricks, Luis Castillo and Zack Greinke achieve the highest total pitch value with changeups in the Statcast era. Similar to the other off-speed pitches, many of these changeups are located outside of the zone. With respect to pitch value …
- Hendricks has high pitch value low and outside (to both left and right-handed hitters).
- Castillo has a similar pattern to left-handers, but achieves good value low in the zone to right-handers.
- Greinke is unusual in that he has high pitch value low and inside to right-handed hitters.
Chris Bassitt, Sandy Alcantara and Zach Britton have the highest cumulative pitch value for sinkers. Generally, sinkers are thrown in the middle of the zone by these three pitchers, although Britton appears to throw lower in the zone. With respect to pitch value …
- Bassitt has highest value on the outside part of the zone (to both sides of hitters).
- Alcantara has high pitch value inside and low to both sides.
- Britton has good pitch value in the whole zone for left-handed hitters and low outside to right-handed hitters.
Here are some useful packages for constructing these displays.
- The geom_hdr() function from the ggdensity package is helpful for constructing the bivariate density estimates of pitch location. I like the fact that the yellow region has a clear interpretation of containing about 50% of the values.
- The compute_pitch_values() function from the CalledStrike package is used to compute the pitch values from a Statcast dataset.
- The gam() function from the mgcv package is used to fit a generalized additive model to the pitch value as a smooth function of the location. From the fit, the predicted pitch value is computed on a 50 x 50 grid of points, and the geom_contour_fill() function from the metR package is used to construct the contour graph of pitch values.
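The "smooth, then predict on a 50 x 50 grid" step can be sketched in a few lines of Python. This is not the gam() fit described above: as a stand-in it uses a simple Gaussian-kernel weighted average (Nadaraya-Watson regression), and the pitches, the bandwidth, and the grid limits are all hypothetical. The point is only to show how a smoothed surface on a grid is produced before it is passed to a contour plot.

```python
import math
import random

random.seed(3)

# Simulated pitches (hypothetical data): pitch value is higher low in the zone.
pitches = []
for _ in range(400):
    x = random.uniform(-1.5, 1.5)
    z = random.uniform(1.0, 4.0)
    pv = 0.05 * (2.5 - z) + random.gauss(0.0, 0.1)
    pitches.append((x, z, pv))


def smooth_at(x0, z0, data, bandwidth=0.5):
    """Gaussian-kernel weighted average of pitch values near (x0, z0)."""
    num = den = 0.0
    for x, z, pv in data:
        d2 = (x - x0) ** 2 + (z - z0) ** 2
        w = math.exp(-d2 / (2 * bandwidth ** 2))
        num += w * pv
        den += w
    return num / den


def grid(lo, hi, n):
    """n equally spaced values from lo to hi, inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


xs = grid(-1.5, 1.5, 50)
zs = grid(1.0, 4.0, 50)

# surface[i][j] is the smoothed pitch value at (xs[j], zs[i]).
surface = [[smooth_at(x0, z0, pitches) for x0 in xs] for z0 in zs]

low_mean = sum(surface[0]) / 50     # bottom row of the grid (z = 1.0)
high_mean = sum(surface[-1]) / 50   # top row of the grid (z = 4.0)
print(len(surface), len(surface[0]), low_mean > high_mean)
```

A kernel average is easy to write by hand, but it needs a bandwidth chosen by the user, whereas a generalized additive model estimates the amount of smoothing from the data, which is one reason to prefer the GAM for the actual displays.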
There are two aspects to throwing an effective pitch of a particular type, say a curve ball. First, the pitcher needs control: he needs to throw the pitch in a good location, say low in the zone. Second, the quality of the pitch is important, which includes the pitch speed and movement. A pitch value is essentially a measurement of the quality of the pitch. So an understanding of both the pattern of pitch locations and the pattern of pitch values at different locations is important in studying the effectiveness of a great pitcher.
These location and pitch value displays are potentially useful, but more experimentation is needed. Here are some questions for future study.
- These graphs use information from a large number of pitches of a particular type thrown by a pitcher. What is the minimum sample size to get useful displays?
- Can these displays be helpful in studying pitch value for a single pitcher season? If not, what alternative methods can be used to glean information about pitch value?