Statcast Individual Count Effects


In last week’s post, I showed that balls put into play are significantly impacted by the count and the zone location.  In particular, I focused on launch speed: batters tend to hit the ball harder in favorable counts, and they likewise hit the ball harder on pitches in the middle of the zone.  The count and zone location also impact the launch angle of the batted ball.

At the end of the post I asked about individual (player) effects:

“I’ve explored situational effects of measures like batting average over the years.  Some situations like the home-away effect tend to be biases — they affect all of the players in the same way.  The challenge is to find what I call “ability effects”.  For example, look at the count effect.  Are there particular batters who react to the count in an unusual (good) way?  Can we find ability effects from Statcast data?”

In this post I will explore the player-effect issue where launch speed is the variable and the situation is the count.  I’ll compare “ahead in the count” situations (counts 1-0, 2-0, 2-1, 3-0, 3-1, 3-2) with “behind in the count” situations (counts 0-1, 0-2, 1-2), ignoring the neutral counts (0-0, 1-1, 2-2).  By the way, I defined the “ahead” and “behind” situations simply by observing the impact of these counts on launch velocities.
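As a minimal sketch of this grouping in Python: the set names and the function `count_situation` are illustrative and not taken from the original analysis, and the count is assumed to arrive as separate ball and strike totals.

```python
# Count groups as listed in the post; the names here are hypothetical.
AHEAD = {"1-0", "2-0", "2-1", "3-0", "3-1", "3-2"}
BEHIND = {"0-1", "0-2", "1-2"}
NEUTRAL = {"0-0", "1-1", "2-2"}


def count_situation(balls: int, strikes: int) -> str:
    """Return 'ahead', 'behind', or 'neutral' for a balls-strikes count."""
    count = f"{balls}-{strikes}"
    if count in AHEAD:
        return "ahead"
    if count in BEHIND:
        return "behind"
    if count in NEUTRAL:
        return "neutral"
    raise ValueError(f"not a valid count: {count}")


print(count_situation(2, 1))
print(count_situation(0, 2))
```

The three sets together cover all twelve possible counts, so any legal count falls into exactly one group.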

2017 Statcast Data

I start with the 2017 Statcast batting data.  I only consider the balls put in play and only consider players with at least 200 balls in play.  There are 286 batters in this group.  For each player I compute two means:  (1) the mean launch speed for “ahead” counts, M_ahead, and (2) the mean launch speed for “behind” counts, M_behind.  Below I construct a scatterplot of the count effect M_ahead - M_behind (vertical axis) against the average (M_behind + M_ahead) / 2 (horizontal axis).  (This is called a Tukey Mean-Difference plot.)  I’ve added a red line at y = 0 corresponding to no count advantage.
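The two plotted quantities can be computed along the following lines. This is only a sketch: the input layout (one tuple per batted ball), the function name `count_effects`, and the choice to apply the 200-ball threshold to all balls in play, neutral counts included, are assumptions rather than details given in the post.

```python
from collections import defaultdict
from statistics import mean


def count_effects(batted_balls, min_in_play=200):
    """Compute Tukey mean-difference coordinates for each player.

    batted_balls: iterable of (player, situation, launch_speed) tuples,
    where situation is 'ahead', 'behind', or 'neutral' (assumed layout).
    Returns {player: (average, difference)} with
      average    = (M_ahead + M_behind) / 2
      difference = M_ahead - M_behind
    """
    speeds = defaultdict(lambda: {"ahead": [], "behind": [], "neutral": []})
    for player, situation, launch_speed in batted_balls:
        speeds[player][situation].append(launch_speed)

    result = {}
    for player, groups in speeds.items():
        n_in_play = sum(len(values) for values in groups.values())
        # Skip players below the threshold or missing one of the two groups.
        if n_in_play < min_in_play or not groups["ahead"] or not groups["behind"]:
            continue
        m_ahead = mean(groups["ahead"])
        m_behind = mean(groups["behind"])
        result[player] = ((m_ahead + m_behind) / 2, m_ahead - m_behind)
    return result
```

A positive difference places a player above the line y = 0, meaning a higher mean launch speed when ahead in the count.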

Several observations:

  • There is a lot of variability in mean launch speed (think of Billy Hamilton and Nelson Cruz as extreme hitters at the low and high ends, respectively).
  • There tends to be a significant advantage to being ahead in the count.
  • But there is much variability in the count advantage: a number of hitters actually do worse when ahead in the count.  (That raises the question: are these negative effects reflective of the hitters’ abilities in this situation?)


Comparing Two Seasons

In Curve Ball (remember this book?) I distinguished between what I called an observed effect and an ability effect.  In the above plot, we see observed count effects.  Some batters hit 8 mph harder when they are ahead in the count and other players actually hit 2 mph softer when they are ahead in the count.  These observed effects are hard to interpret since there are two different sources for the variability.  Players have different abilities to do better or worse in the situation and those abilities can contribute to these effects.  But these observed effects are also affected by luck or chance variation.  The important question is “how much of the total variation in the observed effects is due to the difference in abilities?”   Generally we are more interested in player abilities than the noise or chance variation in the data.
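A small simulation can make the two sources of variability concrete. In this sketch each observed effect is a true ability effect plus independent chance noise; the standard deviations (1 mph for ability, 2 mph for chance) and the number of players are made-up values for illustration, not estimates from the Statcast data.

```python
import random
from statistics import pvariance


def simulate_observed_effects(n_players=2000, ability_sd=1.0, noise_sd=2.0, seed=1):
    """Simulate observed effect = ability effect + chance variation.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    abilities = [rng.gauss(0.0, ability_sd) for _ in range(n_players)]
    observed = [a + rng.gauss(0.0, noise_sd) for a in abilities]
    return abilities, observed


abilities, observed = simulate_observed_effects()
# Fraction of the variation in observed effects that comes from ability.
share = pvariance(abilities) / pvariance(observed)
print(f"share of observed variance due to ability: {share:.2f}")
```

With these settings the variance of the observed effects is roughly the sum of the ability variance and the chance variance, so only about a fifth of the spread reflects ability. In real data the abilities are not observed, which is why an indirect check is needed.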

One way to get some insight on the existence of ability splits is to look at the effects for two consecutive seasons.  If the effect is ability-driven, then one would anticipate that the observed values for one season would be highly correlated with the values for the second season.  Here I also collect the mean launch speeds for “regulars” (that is, players with at least 200 balls put in play) for the 2016 season and merge these data with the 2017 launch speeds.  Here is a scatterplot of the 2016 and 2017 mean launch speeds.  I see a strong relationship, which makes sense since we know that launch speed is a good measure of the ability of a hitter.  Notice the three points in the lower left section of the plot?  These players (including Billy Hamilton) don’t generate much velocity off of the bat and I’d predict that they would perform similarly in the 2018 season.
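The merge-and-compare step might look like the sketch below. It assumes each season is stored as a dictionary from player to a summary value (such as mean launch speed); the function names are hypothetical, and the Pearson correlation is used here as one reasonable measure of the strength of the relationship.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def season_to_season(first, second):
    """Correlate a measure across two seasons.

    first, second: {player: value}. Only players who appear in both
    seasons are kept, which mirrors merging the two data sets.
    """
    common = sorted(first.keys() & second.keys())
    return pearson([first[p] for p in common], [second[p] for p in common])
```

A correlation near 1 would point to a stable, ability-driven measure, while a value near 0 would suggest that the season-to-season differences are mostly chance.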


In passing we notice that a majority of the points tend to fall below the line y = x, indicating that the mean launch speeds in 2017 tend to be smaller than the values in 2016.  Looking closer, here is a histogram of the differences in mean launch speeds (2017 minus 2016).  We see that the histogram is centered at about -1, which means that, on average, a player’s mean launch speed for 2017 tends to be about one mph lower than his 2016 value.  (I wonder why?)
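The differences behind such a histogram can be tabulated as in this sketch, which again assumes each season is a dictionary from player to mean launch speed. The function names and the 0.5 mph bin width are illustrative choices.

```python
from collections import Counter
from math import floor


def season_differences(first, second):
    """Return second - first for each player present in both seasons."""
    return {p: second[p] - first[p] for p in first.keys() & second.keys()}


def histogram_counts(values, bin_width=0.5):
    """Count values per bin; each key is the left edge of its bin."""
    return Counter(floor(v / bin_width) * bin_width for v in values)
```

Calling `season_differences` with the 2016 values first and the 2017 values second gives negative numbers for players whose mean launch speed dropped, and the mean of those differences locates the center of the histogram.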


Persistence of Count Effects

I already know that launch speed is an ability characteristic of a hitter, but I am wondering if the count effect (that is, the difference in launch speeds between ahead and behind counts) is an ability characteristic.  I explore this by constructing a scatterplot of the 2016 count effect and the 2017 count effect, adding a best fitting line.  I note several things.  First, there is a lot of scatter, which indicates that the 2016 effect is not that highly associated with the 2017 effect.  For example, the players who had negative count effects for the 2016 season all had positive count effects for the 2017 season.  Second, note that the slope of the line is positive, which suggests there is a small ability effect for this particular count situation.
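A best fitting line of this kind can be obtained by ordinary least squares, as in the sketch below. The post does not say how its line was fit, so least squares is an assumption here, and the function name is hypothetical.

```python
def least_squares_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    xs: first-season count effects, ys: second-season count effects,
    given in the same player order. Returns (slope, intercept).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

The slope equals the correlation multiplied by the ratio of the second-season spread to the first-season spread, so when the two seasons have similar spreads a slope close to 0 points to mostly chance variation and a slope close to 1 points to a persistent effect.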


Summing Up

What did we learn?

  • I illustrated a general method of checking for ability effects by looking at the relationship between the effect measure for two consecutive seasons.
  • Most of the variability in the observed count splits in launch speeds appears to be due to chance and not much due to the players’ abilities to take advantage of the count.
  • Other variables such as the pitcher also affect these count splits.  It would be reasonable to think of this count effect largely as a bias that affects all players in the same way.  (The home/away split is one example of a bias situation.)
  • There appears to be a small ability of the players to take advantage of the count situation, since the effects for two consecutive seasons were slightly positively correlated.
  • I could have done this using launch angles.  Like launch speed, I believe each hitter has a launch angle ability, but I doubt that hitters have launch angle ability splits.
  • This post was exploratory without the use of any models.  One can explicitly measure the size of these ability effects by the use of a Bayesian multilevel model and I’ll talk about how this works in a follow-up post.