In last week’s post, we reviewed the reasons for the large increase in home run hitting during the Statcast period from 2015 to 2019. There are two primary reasons for the HR increase: changes in the composition of the ball and changes in batter behavior, specifically changes in the launch angles and exit velocities of balls put in play. Since both the changes to the ball and the changes in launch conditions contribute to the HR explosion, it is of interest to isolate the two effects. Using a model, we can describe the carry effects of the baseball for one season and predict a player’s home run performance in the following season as if the previous season’s ball were still being used. I thought it would be interesting to run this prediction exercise for all players. We are going to focus on two questions.
- First, are players generally changing their batting behavior to hit more home runs in each transition from one season to the following season?
- Second, it would be interesting to identify the “extreme” players who really boosted their home run production or really hurt their home run production by changes in launch conditions. This would help to add some quantitative evidence for the stories described in Jared Diamond’s Swing Kings book.
The Modeling Exercise
Let’s focus on two seasons, say 2015 and 2016. I focus on the roughly 200 players who had at least 200 balls in play (BIP) in both seasons.
- To get an understanding of the 2015 ball effect, I fit the generalized additive model
logit(Prob(HR)) = s(LA, EV) + b
where s(LA, EV) is a smooth function of launch angle (LA) and exit velocity (EV) and b is a random effect specific to each player. The player random effects are assumed normal with mean 0 and standard deviation sigma. Certainly, a player’s propensity to hit a home run depends on more than launch angle and exit velocity, and these random effects soak up the remaining sources of variability in home run hitting (such as spray angle) that distinguish players.
- We use this 2015 ball model to predict the probability of a HR for each BIP in the 2016 season using the 2016 launch variables. For each player, by summing these HR prediction probabilities, we can compute an expected number of 2016 home runs and also the expected HR per BIP.
- Obviously, the number of BIP for a player will differ between the two seasons. So we ask the question: if a player had 400 BIP, say, in the 2015 season, how many HR would we predict for the 2016 season in 400 BIP with the 2015 ball and the new 2016 launch conditions? We compute a Change measure defined as
Change = Expected Number of 2016 HR minus 2015 HR
This variable tells us how many additional or fewer HR this player hits due to changes in launch conditions from the 2015 to 2016 seasons. If Change is positive, this indicates that the batter’s swing behavior changed in a good way between the two seasons to hit more HR.
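The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the R code used for this post: `smooth_la_ev` is a made-up stand-in for the fitted smooth s(LA, EV), the function and argument names are hypothetical, and the launch values and counts are invented. Whether the player effect b is carried into the prediction is also an assumption here; it defaults to zero.

```python
import math

def smooth_la_ev(la, ev):
    """Hypothetical stand-in for the fitted 2015 smooth s(LA, EV), on the logit scale.

    A real analysis would take this surface from a fitted GAM. This toy version
    peaks near a 28-degree launch angle and rises with exit velocity.
    """
    return -4.0 + 0.25 * (ev - 95.0) - 0.02 * (la - 28.0) ** 2

def hr_probability(la, ev, player_effect=0.0):
    """Inverse logit of s(LA, EV) + b for a single ball in play."""
    eta = smooth_la_ev(la, ev) + player_effect
    return 1.0 / (1.0 + math.exp(-eta))

def change_measure(bip_next, hr_prev, n_bip_prev, player_effect=0.0):
    """Expected next-season HR, rescaled to last season's BIP, minus last season's HR.

    bip_next   : list of (launch angle, exit velocity) pairs from the next season
    hr_prev    : home runs actually hit in the previous season
    n_bip_prev : balls in play in the previous season
    """
    probs = [hr_probability(la, ev, player_effect) for la, ev in bip_next]
    expected_rate = sum(probs) / len(probs)   # expected HR per BIP
    expected_hr = expected_rate * n_bip_prev  # put on the previous season's BIP scale
    return expected_hr - hr_prev

# Invented data: four 2016 balls in play for a player with 20 HR in 400 BIP in 2015.
bip_2016 = [(28.0, 105.0), (10.0, 90.0), (30.0, 100.0), (45.0, 85.0)]
print(round(change_measure(bip_2016, hr_prev=20, n_bip_prev=400), 1))
```

Because the predicted rate is multiplied by the previous season's BIP, a player with many more BIP in the earlier season can get an expected HR count well above what he actually hit in the later one.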
We did this modeling exercise for each pair of consecutive seasons during the Statcast era. Below, for each comparison, we plot the previous season’s HR against the increase in HR production the next season due to changes in launch conditions. Remember that we have controlled for any ball effect, so the large or small changes are attributed solely to batter behavior.
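As a rough sketch of the bookkeeping behind these comparisons, the Python below builds the consecutive season pairs and picks out the players who meet the 200 BIP cutoff in both seasons of a pair. The function name and the one-row-per-ball-in-play input layout are assumptions made for this illustration.

```python
from collections import Counter

def eligible_players(bip_rows, season_a, season_b, min_bip=200):
    """Players with at least min_bip balls in play in both seasons.

    bip_rows: iterable of (player, season) tuples, one tuple per ball in play.
    """
    counts = Counter(bip_rows)            # BIP count for each (player, season)
    players = {player for player, _ in counts}
    return sorted(
        p for p in players
        if counts[(p, season_a)] >= min_bip and counts[(p, season_b)] >= min_bip
    )

# Consecutive season pairs in the Statcast period covered by this post.
seasons = [2015, 2016, 2017, 2018, 2019]
season_pairs = list(zip(seasons, seasons[1:]))
print(season_pairs)
```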
Before we look at individual players, let’s focus on the general pattern. Points above the red line correspond to players who had an expected increase in home run production due to changes in launch conditions. As a summary measure, we compute the percentage of points above the line and get this table. Among hitters who had at least 200 BIP in both 2015 and 2016, 64% were expected to hit more home runs in 2016 due to changes in launch conditions.
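The summary measure is simple to compute once each player has a Change value. Here is a minimal Python sketch, assuming the red line sits at a Change of zero; the values are invented and the function name is hypothetical.

```python
def percent_positive(changes):
    """Percentage of players whose Change value is above zero."""
    if not changes:
        raise ValueError("need at least one player")
    above = sum(1 for c in changes if c > 0)
    return 100.0 * above / len(changes)

# Invented Change values for five players: three of the five are positive.
changes = [3.2, -1.5, 0.8, 4.1, -0.3]
print(percent_positive(changes))  # 60.0
```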
Interestingly, we see a significant boost in HR production due to launch conditions in 2016, 2018, and 2019, but not in 2017. This observation is consistent with what our committee found in the MLB report in December 2019. Even if the carry of the ball doesn’t change in future seasons, I would still anticipate an overall increase in the HR total due to changes in batter behavior.
Of course, the unusually large positive or large negative player changes are of interest. Here are the players who exhibited the largest increase in home run production between two seasons due to launch conditions. It is interesting that three Red Sox players, Mookie Betts, Xander Bogaerts, and J.D. Martinez, are on this top-five list, although Martinez wasn’t a Red Sox player in 2017.
What Happened with Mookie Betts in 2018?
I have to comment on Mookie Betts since the table indicates that Mookie was predicted to hit 65 home runs in 2018. What is going on? (Note that Betts only hit 32 HR in 2018.) First, this calculation is based on his 2017 BIP and he had 120 additional BIP in 2017 compared with 2018. Second, Mookie really had an upswing in launch conditions in 2018. Looking at Betts’s Baseball Savant page, I see big increases in launch angle and exit velocity in 2018. MLBAM has a measure called barrels that summarizes desirable hard-hit balls at a good launch angle, and his barrel percentage was 14.1 in 2018 compared with 4.5 in 2017.
Other Special Players
On the other end, here are the five players who displayed the greatest drop in home runs due to weaker values of launch angle and exit velocity. I think some of these values are due to injuries (I am thinking about the 2019 Khris Davis) that had a negative impact on their batting behavior. Here’s a recent article that describes Davis as a rebound candidate.
Some Closing Comments
- With the availability of Statcast data, we are able to provide new measures of batting performance. Some examples of these new measures are the “expected” batting measures such as Expected BA, Expected wOBA, etc.
- These new measures can be used to isolate particular batting or pitching skills. For example, my Change measure goes one step further by controlling for season-to-season changes in the carry of the ball.
- I think there are many opportunities to develop new metrics of pitching and batting performance using Statcast data. I recently saw a paper, “The Prediction of Batting Averages in Major League Baseball” (available for download here), that uses regression methods and Statcast data to develop improved prediction methods.
- All of the R code for these computations is available on my GitHub Gist site.