MLB games are getting longer, and one contributing factor is the increasing length of plate appearances as measured by the number of pitches. It seems that “plate discipline” is becoming an important attribute of a hitter; for example, it is currently rare for a hitter to swing at the first or second pitch. A plate appearance consists of movements or transitions between the various counts (0-0, 1-0, 0-1, 1-1, 2-0, etc.), and one can understand the dynamics of a plate appearance through the probabilities of these count transitions. In this post, we’ll use Retrosheet play-by-play data to compute these transition probabilities and see how they have changed over the last 20 seasons of baseball. This work motivates two new pitch metrics that we’ll use to compare pitchers for the 2019 season.
Graph of the Transition Probabilities
In a plate appearance, one starts with a 0-0 count and moves through the various counts until there is a strikeout, a walk/hit-by-pitch, or a ball put into play. We’ll lump all of these outcomes into a state called “end of PA”. Including the beginning (0-0) and end states, there are 13 different states of a plate appearance.
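To make the state space concrete, here is a minimal Python sketch (not the author's R code) that enumerates the 13 states and lists which states can follow each count. The names `END`, `COUNTS`, `STATES`, and `successors` are hypothetical helpers introduced only for this illustration.

```python
END = "end of PA"

# The 12 counts, written balls-strikes: 0-3 balls crossed with 0-2 strikes.
COUNTS = [f"{b}-{s}" for b in range(4) for s in range(3)]
STATES = COUNTS + [END]


def successors(count):
    """Return the set of states reachable from `count` in one pitch."""
    balls, strikes = map(int, count.split("-"))
    nxt = {END}  # a ball in play or hit-by-pitch can end the PA from any count
    if balls < 3:
        nxt.add(f"{balls + 1}-{strikes}")  # added ball (ball four is covered by END)
    if strikes < 2:
        nxt.add(f"{balls}-{strikes + 1}")  # added strike, including a foul
    else:
        nxt.add(count)  # a foul with two strikes leaves the count unchanged
    return nxt


print(len(STATES))
print(sorted(successors("1-1")))
```

Only the two-strike counts can transition to themselves, which is why the graph needs an extra number above those squares.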
Using data from the 2020 season, here is a graph of these transition probabilities. The 12 possible counts (0-0, 1-0, 0-1, etc.) are represented by red solid squares and the connecting lines represent transitions between the counts. We use brown lines to represent transitions where one ball is added to the count, and green lines represent transitions where one strike is added to the count. The red numbers along the lines give the probabilities of these transitions and the bold blue numbers give the probabilities of an “end of PA” event. To help understand this graph, look at the 1-1 count. We see there is a 0.36 probability of moving to a 2-1 count, a 0.45 probability of moving to a 1-2 count, and a 0.2 probability of ending the PA with a ball in play (these sum to 1 up to rounding). On two-strike counts, the red number above the square represents the probability of a foul and remaining at the same two-strike count.
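As a rough illustration of how such probabilities can be estimated, the Python sketch below (not the author's R code) walks Retrosheet-style pitch strings through the counts and tabulates the observed transitions. The grouping of pitch codes into balls, strikes, fouls, and PA-ending events is a simplifying assumption of this sketch, and `count_path` and `transition_probs` are hypothetical names.

```python
from collections import Counter, defaultdict

END = "end of PA"

# Simplified grouping of Retrosheet pitch codes (an assumption of this sketch).
BALLS = set("BIPV")        # ball, intentional ball, pitchout, automatic ball
STRIKES = set("CSKMQTLO")  # called/swinging strikes, foul tips, bunt strikes
FOULS = set("FR")          # ordinary fouls: no third strike on these
ENDS = set("XYH")          # ball in play or hit batter


def count_path(pitches):
    """Convert a pitch string such as 'CBFX' into the counts it visits."""
    balls = strikes = 0
    path = ["0-0"]
    for code in pitches:
        if code in ENDS:
            path.append(END)
            break
        if code in BALLS:
            balls += 1
        elif code in STRIKES:
            strikes += 1
        elif code in FOULS:
            strikes = min(strikes + 1, 2)
        else:
            continue  # pickoff throws, markers, unknown codes: count unchanged
        if balls == 4 or strikes == 3:
            path.append(END)
            break
        path.append(f"{balls}-{strikes}")
    return path


def transition_probs(sequences):
    """Estimate P(next state | current count) from a list of pitch strings."""
    tallies = defaultdict(Counter)
    for seq in sequences:
        path = count_path(seq)
        for here, there in zip(path, path[1:]):
            tallies[here][there] += 1
    return {here: {there: n / sum(c.values()) for there, n in c.items()}
            for here, c in tallies.items()}


# Toy input for illustration only, not real Retrosheet data.
print(transition_probs(["X", "B", "C", "CX"])["0-0"])
```

Applied to a full season of pitch sequences, the resulting nested dictionary would hold one row of probabilities for each count.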
Here are some interesting comments from examining this graph.
- From most counts, it is more likely to have an added strike than an added ball.
- It is unlikely for the plate appearance to end after one, two, or three pitches.
- Even at a 3-2 count, there is a good chance (probability of 0.27) of remaining at 3-2 and only a 0.73 probability of ending the PA on that pitch.
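The 3-2 observation can be pushed one step further. If we assume, as a Markov chain model would, that every pitch thrown at 3-2 has the same 0.27 chance of leaving the count at 3-2, then the number of pitches thrown at that count is geometric. The short Python sketch below works out the implied numbers under that assumption.

```python
p_stay = 0.27       # probability of remaining at 3-2 (from the 2020 graph)
p_end = 1 - p_stay  # probability that the pitch ends the PA

# Mean of a geometric distribution: expected pitches thrown at 3-2.
expected_pitches = 1 / p_end

# Chance that a 3-2 count lasts at least three pitches (two straight fouls).
prob_three_or_more = p_stay ** 2

print(round(expected_pitches, 2))
print(round(prob_three_or_more, 3))
```

Under this assumption, a plate appearance that reaches 3-2 sees about 1.37 pitches at that count on average.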
How have these transition probabilities changed, say, in the last 20 years? Here is a graph of the transition probabilities for the 2000 season.
Comparing the 2000 and 2020 season transition probabilities, there are a couple of things to notice. First, the chance of an additional strike for zero- and one-strike counts is greater in 2020. For example, on a 1-1 count, the probability of an added strike is 0.45 in 2020 compared with 0.40 for the 2000 season. Second, for the zero- and one-strike counts, the probability of a ball in play has dropped over the 20 seasons. For example, for a 1-1 count, the probability of a ball in play is 0.2 in 2020 compared to 0.22 in 2000.
A Historical Look
There are two primary takeaways from our comparison of the 2000 and 2020 seasons. First, the probability of an added strike on a 0- or 1-strike count has increased in 2020. Second, the probability of ending the PA on a 0- or 1-strike count has decreased in 2020. To see if there is a general pattern, we used the Retrosheet files to find all of the transition probabilities for the seasons from 2001 through 2020. Since we are focusing on changes, we compute two metrics for all counts: (1) the change in the probability of an added strike since the 2001 season and (2) the change in the probability of an “end of PA” event since 2001.
We plot the change in the probability of an added strike against the season below for all 0- and 1-strike counts and use loess smoothing curves to show the general pattern. We see that the probability of an added strike has steadily increased for all 0- and 1-strike counts over this 20-year period. The greatest changes seem to occur for the three-ball counts. For example, on a 3-1 count, the probability of an added strike has increased by 0.06 over this twenty-season period.
Here is a similar graph that plots the change (since 2001) in the probability of an “end of PA” event for all starting counts. For zero- or one-strike counts, the probability of an end of PA event has been steadily decreasing. For example, the chance of an end of PA event for a starting 3-1 count has decreased by over 0.05. For two-strike counts (0-2, 1-2, 2-2, 3-2), there has been little change in the probability of an end of PA event.
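Both change metrics amount to subtracting the baseline season's probability from each later season's probability, count by count. Here is a minimal Python sketch of that step for a single count. The function name `change_since` is hypothetical, and the season values are made-up placeholders, not Retrosheet results.

```python
def change_since(prob_by_season, baseline):
    """Return each season's probability minus the baseline season's value."""
    base = prob_by_season[baseline]
    return {season: round(p - base, 3)
            for season, p in sorted(prob_by_season.items())}


# Placeholder values for one count, for illustration only.
added_strike = {2001: 0.40, 2002: 0.41, 2003: 0.43}

print(change_since(added_strike, baseline=2001))
```

The baseline season always maps to zero, so every curve in a plot of these changes starts from the same point.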
Pitch Transitions for Individual Pitchers
We are very familiar with the strikeout leaders in baseball. Motivated by the count transition work, it might be interesting to look for pitchers who are extreme with respect to two transition metrics:
- Added Strike Rate = the probability of an added strike on a count with 0 or 1 strikes. (Notice that because two-strike counts are excluded, this probability does not include an added strike that produces a strikeout.)
- In Play Rate = the probability of a ball in play on a count with 0 or 1 strikes
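Here is one way these two rates might be computed for a single pitcher, sketched in Python rather than the author's R. It assumes each pitch has already been labeled with the count before the pitch and a coarse result. Pooling all 0- and 1-strike pitches into one denominator is an assumption of this sketch, and `early_count_rates` is a hypothetical name.

```python
def early_count_rates(pitches):
    """Return (added_strike_rate, in_play_rate) over 0- and 1-strike counts.

    `pitches` is an iterable of (count_before, result) pairs, where
    count_before looks like "1-1" and result is "ball", "strike",
    "in_play", or "other". A foul on a 0- or 1-strike count adds a
    strike, so the caller should label it "strike".
    """
    eligible = [result for count, result in pitches
                if int(count.split("-")[1]) < 2]  # drop two-strike counts
    if not eligible:
        raise ValueError("no pitches on 0- or 1-strike counts")
    n = len(eligible)
    return eligible.count("strike") / n, eligible.count("in_play") / n


# Toy pitches for illustration only; the 0-2 pitch is excluded.
toy = [("0-0", "strike"), ("0-1", "ball"), ("1-1", "in_play"),
       ("0-2", "strike"), ("1-0", "strike")]
print(early_count_rates(toy))
```

Because both rates share the same denominator, a pitcher who gets more added strikes necessarily leaves less room for balls in play.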
Below I have constructed a scatterplot of these two metrics for all pitchers who threw at least 2000 pitches in the 2019 season. As one might anticipate, the two metrics have a negative association: a high added-strike rate is generally associated with a low in-play rate. I’ve identified the pitchers (CP, Jd, GC, JV, MS) who are best at getting an additional strike, and I suspect that the interested reader can identify them from their initials. I would anticipate that these best added-strike pitchers are also among the best in strikeout rate. The pitchers with low added-strike rates are less familiar: they are Antonio Senzatela (AS), Dakota Hudson (DH), and Brett Anderson (BA).
(By the way, if you are still wondering about the identities of the high added-strike pitchers, they are Gerrit Cole, Jacob deGrom, Chris Paddack, Max Scherzer and Justin Verlander.)
Related Posts on Count Patterns
The count is an important aspect of a modern baseball game. Over the years, I’ve written different posts focusing on various aspects of count patterns within a plate appearance.
- Graph of Pitch Count Transitions. Here I show how one can model these count transitions by use of a Markov Chain.
- Number of Pitches in a PA. Here I explore the runs benefit from a batter’s perspective of lengthening the plate appearance.
- Sequences of Pitch Counts. Here I dive into the R code for computing these count transitions from Retrosheet data.
- 2018 Retrosheet Data and Length of a PA. Here I explore the hitter advantage when he puts a ball in play for different counts and different lengths of plate appearances.
I am working on an R package, PitchSequences, on my GitHub site, where I have collected my functions for working with pitch sequences. Given a Retrosheet play-by-play data frame, one can compute all of the count transition probabilities and produce the transition graph above by use of the function