At the 2021 SABR Analytics meeting, I heard an interesting talk by Adi Wyner titled “Is the 3rd Time Through the Order Effect Real? Correcting for Lineup Order and Pitcher Quality Selection Bias”. This relates to the story of Game 6 of the 2020 World Series between the Rays and Dodgers. (This story was also discussed by Brian Kenny in his opening comments for the SABR meeting and is mentioned in an SI story on Francisco Lindor.) Blake Snell, the Game 6 starter, pitched great over 5 1/3 innings and was about to face the top of the Dodgers order for the third time. The Rays’ manager Kevin Cash made the decision to take Snell out of the game. Cash’s rationale was that Snell was likely to have difficulty facing the Dodgers lineup for the third time. This decision was controversial since Snell had thrown only 73 pitches and had struck out 9 batters in 5 1/3 innings. (Snell was clearly pitching well in this WS game with a modest pitch count.) The pitcher who replaced Snell, Nick Anderson, did not pitch well, and this bad performance led to a 3-1 win for the Dodgers that clinched the 2020 World Series.
The SABR talk and the World Series story both relate to the so-called Times Through the Order (TTTO) effect in baseball. The common belief is that pitchers tend to perform worse when they face batters for a 2nd or 3rd time. This discussion raises several questions:
- Do starting pitchers in MLB generally perform worse when they face batters for the 2nd and 3rd times?
- What type of variation do we see among pitchers in these TTTO effects?
- Do the TTTO effects depend on the batter lineup?
- Are these observed TTTO effects real? Can we use the observed effects to predict a pitcher’s performance in future appearances?
In this post, I’ll look at the most recent complete season (2019) and explore these TTTO effects for starting pitchers. We’ll look for general effects and look at specific pitchers (including Blake Snell) who may exhibit extreme effects.
Measure of Batting Performance
There are many ways to measure batting performance. Here I will use weighted on-base average (wOBA) since it is a good measure and easy to compute from the EVENT_CD variable in the Retrosheet dataset.
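As a minimal sketch of how a wOBA value can be attached to each plate appearance, here is a small Python lookup keyed by the Retrosheet EVENT_CD codes (14 = walk, 15 = intentional walk, 16 = hit by pitch, 20 to 23 = single through home run). The weights below are assumed illustrative values for 2019, and the function names are hypothetical; the weights actually used in this post are the ones in the wba_wts.csv file.

```python
# Assumed, illustrative 2019 wOBA weights keyed by Retrosheet EVENT_CD.
# The weights actually used in the post are stored in wba_wts.csv.
WOBA_WEIGHTS = {
    14: 0.690,  # walk
    16: 0.719,  # hit by pitch
    20: 0.870,  # single
    21: 1.217,  # double
    22: 1.529,  # triple
    23: 1.940,  # home run
}

# Intentional walks are conventionally left out of wOBA, so this sketch
# drops them from both the numerator and the denominator.
EXCLUDED = {15}


def woba_value(event_cd):
    """wOBA weight for one plate-appearance-ending event (None if excluded).

    Assumes the input has already been filtered to events that end a
    plate appearance; any other code (outs, strikeouts, ...) scores 0.
    """
    if event_cd in EXCLUDED:
        return None
    return WOBA_WEIGHTS.get(event_cd, 0.0)


def mean_woba(event_cds):
    """Average wOBA value over a sequence of EVENT_CD codes."""
    values = [v for v in map(woba_value, event_cds) if v is not None]
    return sum(values) / len(values) if values else float("nan")
```

For example, a strikeout, a single, a walk, a generic out, and a home run (codes 3, 20, 14, 2, 23) average to a wOBA of about 0.700 under these assumed weights.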
Data and Variables
I collected Retrosheet play-by-play data for the 2019 season and identified the starters for all games in this season. I created a dataset that contains, for each plate appearance faced by each starter in each game:
- the game id
- the pitcher id
- the batter lineup id (1 through 9)
- the wOBA value for the outcome of that plate appearance
From this dataset, I can create three new variables:
- wOBA1, the value of wOBA the first time a batter faces the pitcher
- wOBA2, the value of wOBA the second time a batter faces the pitcher
- wOBA3, the value of wOBA the third time a batter faces the pitcher
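One way these three variables could be derived is to count, within each game and pitcher, how many times each lineup position has come to bat. The sketch below does this in plain Python; the dictionary keys (`game_id`, `pitcher_id`, `lineup_slot`, `woba`) and the function names are hypothetical, and the input is assumed to be in game order.

```python
from collections import defaultdict


def add_times_through_order(plate_appearances):
    """Tag each plate appearance with the number of times that lineup
    slot has now faced the starter in that game (1, 2, 3, ...).

    plate_appearances: list of dicts in game order with the hypothetical
    keys 'game_id', 'pitcher_id', 'lineup_slot', and 'woba'.
    """
    seen = defaultdict(int)
    tagged = []
    for pa in plate_appearances:
        key = (pa["game_id"], pa["pitcher_id"], pa["lineup_slot"])
        seen[key] += 1
        tagged.append({**pa, "tto": seen[key]})
    return tagged


def mean_woba_by_tto(tagged):
    """Mean wOBA for each time through the order (wOBA1, wOBA2, ...)."""
    totals = defaultdict(lambda: [0.0, 0])
    for pa in tagged:
        cell = totals[pa["tto"]]
        cell[0] += pa["woba"]
        cell[1] += 1
    return {tto: total / n for tto, (total, n) in totals.items()}
```

Counting by lineup slot rather than by batter id means a pinch hitter is treated as a continuation of the slot he replaces; that is a simplification in this sketch, not necessarily the rule used for the results in this post.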
Number of Batters Faced
Before we look at TTTO effects, we should first explore the number of batters starters faced in the 2019 season. I have displayed a histogram of the batters faced below. Since we are interested in times through the lineup, I have added vertical lines at the values 9, 18, and 27, which correspond to 1, 2, and 3 complete passes through the lineup. We see a number of starts that don’t make it through the lineup twice. In most starts, the pitcher faces between 18 and 27 batters. This means that most starting pitchers face each batter twice and some batters three times. It is unusual for a starter to face a batter four times.
These observations have implications for the TTTO exploration that follows. We’ll be averaging wOBA values over the 1st, 2nd, and 3rd TTTO plate appearances. Since starters tend to be replaced before they face 27 batters, the average wOBA for the third time through the lineup is computed over a reduced sample that is weighted toward batters at the top of the lineup. This will bias any comparison unless one adjusts for the batting order position.
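To make the lineup-position imbalance concrete, the following sketch counts which lineup slots a starter reaches for the third time, given only the number of batters he faced. It relies on the fact that a starter begins with the leadoff hitter and the order cycles through slots 1 to 9; the function names and the example batters-faced values are made up for illustration.

```python
from collections import Counter


def third_pass_slots(batters_faced):
    """Lineup slots (1-9) the starter faced for a third time.

    Batters 19 through 27 are the third pass, in slots 1 through 9.
    """
    extra = max(0, min(batters_faced, 27) - 18)
    return list(range(1, extra + 1))


def third_pass_counts(batters_faced_per_start):
    """Number of third-time plate appearances by lineup slot over many starts."""
    counts = Counter()
    for bf in batters_faced_per_start:
        counts.update(third_pass_slots(bf))
    return dict(sorted(counts.items()))


# Four hypothetical starts with 20, 22, 25, and 27 batters faced.
print(third_pass_counts([20, 22, 25, 27]))
# {1: 4, 2: 4, 3: 3, 4: 3, 5: 2, 6: 2, 7: 2, 8: 1, 9: 1}
```

In this toy example the leadoff slot contributes four third-time plate appearances while the ninth slot contributes only one, which is the kind of imbalance described above.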
TTTO Effects from the First Appearance
Let’s first focus on the change in wOBA from the first appearance. For all 2019 starting pitchers with at least 500 batters faced, I computed the difference between the mean wOBA for the 2nd and 1st appearances through the lineup and also the difference between the mean wOBA for the 3rd and 1st appearances, so a positive difference means that batters did better later in the game. A scatterplot of these differences is shown below. Note that most points are located in the upper-right quadrant where both differences are positive, so in general there appears to be a TTTO effect. I have labeled some unusual points. Jack Flaherty and Aaron Sanchez have high values for both differences. Chris Archer does remarkably well on the 3rd appearance compared to the 1st, and Jakob Junis does well on the 2nd appearance compared to the 1st. Note that Blake Snell has a modest change from the 1st to the 2nd appearance, but a larger change from the 1st to the 3rd appearance. (Maybe that is the justification for removing Snell in the 2020 World Series game.)
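A rough sketch of this pitcher-level computation in Python is given below. It assumes each plate appearance is a dict that already carries the hypothetical keys `pitcher_id`, `tto` (time through the order), and `woba`, and it assumes each difference is taken as the later appearance minus the first; the function name is also hypothetical.

```python
from collections import defaultdict


def tto_differences(tagged, min_batters_faced=500):
    """Return {pitcher_id: (wOBA2 - wOBA1, wOBA3 - wOBA1)}.

    tagged: iterable of dicts with hypothetical keys 'pitcher_id',
    'tto' (1, 2, 3, ...), and 'woba'. Pitchers below the batters-faced
    threshold, or with no data for some pass, are skipped.
    """
    sums = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))
    faced = defaultdict(int)
    for pa in tagged:
        faced[pa["pitcher_id"]] += 1
        cell = sums[pa["pitcher_id"]][pa["tto"]]
        cell[0] += pa["woba"]
        cell[1] += 1

    result = {}
    for pitcher, by_tto in sums.items():
        if faced[pitcher] < min_batters_faced:
            continue
        if any(by_tto[t][1] == 0 for t in (1, 2, 3)):
            continue
        mean = {t: by_tto[t][0] / by_tto[t][1] for t in (1, 2, 3)}
        result[pitcher] = (mean[2] - mean[1], mean[3] - mean[1])
    return result
```

Plotting the two values in each tuple against each other would give a scatterplot of the kind described above, with positive values indicating that batters did better on the later appearance.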
Sequential TTTO Effects
Another way to explore this data is to consider the difference in mean wOBA between the 1st and 2nd appearances, and the difference in mean wOBA between the 2nd and 3rd appearances. A scatterplot of these differences is below. Here we observe an interesting negative association. Pitchers who appear to do better (on average) on the 2nd appearance (compared to the 1st) tend to do worse on the 3rd appearance compared to the 2nd. This pattern is similar to what one sees in standard illustrations of the regression effect: extreme performances in one season tend to move toward the average in the following season.
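Some negative association is to be expected from the way the two differences are built, since the mean wOBA for the 2nd appearance enters both of them with opposite signs. The short derivation below makes this explicit under a simplifying assumption, adopted here only for illustration, that the three observed means for a pitcher have independent sampling errors with variances $\sigma_1^2$, $\sigma_2^2$, and $\sigma_3^2$ (the symbols are introduced for this sketch).

```latex
\begin{aligned}
D_{12} &= \bar{w}_2 - \bar{w}_1, \qquad D_{23} = \bar{w}_3 - \bar{w}_2, \\
\operatorname{Cov}(D_{12}, D_{23})
  &= \operatorname{Cov}(\bar{w}_2, \bar{w}_3) - \operatorname{Var}(\bar{w}_2)
   - \operatorname{Cov}(\bar{w}_1, \bar{w}_3) + \operatorname{Cov}(\bar{w}_1, \bar{w}_2) \\
  &= -\sigma_2^2, \\
\operatorname{Corr}(D_{12}, D_{23})
  &= \frac{-\sigma_2^2}{\sqrt{(\sigma_1^2 + \sigma_2^2)(\sigma_2^2 + \sigma_3^2)}}.
\end{aligned}
```

If the three variances were equal, this correlation would be $-1/2$ even with no real differences among pitchers, so a negative pattern in the scatterplot is not by itself evidence of a real effect.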
Association Between Pitcher Quality and TTTO Effects
Do better pitchers tend to have a particular pattern of TTTO effects? Below I have plotted the change in the mean wOBA from the 1st to 3rd appearances against the overall wOBA mean. I don’t see any association here between the pitcher quality and the TTTO effect.
It is helpful to summarize what we have shown and what we have not shown in this exploration of TTTO effects.
- There clearly is a general observed effect between the 1st and 2nd time through the lineup. There is also a general effect between the 1st and 3rd time through the lineup, but there is likely some bias in this computation since the mean wOBA over the 3rd time through the lineup is taken over an incomplete lineup that is weighted toward the top-of-the-lineup hitters.
- To address the bias issue, one probably should adjust these mean wOBA calculations by the lineup position. This could be done using a regression approach.
- Although we have observed some general TTTO effects, we haven’t talked about the underlying pitcher talents related to times through the order. By the use of a Bayesian model, one can divide the total variation in TTTO effects into components that are real (pitcher talent) and those that are attributable to luck variation.
- The SABR talk by Adi Wyner described an interesting application of a Bayesian model to learn more about the real TTTO effects. I haven’t seen the paper yet, but one takeaway from their work is that the deterioration of a pitcher’s performance doesn’t appear to start abruptly at the 3rd time through the lineup, but rather is a gradual trend during the 2nd time through the lineup.
- In any event, it is helpful to gain a better understanding of TTTO effects since they directly impact a manager’s decision on whether or not to replace a starter at a particular time during a game.
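As a rough illustration of the lineup-position adjustment mentioned in the list above, the sketch below uses direct standardization, a simpler stand-in for the regression approach: it averages wOBA within each lineup slot first and then gives every available slot equal weight. The dictionary keys (`tto`, `lineup_slot`, `woba`) and the function name are hypothetical.

```python
from collections import defaultdict


def slot_adjusted_mean_woba(tagged, slots=range(1, 10)):
    """Mean wOBA per time through the order with equal weight per lineup slot.

    tagged: iterable of dicts with hypothetical keys 'tto', 'lineup_slot',
    and 'woba'. Slots with no data for a given pass are left out of that
    pass's average, which is a limitation of this simple approach.
    """
    cells = defaultdict(lambda: [0.0, 0])
    for pa in tagged:
        cell = cells[(pa["tto"], pa["lineup_slot"])]
        cell[0] += pa["woba"]
        cell[1] += 1

    adjusted = {}
    for tto in sorted({t for t, _ in cells}):
        slot_means = [
            cells[(tto, s)][0] / cells[(tto, s)][1]
            for s in slots
            if (tto, s) in cells
        ]
        adjusted[tto] = sum(slot_means) / len(slot_means)
    return adjusted
```

Because the 3rd-time sample is dominated by the top of the order, comparing these slot-weighted means across passes removes some of the lineup imbalance, although a regression with lineup position as a covariate would use the sparse cells more efficiently.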
An R Markdown file containing all of the R work for this exercise can be found on my Github Gist site. This site also contains a file wba_wts.csv that gives the wOBA weights for the 2019 season for each of the Retrosheet EVENT_CD values.