World Series Game 6
The World Series ended last night with the Dodgers defeating the Rays 3-1. All of the post-game talk is about the decision by the Rays’ manager to remove Blake Snell from the game in the 6th inning. Snell was replaced by Nick Anderson, and the outcome was not good for the Rays: the Dodgers quickly scored two runs in that half-inning.
Much of the media, including the Fox announcers, was critical of the Rays’ use of analytics in making that decision. The evidence behind the move was that Snell historically does not pitch as well when he faces a lineup for the 3rd time, and that Anderson was arguably the Rays’ best relief pitcher in the 2020 season, although he had struggled in the playoffs.
Don’t Trash Analytics
I think baseball has generally benefited from the use of analytics in making decisions on player personnel and game strategy. But two cautions should be highlighted. First, it is easy to misuse analytics: for example, a team may ask the wrong question, so the answer it gets does not address the decision at hand. Second, analytics can’t address every game decision a manager makes. In many situations, one has to combine analytics with subjective information available at the time. Some of this subjective information may be difficult to quantify, but it is important and useful.
Last Night’s Decision
Let’s illustrate these two cautions in the context of last night’s game.
The Wrong Question. The available data apparently told us that Blake Snell doesn’t pitch as well when facing a lineup for the third time. I don’t doubt this information. But this data answers the question: “On average, how does Snell perform when facing the lineup for the third time?” The relevant question is “How will Snell perform when he faces the Dodgers IN THIS GAME the third time through the lineup?” This is a different question, since the relevant data includes both Snell’s historical performance and his performance in last night’s game. One should be aware of the variability in his pitching performances from game to game, and last night we were observing one of his best. So we are not really interested in his average performance but rather in his performance in one of his best games. I think the Rays’ manager was basing his decision on the wrong question. The same goes for the replacement: although Nick Anderson had a great regular season, he appeared to struggle in the playoffs. Kevin Cash should have been less interested in the “average” Nick Anderson and more interested in predicting Anderson’s performance that night, recognizing that his performance varies from game to game and that his most recent playoff outings are the most relevant.
Using Subjective Information. Many people seem to focus only on the information provided by data, leaving no room for subjective input in making decisions. Scouts are aware of the usefulness of subjective input: visual information is often combined with measurements like pitch speed in making decisions. Last night, we were watching a dominant performance by Blake Snell, and aspects of that performance, such as his clear dominance over the hitters, may have been difficult to quantify. But they were relevant. So even though the historical data was telling the Rays’ manager to replace Snell, this additional subjective information about his performance in this game might have caused the manager to think twice. Of course, Bayesians recognize the usefulness of subjective information, which we formalize by constructing a prior distribution. But I believe there was additional information available during the game that may have been ignored in making the pitching decision.
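To make the idea of combining historical data with what we see during the game concrete, here is a minimal sketch of a Bayesian update in Python. Everything in it is an assumption made for illustration: the names BetaBelief, prior_from_history, and update are hypothetical, and the numbers (a 20% historical strikeout rate the third time through, a prior worth 50 batters, 9 strikeouts in 18 batters tonight) are invented, not Snell’s actual statistics. It also simplifies by treating tonight’s earlier batters as directly informative about the next ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Beta(a, b) belief about a pitcher's strikeout probability."""
    a: float
    b: float

    @property
    def mean(self) -> float:
        return self.a / (self.a + self.b)


def prior_from_history(rate: float, weight: float) -> BetaBelief:
    """Encode a historical rate as a prior worth `weight` pseudo-batters."""
    return BetaBelief(rate * weight, (1.0 - rate) * weight)


def update(belief: BetaBelief, strikeouts: int, batters: int) -> BetaBelief:
    """Conjugate beta-binomial update using the batters faced tonight."""
    return BetaBelief(belief.a + strikeouts, belief.b + batters - strikeouts)


# HYPOTHETICAL inputs, chosen only to illustrate the calculation.
prior = prior_from_history(rate=0.20, weight=50)      # long-run average
posterior = update(prior, strikeouts=9, batters=18)   # tonight's evidence

print(f"prior mean:     {prior.mean:.3f}")
print(f"posterior mean: {posterior.mean:.3f}")
```

With these made-up inputs, the posterior mean falls between the historical average and tonight’s observed rate. How far it moves depends on the weight given to the prior, and that weight is exactly where a manager’s subjective judgment about how much tonight’s dominance matters would enter.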
Don’t Throw Away Analytics, Just Do It Better
The point of this post is that last night’s decision doesn’t mean we should use analytics less in baseball. Instead, it tells us that we should use analytics better in decision-making. Although data is potentially useful, it can be misused. As Bill James would say, the most important thing is to ask the right question, and perhaps the Rays’ manager wasn’t addressing the right question. This was a prediction problem, not an estimation problem where one is interested in the long-term average. We should also think harder about how to use visual information, or other information collected during the game, in making decisions.
Some Additional Comments
After I wrote this post, I was curious about Blake Snell’s pattern of pitching in the 2020 season. I learned a few relevant facts:
- Snell had never gone further than the 6th inning in any of his previous 11 starts, so taking him out in the 6th inning was not that surprising.
- I looked at how Snell’s strikeout rate varied by inning in his 2020 starts. In the first two innings, his strikeout rate was in the 38-40% range, but it dropped to about 20% in the 5th and 6th innings. So Snell did show some deterioration in the later innings of his 2020 starts, which further justifies the 6th-inning hook.
- It seems the bigger question is why Snell was replaced by Nick Anderson, who appeared to be struggling recently.
- As I said above, the challenge is how to combine data-based insight with subjective opinion that a manager gains during the game.
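The strikeout-rate-by-inning summary mentioned in the list above is a simple grouped proportion. Here is a minimal sketch in Python, assuming a hypothetical list of (inning, was_strikeout) pairs, one per plate appearance. The function name and the sample rows are invented to show the calculation; they are not Snell’s data or the code I used.

```python
from collections import defaultdict


def strikeout_rate_by_inning(plate_appearances):
    """Return {inning: strikeouts / batters faced} for each inning seen."""
    batters = defaultdict(int)
    strikeouts = defaultdict(int)
    for inning, was_strikeout in plate_appearances:
        batters[inning] += 1
        strikeouts[inning] += int(was_strikeout)
    return {inning: strikeouts[inning] / batters[inning]
            for inning in sorted(batters)}


# HYPOTHETICAL plate appearances: (inning, was_strikeout).
sample = [
    (1, True), (1, False), (1, True), (1, False),
    (6, False), (6, False), (6, True), (6, False), (6, False),
]

rates = strikeout_rate_by_inning(sample)
for inning, rate in rates.items():
    print(f"inning {inning}: {rate:.0%}")
```

With real data one would pool plate appearances across all of a pitcher’s starts before computing the rates, since any single game has too few batters per inning to say much.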