In the last post, I described some methodology for detecting streaky and consistent team performance. One observes a team’s sequence of wins (W) and losses (L) during a season. For example, the 1968 Mets had a 73-89 record, and we observe the order in which those wins and losses occurred. There are three scenarios:
- The sequence of 73 Wins and 89 Losses is random in the sense that the streaky and slump patterns resemble those when the 73 Wins are assigned randomly throughout the 162 games.
- The sequence of W’s and L’s is streaky if there are more streaky and slump patterns than one would expect under randomness.
- The sequence of W’s and L’s is consistent if there are fewer streaky and slump patterns than one would expect under randomness.
Our streakiness measure is the sum of squares of the gaps (numbers of losses) between consecutive wins. Squaring weights long gaps heavily, so a sequence with long losing stretches gets a large value of the measure. We then use a permutation test and the associated p-value to classify the team-season as random, streaky, or consistent. Small p-values correspond to streakiness and large p-values correspond to consistency.
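To make the measure and the test concrete, here is a minimal Python sketch. It is not the code from my package; the function names are made up for illustration. It makes two assumptions worth flagging: the losses before the first win and after the last win are counted as gaps too, and the p-value is the fraction of random reorderings whose measure is at least as large as the observed one (which matches small p-values meaning streaky).

```python
import random

def streak_stat(seq):
    """Sum of squared gaps (runs of losses) around the wins in a W/L string.

    Assumption: the leading and trailing runs of losses count as gaps.
    """
    wins = [i for i, g in enumerate(seq) if g == "W"]
    edges = [-1] + wins + [len(seq)]
    return sum((b - a - 1) ** 2 for a, b in zip(edges, edges[1:]))

def permutation_p_value(seq, n_perm=1000, seed=0):
    """Fraction of random reorderings at least as streaky as the observed one."""
    rng = random.Random(seed)
    observed = streak_stat(seq)
    shuffled = list(seq)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # same wins and losses, random order
        if streak_stat(shuffled) >= observed:
            hits += 1
    return hits / n_perm

# Two artificial 40-game sequences, each with 20 wins and 20 losses.
clumped = ("W" * 10 + "L" * 10) * 2   # long streaks and slumps
alternating = "WL" * 20               # as evenly spread as possible

p_clumped = permutation_p_value(clumped)
p_alternating = permutation_p_value(alternating)
print(streak_stat(clumped), p_clumped)
print(streak_stat(alternating), p_alternating)
```

The clumped sequence should get a p-value near 0 (streaky) and the alternating one a p-value of 1 (consistent), since no reordering can have a smaller measure than strict alternation.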
To explore team streakiness, I looked at all of the W/L sequences for all teams from the 1967 through 2016 seasons. Let me describe the process:
- Using the Retrosheet game logs, I obtained the W/L sequence for a particular team.
- I used functions in my BayesTestStreak package to find the gaps between consecutive wins and implement the permutation test.
- For each team and each season, I collected the winning proportion and the p-value, which measures how streaky the sequence is.
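The first step can be sketched as follows. This is only an illustration in Python: the `Game` record is a simplified, hypothetical stand-in for a game log row (real Retrosheet logs carry many more fields), the scores below are invented, and skipping tied games is my assumption.

```python
from typing import NamedTuple

class Game(NamedTuple):
    # Hypothetical, simplified record; not the actual Retrosheet layout.
    date: str          # "YYYYMMDD", so string order is chronological
    game_num: int      # orders the two games of a doubleheader
    visitor: str
    home: str
    visitor_runs: int
    home_runs: int

def win_loss_sequence(games, team):
    """Return the team's results in playing order as a string of W's and L's."""
    seq = []
    for g in sorted(games, key=lambda g: (g.date, g.game_num)):
        if team not in (g.visitor, g.home):
            continue
        if g.visitor_runs == g.home_runs:
            continue  # assumption: tied games are dropped
        if g.visitor == team:
            team_runs, opp_runs = g.visitor_runs, g.home_runs
        else:
            team_runs, opp_runs = g.home_runs, g.visitor_runs
        seq.append("W" if team_runs > opp_runs else "L")
    return "".join(seq)

# Invented rows, deliberately out of order, to show the sorting.
toy_games = [
    Game("19680412", 0, "NYN", "SFN", 2, 5),
    Game("19680410", 0, "NYN", "SFN", 4, 1),
    Game("19680414", 2, "SFN", "NYN", 0, 3),
    Game("19680414", 1, "SFN", "NYN", 6, 2),
    Game("19680411", 0, "CHN", "SLN", 1, 0),
]
mets_seq = win_loss_sequence(toy_games, "NYN")
print(mets_seq)
```

The resulting string is the input that the gap and permutation-test functions work on; the winning proportion is just the share of W's in it.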
Several questions to address:
- What is the overall distribution of streakiness of baseball teams?
- Which teams were unusually streaky or consistent?
- Is there any relationship between a team’s winning fraction and their streakiness? For example, are great teams more or less likely to be consistent?
Overall distribution of streakiness
First I constructed a histogram of the p-values for all team-seasons. (There were 1358 team-seasons in my 50-year study.) To understand this graph, note that if the W/L patterns were generated by flipping coins, then I’d expect these p-values to be approximately uniformly distributed on (0, 1). But it seems that small p-values tend to be more likely, indicating a tendency for a team to be streaky rather than random or consistent.
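The coin-flipping benchmark is easy to check by simulation. The sketch below is a rough Python illustration, not my actual code: it generates 100 coin-flip "seasons" of 162 games, runs a small permutation test on each (only 200 reorderings, to keep it quick), and counts how many p-values land in each tenth of (0, 1). The function names, the number of seasons, and the treatment of leading and trailing losses as gaps are all assumptions of the sketch.

```python
import random

def streak_stat(seq):
    # Sum of squared gaps between wins; leading/trailing losses count (assumption).
    wins = [i for i, g in enumerate(seq) if g == "W"]
    edges = [-1] + wins + [len(seq)]
    return sum((b - a - 1) ** 2 for a, b in zip(edges, edges[1:]))

def p_value(seq, rng, n_perm=200):
    # Share of random reorderings at least as streaky as the observed sequence.
    observed = streak_stat(seq)
    shuffled = list(seq)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        hits += streak_stat(shuffled) >= observed
    return hits / n_perm

def simulate_null_p_values(n_seasons=100, n_games=162, seed=1):
    # Each game is an independent fair coin flip, so there is no true streakiness.
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_seasons):
        season = [rng.choice("WL") for _ in range(n_games)]
        p_values.append(p_value(season, rng))
    return p_values

def decile_counts(p_values):
    # Counts of p-values in (0, .1), (.1, .2), ..., (.9, 1).
    counts = [0] * 10
    for p in p_values:
        counts[min(int(p * 10), 9)] += 1
    return counts

null_p = simulate_null_p_values()
print(decile_counts(null_p))
print(sum(null_p) / len(null_p))
```

With coin flips the ten counts should be roughly equal and the average p-value should sit near 0.5, which is the flat histogram that the real team-seasons depart from.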
One can rank the teams by p-value. The 10 streakiest teams by this measure are shown below. It is interesting that both great teams (like the 1977 Red Sox) and poor teams (like the 2011 Mariners) can be very streaky.
Season Team P_Value Win_Pct
1 1998 BAL 0.001 0.4876543
2 1987 MIL 0.002 0.5617284
3 1994 OAK 0.003 0.4473684
4 2011 SEA 0.003 0.4135802
5 1977 BOS 0.004 0.6024845
6 1982 ATL 0.005 0.5493827
7 2007 SEA 0.005 0.5432099
8 1981 DET 0.006 0.5504587
9 2004 BAL 0.006 0.4814815
10 1995 HOU 0.007 0.5277778
Similarly, here are the 10 most consistent teams, ranked by p-value. Again, note that both good and poor teams can be consistent. It would be interesting to look more carefully at some of these seasons to try to detect reasons for the consistency or streakiness.
Season Team P_Value Win_Pct
1 1968 SFN 1.000 0.5426829
2 1968 NYN 1.000 0.4506173
3 2005 SLN 1.000 0.6172840
4 1993 CHN 0.999 0.5185185
5 2014 LAN 0.999 0.5802469
6 1977 MIL 0.998 0.4135802
7 1993 KCA 0.997 0.5185185
8 2003 CHN 0.996 0.5432099
9 1971 CAL 0.995 0.4691358
10 1983 CIN 0.995 0.4567901
Winning fraction and streakiness
To understand the relationship between winning records and streakiness, I subdivided the p-values into the bins (0, .1), (.1, .2), …, (.9, 1), and compared the winning fractions of the teams in these 10 bins. Here I display parallel boxplots. Under randomness the p-value is uniformly distributed on (0, 1), so 0.5 is its typical value for a random sequence. Generally, teams have p-values that are centered about 0.5, but teams with winning fractions between .35 and .40 average a p-value of .40, and the really weak teams (winning fraction under .35) tend to be streaky. This suggests that poor performance tends to be associated with streaky performance.
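The binning step can be sketched in a few lines of Python. This is an illustration only: the function name is made up, it summarizes each bin by its median winning fraction rather than drawing a boxplot, and the only data used are the 20 team-seasons from the two tables above. Those are the extremes by construction, so they fill only the first and last bins and say nothing about the middle of the distribution.

```python
from statistics import median

def median_win_pct_by_p_bin(rows, n_bins=10):
    """Group (p_value, win_pct) pairs into equal-width p-value bins.

    Returns {bin_index: median winning fraction}; bin 0 is (0, .1),
    and a p-value of exactly 1 is placed in the last bin.
    """
    bins = {}
    for p, win_pct in rows:
        b = min(int(p * n_bins), n_bins - 1)
        bins.setdefault(b, []).append(win_pct)
    return {b: median(values) for b, values in sorted(bins.items())}

# (P_Value, Win_Pct) for the 10 streakiest and 10 most consistent teams above.
table_rows = [
    (0.001, 0.4876543), (0.002, 0.5617284), (0.003, 0.4473684),
    (0.003, 0.4135802), (0.004, 0.6024845), (0.005, 0.5493827),
    (0.005, 0.5432099), (0.006, 0.5504587), (0.006, 0.4814815),
    (0.007, 0.5277778),
    (1.000, 0.5426829), (1.000, 0.4506173), (1.000, 0.6172840),
    (0.999, 0.5185185), (0.999, 0.5802469), (0.998, 0.4135802),
    (0.997, 0.5185185), (0.996, 0.5432099), (0.995, 0.4691358),
    (0.995, 0.4567901),
]
summary = median_win_pct_by_p_bin(table_rows)
print(summary)
```

Run on all 1358 team-seasons instead of these 20, the same grouping would give one summary per bin, which is the numerical counterpart of the parallel boxplots.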
Of course we get excited by streaky patterns such as the Indians’ 22-game winning streak in 2017. Likewise, there is much said about long streaks of hitting or long streaks of futility (the well-known “ofer” statistics). But I think there is a lot to be said for the virtue of consistency: consistent teams or players avoid streaky patterns and exhibit patterns that are different from those of “randomness”. I would think that teams would value consistent players, although we don’t routinely measure this aspect of performance.