Hank Aaron, one of the greatest baseball players, passed away last week. Many articles have appeared recently that praise Aaron, both as a great baseball player and as a person who excelled despite the racism he faced. Aaron’s greatness during this racial climate was evident in his pursuit of Babe Ruth’s career home run record. Many of the articles review Aaron’s accomplishments from a statistical perspective; for example, this mlb.com article lists 13 stats that show Aaron’s significance. In my research on streakiness patterns in baseball, I discovered one less-known accomplishment of Aaron: he had an unusually consistent pattern of hitting home runs. I thought it would be a good time to review some of this work, focusing on the use of a simple measure of streakiness. This measure shows that Aaron had a consistent pattern compared with the great home run hitters of his era.
Consistent Home Run Hitting
It is easiest to start with a definition of consistent home run hitting. A baseball player has a number of opportunities (plate appearances, or PAs) during a season. Suppose that the chance of hitting a home run during a particular PA is a constant number p, and also assume that outcomes of different PAs are independent. Consider the spacings, the numbers of PAs between consecutive home runs, which we call Y1, Y2, etc. For example, if a player hits home runs on PA numbers 10, 15, 40, 50, and 100, the values of the spacings would be Y1 = 4, Y2 = 24, Y3 = 9, and Y4 = 49.
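As a quick check of the example above, here is a minimal Python sketch that turns a list of home run PA numbers into spacings. The function name `spacings` is my own choice for illustration, and, following the example, the PAs before the first home run are not counted as a spacing.

```python
def spacings(hr_pa_numbers):
    """Return the number of PAs strictly between consecutive home runs."""
    pas = sorted(hr_pa_numbers)
    # Pair each home run with the next one and count the PAs in between.
    return [later - earlier - 1 for earlier, later in zip(pas, pas[1:])]


example = spacings([10, 15, 40, 50, 100])
print(example)  # [4, 24, 9, 49]
```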
Under our assumptions, the Y’s have a geometric distribution with probability of success p. A hitter is defined to be truly consistent if his pattern of spacings (gaps between home runs) resembles a geometric distribution.
The geometric distribution has some nice properties. The mean is given by M = (1 - p) / p and the variance is given by Var = (1 - p) / p ^ 2. A simple calculation gives Var / M = 1 / p, where p = 1 / (M + 1) since M + 1 = 1 / p. Substituting 1 / p = M + 1, one can show for a geometric distribution that
Var / (M (M + 1)) = 1
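This identity can be checked numerically. The sketch below sums the geometric probabilities directly (truncating the infinite sum at a large number of terms, which is an approximation) and computes the ratio for a few values of p. The helper name `geometric_moments` and the particular p values are my own choices for illustration.

```python
def geometric_moments(p, terms=10_000):
    """Mean and variance of the number of failures before the first success.

    The infinite sums are truncated at `terms`, which is ample for the
    values of p tried here.
    """
    pmf = [(1 - p) ** k * p for k in range(terms)]
    m = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - m) ** 2 * q for k, q in enumerate(pmf))
    return m, var


for p in (0.02, 0.06, 0.10):
    m, var = geometric_moments(p)
    ratio = var / (m * (m + 1))
    print(p, round(m, 3), round(var, 3), round(ratio, 6))
```

For each p the last printed value should agree with 1 up to rounding error.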
A Measure of Streakiness
If a player is genuinely streaky, his chance of hitting a home run will not be a constant probability value. During some periods of the season, this streaky hitter will be hot and have a high home run probability, and during other periods he will slump and have a low home run probability. The distribution of the spacings (gaps between home runs) for a streaky hitter will not be geometric. Instead, the spacings for a streaky hitter will show higher variability than one would expect based on a geometric distribution. We saw that for a truly consistent hitter in the geometric setting, Var / (M (M + 1)) = 1. If the player is streaky, then we anticipate a higher variance among the spacings, and so a reasonable measure of streakiness is
Measure = Var / (M (M + 1))
where we estimate Var and M by the sample variance and sample mean of the spacings. If this measure is larger than one, this provides some support for true streakiness.
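To get a feel for how the measure behaves, here is a small simulation sketch. Everything in it is assumed for illustration rather than taken from real data: the home run probabilities (0.06 for the consistent hitter, alternating 0.10 and 0.02 for the streaky one), the 100-PA hot and cold blocks, and the very large number of PAs, which is far more than any real season and is used only to reduce sampling noise.

```python
import random
from statistics import mean, variance


def streakiness_measure(gaps):
    """Sample variance of the gaps divided by M * (M + 1), M the sample mean."""
    m = mean(gaps)
    return variance(gaps) / (m * (m + 1))


def simulate_gaps(probs, rng):
    """Simulate one PA per entry of probs and return the home run spacings."""
    hr_pas = [i for i, p in enumerate(probs, start=1) if rng.random() < p]
    return [b - a - 1 for a, b in zip(hr_pas, hr_pas[1:])]


rng = random.Random(2021)
n_pa = 200_000  # assumption: unrealistically many PAs, to reduce noise

# Consistent hitter: the same home run probability on every PA.
consistent = [0.06] * n_pa
# Hypothetical streaky hitter: hot and cold blocks of 100 PAs, same average.
streaky = [0.10 if (i // 100) % 2 == 0 else 0.02 for i in range(n_pa)]

m_consistent = streakiness_measure(simulate_gaps(consistent, rng))
m_streaky = streakiness_measure(simulate_gaps(streaky, rng))
print(round(m_consistent, 2), round(m_streaky, 2))
```

Under these assumptions the first value should land close to 1 and the second noticeably above 1. With a season-sized number of home runs the measure is much noisier, so a single value slightly above or below 1 is weak evidence either way.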
Hank Aaron’s Streaky Measures
How streaky was Hank Aaron in his home run hitting? Aaron played for 23 seasons. For each season, I found the spacings or gaps between consecutive home runs, computed the mean and variance of the spacings, and computed the streakiness measure. I’ve graphed the values of this measure below — the blue horizontal line corresponds to what we expect for a truly consistent (geometric) hitter. Note that Aaron’s values tend to fall below 1 and the measure exceeds 1 for only four seasons. So Aaron clearly did not have a pattern of streaky hitting. Actually his home run spacings tend to have less variability than one would predict from the consistent geometric model.
Comparing with Other Sluggers of Aaron’s Era
Since I don’t have a lot of experience with this particular measure of streakiness, I don’t have a good understanding of how to interpret a particular value like 0.8, and so perhaps my measure is more useful for comparing hitters. Compared to other power hitters in Aaron’s era, was Aaron streaky or consistent?
For each of the 23 seasons from 1954 through 1976, I found the twenty hitters with the most home runs. For each player for each season, I found the spacings between consecutive home runs and computed my streaky measure. Below I have graphed the streaky measure values for these sluggers for each season and indicated Aaron’s values as red dots. For each season, look at the relative position of Aaron among the 20 top sluggers. Note that Aaron’s value tends to be one of the smallest in many of the seasons. The takeaway is that Aaron’s spacings between home runs tended to be less variable than the spacings of other sluggers of his era.
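The within-season comparison can be sketched as follows. The three hitters and their home run PA numbers are made up purely for illustration (they are not real players or real data), and the function and variable names are my own.

```python
from statistics import mean, variance


def streakiness_measure(hr_pas):
    """Streaky measure computed from the PA numbers of a hitter's home runs."""
    pas = sorted(hr_pas)
    gaps = [b - a - 1 for a, b in zip(pas, pas[1:])]
    m = mean(gaps)
    return variance(gaps) / (m * (m + 1))


# Made-up home run PA numbers for three placeholder hitters in one season.
season = {
    "Hitter A": [20, 45, 70, 95, 125, 150, 170, 200],  # evenly spread
    "Hitter B": [5, 8, 12, 90, 95, 180, 183, 200],     # clustered
    "Hitter C": [10, 30, 35, 80, 100, 140, 190, 200],  # in between
}

# Rank from most consistent (smallest measure) to most streaky.
ranked = sorted(season, key=lambda name: streakiness_measure(season[name]))
for rank, name in enumerate(ranked, start=1):
    print(rank, name, round(streakiness_measure(season[name]), 2))
```

In this toy season the evenly spread hitter ranks first and the clustered hitter last, which mirrors reading a player's position among the season's top sluggers.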
What Have We Learned?
Let’s contrast our findings with the familiar home run accomplishments of Hank Aaron. Aaron’s home run statistics are interesting for a number of reasons.
- Aaron hit a career total of 755 home runs over 23 seasons.
- Although he was a great home run hitter, Aaron never had more than 47 home runs in a single season.
- But there were 11 seasons where Aaron hit at least 35 home runs.
- Aaron was not a traditional home run hitter in the sense that he tended to hit line drive home runs with smaller launch angles than the modern sluggers. (I suppose we could check this by watching some videos of Aaron’s home runs.)
Aaron’s home run accomplishment described in this post is more subtle. We are finding that Aaron’s pattern of hitting home runs is more consistent or more evenly spread out than most of the home run hitters of his era. To get a sense of what this means, it is helpful to construct a “streaky plot” that displays the locations (PA numbers) of the home runs by vertical bars. Using these plots, contrast Aaron’s consistent pattern of home run hitting in 1967 (Measure = 0.687)
with the streaky home run hitting pattern of the 1967 Jim Ray Hart (Measure = 1.57):
One can find hitting streaks in the Stathead section of Baseball Reference, but you won’t see any measures posted that summarize the streaky or consistent pattern of hitting performance. Does that mean that baseball coaches don’t care about these patterns? Well, we all know that teams go through periods of hot and cold hitting, and one job of a coach is to manage these patterns of streakiness. I would think teams would value players like Hank Aaron who displayed a consistent pattern of hitting, since that would help to offset the cold hitting periods of other hitters.
To Read More
I wrote a Chance paper a few years ago that explored streaky patterns of home run hitting. I used different measures of streakiness, but I reached the same conclusion in Section 5 of that paper: Aaron had an unusually consistent pattern of home run hitting.