I’m currently introducing a normal sampling model for continuous data in my Bayesian class, and the motivating example in my text is the heights of adults of the Dobe area !Kung San people. I decided instead to model the heights of MLB players in my class, and that motivated this exploration of how MLB players have grown (both in height and in body mass index) over the history of professional baseball.
Distribution of Birth Years
A convenient way to partition players is by year of birth. In the following graph, I plot the number of MLB players (from the Lahman database) against birth year. Since the number of teams has increased over the years, it is not surprising that there are more players in modern baseball, but there are some interesting patterns between the birth years of 1900 and 1950 that deserve some explanation. (The drop in the number of players born in the 1990s is not surprising, since many of these players are still in the minor leagues.)
MLB Players are Getting Taller
Next I computed the mean height of all players born in each year and graphed the mean height against birth year. In the graph below I see a steady growth in the mean height. I’m 6 feet (72 inches) tall, and this was pretty typical for players born around 1920. But the average ballplayer born in 1990 is 73.7 inches tall.
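The grouping step above can be sketched in a few lines. This is only a Python illustration, not the R code behind the graph: the records below are made-up values rather than Lahman data, and the function name `mean_height_by_year` is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (birth_year, height_in_inches) records, made up for illustration.
players = [(1920, 71), (1920, 73), (1990, 74), (1990, 73), (1990, 75)]

def mean_height_by_year(records):
    """Group heights by birth year and return the mean height for each year."""
    by_year = defaultdict(list)
    for year, height in records:
        by_year[year].append(height)
    # Sort by year so the result is ready to plot against birth year.
    return {year: mean(heights) for year, heights in sorted(by_year.items())}

result = mean_height_by_year(players)
print(result)
```

With the real data, the records would come from the height and birth-year columns of the player table, and the resulting means would be plotted against birth year.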
Change in Average Body Mass Index
Nowadays body mass index (BMI) is used to measure obesity in adults. With weight in pounds and height in inches, it is defined by BMI = 703 * Weight / Height^2, where 703 is a constant that converts from these units to the metric definition (kilograms per square meter). To estimate an average BMI for a particular group of players, say those born in 1978, I plot the values of (height^2, weight) and fit a line through the origin. The estimated slope, multiplied by 703, provides a BMI value. In the graph below, the least-squares fit has slope 0.0376, which translates to an average BMI of 0.0376 * 703 = 26.43.
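To make the slope-to-BMI step concrete, here is a minimal Python sketch of a least-squares fit through the origin, where the slope is sum(x*y) / sum(x^2) with x = height^2 and y = weight. The heights and weights are invented for illustration, and the function names are hypothetical; only the slope 0.0376 and the constant 703 come from the text.

```python
def slope_through_origin(xs, ys):
    """Least-squares slope of the no-intercept line y = b * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def bmi_from_slope(slope):
    """Convert a weight (lb) per height^2 (in^2) slope to BMI units."""
    return 703 * slope

# Made-up heights (inches) and weights (pounds), not Lahman data.
heights = [70, 72, 74, 76]
weights = [180, 195, 205, 220]

x = [h ** 2 for h in heights]           # predictor is height squared
slope = slope_through_origin(x, weights)
print(round(bmi_from_slope(slope), 2))  # BMI estimate for the made-up group

# The slope reported in the text:
print(round(bmi_from_slope(0.0376), 2))  # 26.43
```

Fitting through the origin matches the BMI definition, in which weight is proportional to height squared with no intercept term.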
I repeat this procedure for many birth years, and below I graph the BMI estimate against birth year. This is interesting. It seems that the average BMI stayed around 24.5 for a long time, but for birth years from 1960 to 1985 it rose rapidly to a peak value of 27.2. (I believe these players were active during the Steroids Era.) It is also interesting that the BMI estimate has been decreasing for the most recent birth years.
To look further, the graph below compares the distribution of BMI for players born in 1968, 1973, 1978, 1983, and 1988. The red horizontal line is at 25, which is the cutoff between “normal weight” and “overweight” in the BMI classification. For players born in 1968, a value of 25 was pretty typical. In contrast, a large proportion of the players born in 1988 exceed this cutoff value.
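The comparison against the cutoff of 25 can also be summarized as a proportion. The sketch below is a Python illustration under stated assumptions: the (weight, height) pairs are made up, and `share_over_cutoff` is a hypothetical helper rather than anything used to build the graph.

```python
def bmi(weight_lb, height_in):
    """BMI from weight in pounds and height in inches."""
    return 703 * weight_lb / height_in ** 2

def share_over_cutoff(players, cutoff=25.0):
    """Fraction of (weight_lb, height_in) pairs whose BMI exceeds the cutoff."""
    values = [bmi(w, h) for w, h in players]
    return sum(v > cutoff for v in values) / len(values)

# Made-up (weight in pounds, height in inches) pairs for one birth year.
cohort = [(180, 72), (200, 72), (210, 74), (175, 73)]
print(share_over_cutoff(cohort))
```

Computing this share separately for each birth year would give a single number per cohort to set beside the distributions in the graph.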
Since player sizes seem to vary by fielding position, it seems worthwhile to explore how the changes in mean heights and BMI of players over time vary by position. These graphs were straightforward to construct.
For heights, first basemen and pitchers tend to be the tallest, and second basemen and shortstops tend to be the shortest. For all positions, we see the same general trend of increasing height.
With respect to mean BMI, the first basemen and catchers tend to have the highest values, and the shortstops and second basemen tend to have the smallest values.
The R work for this exploration was relatively easy using the Master and Fielding datasets in the Lahman package. It would be interesting to see how the heights and BMI of baseball players compare (over time) with players in other professional sports such as football or basketball.