Multilevel wOBA Player Comparison


In last week’s post, I revisited Carl Morris’ exploration of Ty Cobb’s career trajectory of batting average. Morris used a multilevel model to shrink the observed AVGs towards a quadratic curve, and explored the posterior distribution of Cobb’s highest batting probability. In that work, we learned that Cobb was likely a true .400 hitter at some point during his career.

Since batting average is a poor measure of batting performance, I wanted to extend Morris’ work using a better batting measure. Here’s the plan for this post.

  • I focus on a player’s trajectory of wOBA values displayed as a function of his age.
  • I fit a multilevel model (a so-called normal/normal multilevel model) on the wOBA measures. I assume that the wOBA values are normally distributed with unknown means, and then I assume the means follow a quadratic curve where both the regression coefficients and the spread (standard deviation) about the curve are unknown.

Since we are usually interested in comparing hitters, I focus on the comparison of these multilevel fits for two Hall of Fame players of interest. We will see that this modeling tells us if the quadratic fit is a reasonable fit to a player’s wOBA trajectory. In addition, it will allow us to compare the posteriors of the maximum expected wOBA for the two players.

A Multilevel Model for wOBA

For a particular hitter, for each season j of his career, we observe his age x_j and his weighted on-base average wOBA_j in PA_j plate appearances.

  • At the sampling stage, we assume that the observed wOBA_j is normal with mean \theta_j and standard deviation \sigma / \sqrt{PA_j}. Given the FanGraphs data, we can’t directly estimate the sampling standard deviation \sigma, but we will assign a reasonable value for \sigma based on Retrosheet play-by-play data.
  • At the prior stage, we assume the means \theta_j are normal(\mu_j, \tau) where the prior means follow the quadratic model \mu_j = \beta_0 + \beta_1 x_j + \beta_2 x_j^2.

At the last stage, we assign the hyperparameters \beta_0, \beta_1, \beta_2, \tau weakly informative prior distributions.
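To make the first two stages concrete, here is a minimal Python sketch that simulates one career from the model. The function name, the coefficient values, and the value of \sigma are all hypothetical choices for illustration (the curve is set to peak at .380 at age 30). They are not estimates from the FanGraphs or Retrosheet data, and this is not the R code used for the post.

```python
import math
import random

def simulate_career(ages, pas, beta, tau, sigma, seed=0):
    """Simulate one career from the normal/normal multilevel model.

    ages  : ages x_j, one per season
    pas   : plate appearances PA_j, one per season
    beta  : (beta0, beta1, beta2) of the quadratic curve
    tau   : prior standard deviation about the curve
    sigma : sampling standard deviation (treated as known)
    """
    rng = random.Random(seed)
    b0, b1, b2 = beta
    mu, theta, woba = [], [], []
    for x, pa in zip(ages, pas):
        m = b0 + b1 * x + b2 * x ** 2             # prior mean mu_j on the curve
        t = rng.gauss(m, tau)                      # expected wOBA theta_j
        y = rng.gauss(t, sigma / math.sqrt(pa))    # observed wOBA_j
        mu.append(m)
        theta.append(t)
        woba.append(y)
    return mu, theta, woba

# Hypothetical inputs: mu_j = 0.380 - 0.0005 * (x_j - 30)^2, written in
# the beta0 + beta1 * x + beta2 * x^2 form used in the text.
ages = list(range(21, 40))
pas = [600] * len(ages)
mu, theta, woba = simulate_career(
    ages, pas, beta=(-0.07, 0.03, -0.0005), tau=0.02, sigma=0.5
)
```

Simulating with a small \tau gives expected wOBA values that hug the curve, while a larger \tau lets the \theta_j wander away from it, which is the behavior the multilevel fit tries to learn from the data.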

We focus on learning about the expected wOBA_j value, \theta_j. On one extreme, if we don’t have any information about other seasons, the best estimate of \theta_j is the observed wOBA for that season. At the other extreme, if we believe the expected wOBA_j values follow the quadratic aging curve exactly, the best estimate is the value on the estimated curve. The posterior mean of \theta_j is a compromise between these two estimates: we say that the observed wOBA_j is shrunk towards the estimate on the curve. The shrinkage percentage for the jth season is defined to be

SHRINKAGE = 100 \frac{1/\tau^2}{1/\tau^2 + PA_j / \sigma^2}.

The degree of shrinkage is controlled by the estimate of the prior standard deviation \tau. This is an example of an adaptive estimate: if the observed wOBA_j values do follow the curve, we would estimate a small value for \tau, the shrinkage would be large, and the posterior means would resemble the estimates on the curve. On the other hand, if the observed wOBA_j have a different (non-quadratic) shape, we would estimate a larger value for \tau and there would be limited shrinkage towards the quadratic curve.
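As a small worked illustration, the shrinkage formula and the resulting posterior mean (for fixed values of \tau and \sigma) can be computed directly. In this Python sketch the function names are my own, and \sigma = 0.5 and PA = 600 are assumed values chosen only to show the arithmetic.

```python
def shrinkage_percent(tau, sigma, pa):
    """Percent shrinkage of the observed wOBA towards the curve."""
    prior_precision = 1.0 / tau ** 2     # 1 / tau^2
    data_precision = pa / sigma ** 2     # PA_j / sigma^2
    return 100.0 * prior_precision / (prior_precision + data_precision)

def posterior_mean(woba, curve_value, tau, sigma, pa):
    """Compromise between the observed wOBA and the value on the curve."""
    s = shrinkage_percent(tau, sigma, pa) / 100.0
    return s * curve_value + (1.0 - s) * woba

# Assumed values for illustration only.
for tau in (0.01, 0.02, 0.04):
    print(tau, round(shrinkage_percent(tau, sigma=0.5, pa=600), 1))
```

With these assumed inputs, the shrinkage is about 81 percent for \tau = 0.01, about 51 percent for \tau = 0.02, and about 21 percent for \tau = 0.04, so halving \tau moves the estimate much closer to the curve.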

Comparing Joe Morgan and Barry Larkin

To illustrate the use of this multilevel modeling, we compare the wOBA career trajectories of two Hall of Fame infielders for the Reds, Barry Larkin and Joe Morgan. Larkin played from 1986 through 2004, and Morgan played from 1963 to 1984.

The first graph displays the observed wOBA values, the quadratic fit, and the multilevel estimates of the expected wOBA for the two players. Both players appear to peak in their early 30s. Larkin’s wOBA values seem to closely match the quadratic fit. In contrast, Morgan’s wOBA values show a different shape: they are unusually high from ages 30 to 32, and unusually low in his mid-20s and mid-30s. As a consequence, the multilevel estimates are moved closer to the quadratic estimates for Larkin than for Morgan.

The difference in shrinkage behavior is illustrated more dramatically in the following graph, which plots the shrinkages against age. The shrinkages for Larkin tend to be around 75 percent, compared with around 40 percent for Morgan.

One can compare the expected wOBA estimates for the two players by constructing 50% probability intervals for the expected wOBAs. This confirms that the two players had significantly different trajectory shapes for wOBA.

Who was the better player at his peak? We address this question by plotting posterior distributions for the maximum expected wOBA for the two players. Morgan’s expected wOBA peaked around 0.43 compared to 0.40 for Larkin.
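One way to obtain a posterior for the maximum expected wOBA from simulated draws is sketched below in Python. I am assuming here that the maximum is taken over the player's seasons within each posterior draw of the \theta_j. The function names are hypothetical and the numbers are made-up toy draws, not output from the fitted model.

```python
def max_expected_woba(theta_draws):
    """Return one maximum per posterior draw.

    theta_draws: list of draws, each a list of theta_j over seasons.
    """
    return [max(draw) for draw in theta_draws]

def prob_first_exceeds(max_a, max_b):
    """Share of paired draws in which player A's peak exceeds player B's."""
    pairs = list(zip(max_a, max_b))
    return sum(a > b for a, b in pairs) / len(pairs)

# Toy draws (two draws, three seasons each), for illustration only.
draws_a = [[0.38, 0.42, 0.40], [0.39, 0.41, 0.43]]
draws_b = [[0.37, 0.40, 0.36], [0.44, 0.39, 0.38]]
peaks_a = max_expected_woba(draws_a)
peaks_b = max_expected_woba(draws_b)
```

Because the two players are fit separately, pairing their draws is a reasonable way to approximate the probability that one player's peak exceeds the other's. A histogram or density estimate of each list of maxima gives the kind of posterior display described above.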

Computation Notes

  • I’ve written a Shiny app that allows the comparison of two Hall of Fame batters of interest by selecting the players from two dropdown menus. When you run the app, the Estimates, Shrinkages, Posteriors, and max WOBA tabs display the four plots shown here. The R code is contained in the single file app.R in my ShinyBaseball package. A live version of this app can be found here. The Shiny app is self-contained, as the FanGraphs data is read from a file on my GitHub repository.
  • So that this function runs quickly in real time, I implement an approximate fit of the multilevel model. As mentioned before, I fix \sigma to be a reasonable estimate based on previous work, and I use the laplace() function from the LearnBayes package to find modal estimates of the marginal posterior of \beta_0, \beta_1, \beta_2, \log \tau. Simulating from the posterior distribution of a particular \theta_j = E(wOBA_j) is done in two steps: first I simulate values from the posterior of \beta_0, \beta_1, \beta_2, \log \tau, and then, given these hyperparameter values, I simulate from the posterior of \theta_j. By the way, in the Shiny app, woba_player_function() implements the fitting and comparison_plots() does the ggplot2 graphs.
  • Since there have been substantial changes in offense over baseball history, these comparisons are most helpful when one is comparing two players who played in similar eras. So, for example, it makes more sense to compare Mickey Mantle and Willie Mays than to compare Mantle with Babe Ruth, since Mantle and Mays played in the same baseball era. Perhaps my comparison of Joe Morgan and Barry Larkin was not the best since they played in nonoverlapping seasons. I chose these two players since they were infielders for the same franchise (Cincinnati Reds) and they illustrate different shrinkage patterns of the multilevel estimates.
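The two-step simulation described in the second note above can be sketched in Python as follows. This is a stand-in for the idea, not the R code in the app. It assumes the hyperparameter posterior is approximated by a multivariate normal with a given mode and covariance matrix (the kind of summary a Laplace approximation provides), and all function names and numeric inputs are hypothetical.

```python
import math
import random

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def draw_hyperparameters(mode, chol, rng):
    """Step 1: one draw of (beta0, beta1, beta2, log tau) from N(mode, cov)."""
    z = [rng.gauss(0.0, 1.0) for _ in mode]
    return [m + sum(chol[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mode)]

def draw_theta(hyper, ages, woba, pas, sigma, rng):
    """Step 2: draw each theta_j from its normal posterior given the hyperparameters."""
    b0, b1, b2, log_tau = hyper
    prior_prec = 1.0 / math.exp(log_tau) ** 2
    thetas = []
    for x, y, pa in zip(ages, woba, pas):
        mu = b0 + b1 * x + b2 * x ** 2
        data_prec = pa / sigma ** 2
        post_mean = (prior_prec * mu + data_prec * y) / (prior_prec + data_prec)
        post_sd = math.sqrt(1.0 / (prior_prec + data_prec))
        thetas.append(rng.gauss(post_mean, post_sd))
    return thetas

def simulate_posterior(mode, cov, ages, woba, pas, sigma, n_draws=1000, seed=1):
    """Repeat steps 1 and 2 to get n_draws posterior draws of all theta_j."""
    rng = random.Random(seed)
    chol = cholesky(cov)
    return [draw_theta(draw_hyperparameters(mode, chol, rng),
                       ages, woba, pas, sigma, rng)
            for _ in range(n_draws)]

# Made-up mode and (diagonal) covariance, for illustration only.
mode = [-0.07, 0.03, -0.0005, math.log(0.02)]
cov = [[1e-4, 0, 0, 0], [0, 1e-6, 0, 0], [0, 0, 1e-9, 0], [0, 0, 0, 0.04]]
draws = simulate_posterior(mode, cov, ages=[28, 30, 32],
                           woba=[0.370, 0.400, 0.385], pas=[600, 650, 580],
                           sigma=0.5, n_draws=200)
```

Each element of draws is one simulated set of expected wOBA values across the seasons, so column summaries give probability intervals for each \theta_j. In a real fit the covariance matrix would not be diagonal, since the regression coefficients are strongly correlated.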