
Predicting Home Runs Using Current and Past Data

Predicting Aaron Judge’s Home Runs

I recently illustrated the use of a Shiny app to predict Aaron Judge’s 2022 home run count. One uses a Beta prior to describe one’s initial beliefs about Judge’s home run probability p, then updates this prior with Judge’s 2022 data (currently 57 home runs in 616 plate appearances) to compute the predictive distribution. The issue is that I set a pretty vague default prior of Beta(5, 67) for p in the app, so this prior is overwhelmed by the 2022 hitting data in the posterior. Tom Tango, in a Twitter retweet, thought this prior was inappropriate; he thought a more realistic prior would incorporate 200 PA of the 2022 average HR rate and 1300 PA of Judge’s historical HR rate.
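To make the update concrete, here is a minimal base R sketch of the beta-binomial calculation behind the app: the Beta(5, 67) prior is combined with Judge’s 57 home runs in 616 plate appearances, and the predictive distribution of his home run count over some number of remaining plate appearances is simulated. The 60 remaining PA below is an illustrative assumption, not a figure from the app.

```r
# Posterior for Judge's home run probability p:
# Beta(a, b) prior + (HR successes, PA - HR failures) from 2022
a <- 5; b <- 67          # default prior in the app
HR <- 57; PA <- 616      # Judge's 2022 totals so far
a_post <- a + HR
b_post <- b + (PA - HR)

# Simulate the beta-binomial predictive distribution of future HR
# over n_future remaining PA (60 is an illustrative choice)
set.seed(123)
n_future <- 60
p_sim <- rbeta(10000, a_post, b_post)
hr_sim <- rbinom(10000, size = n_future, prob = p_sim)

# Point prediction and a 90% prediction interval
mean(hr_sim)
quantile(hr_sim, c(0.05, 0.95))
```

The posterior mean of p is 62 / 688 ≈ 0.090, so under these assumptions the predictive mean is about 5.4 home runs over the 60 remaining PA.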

That raises the question: How does one reasonably weight Judge’s previous seasons’ HR performance and his current 2022 HR record to get an accurate prediction of his future performance? I will describe the use of R to construct a prediction experiment that helps answer this question. On the basis of the results, I will suggest a more reasonable prior for Judge’s home run probability.

The Prediction Experiment

Suppose we are looking at home run data of all players on July 1 (the midpoint) of a particular season and we wish to predict the home run rates in the second-half of the season for these players. We have collected historical counts of HR and PA for all players with at least 1000 PA in the previous four seasons. How do we combine this historical data with the current first-half home run statistics to get accurate home run rate predictions for the second half of the season? I will describe five possible predictions.

  • Current. I will ignore the historical data and just use the first half home run rates as my prediction. For example, Aaron Judge had 29 HR in 329 PA on July 1 this season for an observed rate of 29 / 329 = 0.088. Using the Current estimate, we’d just predict that Judge’s HR rate would be 0.088 for the period after July 1.
  • Old. Instead I will ignore the current season data and just use the historical home run rates as my prediction. In our example, Judge had 102 HR in 1692 PA for a rate of 102 / 1692 = 0.060 in the four previous seasons 2018-2021, and I’d predict Judge’s HR rate to be 0.060 for the remainder of the season.
  • 20/80 Mix. This represents a compromise between the Current and Old prediction methods. For each player, the prediction of his future HR rate will be a weighted average: 20% of his historical home run rate and 80% of his first-half home run rate. In Judge’s example, my prediction would be 0.20 (0.060) + 0.80 (0.088) = 0.0824.
  • 50/50 Mix. For each player, my prediction is a simple average (50/50 mix) of the historical and first-half rates.
  • 80/20 Mix. For each player, my prediction is a weighted average, 80% of the historical rate and 20% of the current rate.
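The five estimates are simple functions of the historical and first-half counts. As a sketch (the function name and interface here are my own, not from the original code):

```r
# Five predictions of a player's second-half HR rate from
# historical ("old") and current first-half counts
predict_hr_rates <- function(hr_old, pa_old, hr_cur, pa_cur) {
  old <- hr_old / pa_old
  cur <- hr_cur / pa_cur
  c(Current = cur,
    Old     = old,
    Mix.20  = 0.20 * old + 0.80 * cur,
    Mix.50  = 0.50 * old + 0.50 * cur,
    Mix.80  = 0.80 * old + 0.20 * cur)
}

# Aaron Judge's counts from the text above
predict_hr_rates(hr_old = 102, pa_old = 1692, hr_cur = 29, pa_cur = 329)
```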

By comparing the predictive performance of these five methods, we should learn whether it is appropriate to use only current data or only past seasons’ data. We should also learn about good choices of mixture fractions if we wish to consider a compromise prediction method.

Using the Lahman database available in the Lahman package, I identify the players who have at least 1000 PA in the previous four seasons and collect their home run rates over this four-season period. Using the Retrosheet play-by-play files, I collect the home run data (HR and PA) for the first and second halves of the current season, considering only players who have at least 200 PA in each half. I merge this Retrosheet data with the historical data; in a typical season I have home run rates for about 120 players to predict.
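The historical pooling step can be sketched in base R. The toy data frame below stands in for the Lahman batting data; the per-season splits for Judge are illustrative values consistent with the pooled 102 HR / 1692 PA quoted above, and the second player is invented.

```r
# Toy stand-in for Lahman batting data: one row per player-season
batting <- data.frame(
  playerID = rep(c("judgea01", "playerx"), each = 4),
  yearID   = rep(2018:2021, 2),
  HR       = c(27, 27, 9, 39, 10, 12, 4, 15),
  PA       = c(498, 447, 114, 633, 400, 420, 150, 500)
)

# Pool HR and PA over the four historical seasons,
# then keep players with at least 1000 pooled PA
pooled <- aggregate(cbind(HR, PA) ~ playerID, data = batting, FUN = sum)
pooled <- subset(pooled, PA >= 1000)
pooled$hist_rate <- pooled$HR / pooled$PA
pooled
```

The same pooled table is then merged with the first-half and second-half Retrosheet counts by player ID.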

I apply each of the five prediction methods. To measure predictive accuracy, I compute the mean absolute error for each of the methods. I repeat this experiment for the 19 seasons 2000 through 2018. (I purposely didn’t consider predicting rates for the 2019 season since these methods don’t perform well for that extreme home run hitting season.)
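The accuracy measure is just the mean absolute error of the predicted rates against the observed second-half rates. A sketch, with illustrative numbers:

```r
# Mean absolute error of predicted vs. observed second-half HR rates
mae <- function(predicted, observed) mean(abs(predicted - observed))

# Example: one method's predictions for three players vs. what happened
predicted <- c(0.074, 0.031, 0.052)
observed  <- c(0.088, 0.025, 0.047)
mae(predicted, observed)
```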

Results

Below I display boxplots of the mean prediction errors (for the 19 seasons) for each of the five prediction methods. Each boxplot graphs the 19 mean prediction errors across seasons for a particular method. The estimate that tends to have the smallest (best) prediction errors is Mix.50, followed closely by Mix.80, then Old, Mix.20, and Current.

Here’s a different graphical display of the results to emphasize the comparison of the accuracy of the prediction methods for each season. Each panel uses a line to connect the mean prediction errors across seasons. This graph allows for easy comparison of the methods and also shows that the sizes of the prediction errors depend on the season.

We can also evaluate prediction methods by comparing them pairwise across seasons. In the 19 seasons, the Mix.50 method was better (smaller mean prediction error) than the Mix.80 method for 11 seasons, better than Old for 16 seasons, and better than Mix.20 and Current for all 19 seasons. The main message from the two graphs is that the Mix.50 and Mix.80 methods provide the best predictions, and the Current method gives the worst predictions.

Takeaways

  • Tom Tango is right. To make an accurate prediction of Aaron Judge’s future home run count, one needs to combine his current season production with his historical home run record. The work here suggests that in constructing a prior for Judge’s 2022 home run probability, one should probably give relatively high weight to his historical home run data.
  • Modified prior in Shiny app. I have modified my default prior on my Aaron Judge home run prediction app so that the quartiles of my prior on p are 0.054 and 0.066. (The median of this prior is close to Judge’s historical home run rate in his previous seasons.) These quartiles are matched with a Beta(43, 667) prior, so the total weight in my prior is 43 + 667 = 710 observations (compared with his current 631 PA). The predictions are based on a posterior that roughly equally weights the historical data and the 2022 data. Of course, the user can modify this default choice of prior quartiles in the app to conform to one’s beliefs about the location of p.
  • Season weights. In my prediction experiment, I computed a pooled home run rate for each player in the last four seasons. Actually it would be better to put weights on the most recent seasons. If the current season is given a 100% weight, then Tom Tango (personal communication) suggests weights of 70%, 50%, 35% and 25% for the four previous seasons.
  • Adjust predictions for current season. When I repeated this prediction exercise for the 2019 season, my prediction errors were unusually high. The problem is that these methods didn’t adjust for the extreme 2019 season, which was unusually favorable for home runs. Tom suggests adding 200 PA of league average to the estimate to make this current-season adjustment.
  • R Code? I wrote a single function prediction_work() that runs this prediction experiment; it is available on my Github Gist site. The main inputs to this function are the season of interest, the historical seasons, and the Retrosheet play-by-play dataset for the season of interest. The output is a data frame containing the mean prediction error for the five methods. I ran this 19 times for the 19 seasons, collected the results in a single data frame, and graphed the prediction errors using ggplot2.
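The calculations in these takeaways can be sketched in base R. The block below checks that the Beta(43, 667) prior matches the stated quartiles, computes a weighted historical rate using Tom’s season weights (applying the weights to both HR and PA is my interpretation of his suggestion), and applies the 200 PA league-average adjustment. The per-season splits and the 0.037 league HR rate are illustrative assumptions.

```r
# 1. Check that the Beta(43, 667) prior matches quartiles 0.054 and 0.066
qbeta(c(0.25, 0.75), shape1 = 43, shape2 = 667)

# 2. Weighted historical HR rate, most recent past season first;
#    Tango's weights relative to a 100% current-season weight
weighted_hr_rate <- function(hr, pa, w = c(0.70, 0.50, 0.35, 0.25)) {
  sum(w * hr) / sum(w * pa)
}
# Judge's 2021-2018 seasons (illustrative splits of 102 HR / 1692 PA)
weighted_hr_rate(hr = c(39, 9, 27, 27), pa = c(633, 114, 498, 447))

# 3. Adjust an estimate for the current season's environment by
#    adding 200 PA of league-average HR production
adjust_to_league <- function(hr, pa, lg_rate, extra_pa = 200) {
  (hr + extra_pa * lg_rate) / (pa + extra_pa)
}
# A 0.088 hitter in 329 PA, in a league hitting HR at a 0.037 rate
adjust_to_league(hr = 29, pa = 329, lg_rate = 0.037)
```

For the reverse task of finding Beta shape parameters from two stated prior quantiles, the beta.select() function in the LearnBayes package does the matching.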