Giancarlo Stanton is Hot
One of the big baseball stories this season is Giancarlo Stanton, who appears to be “on fire” with respect to hitting home runs. That raises the question: will he break the “honest” home run record of 61 home runs hit by Roger Maris in 1961?
I have some R packages that make it easy to visualize the streakiness of Stanton’s home run hitting and also provide reasonable predictions of the number of home runs that Stanton will hit in the remainder of the season. I’ll show you some graphs; all of the code for producing this work is shown on my Github Gist site.
Visualizing the Hotness
A basic graph of the streaky pattern of Stanton’s home run hitting is a so-called rug plot, where one draws a vertical line at the number of each plate appearance (PA) in which he hits a home run. As you can see, Stanton was not hitting many home runs in the first half of the season (there were a couple of notable gaps), but he has picked up noticeably in the 2nd half, and there are quite a few home runs in recent games.
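As a rough illustration of what the rug plot encodes, here is a minimal Python sketch (not the R code from my Gist). It assumes the season is stored as a 0/1 sequence with one entry per PA; the sequence shown and the helper names hr_positions, gaps_between_hrs and text_rug are made up for this example and are not Stanton’s actual data or functions from any package.

```python
# Hypothetical outcomes, one per plate appearance (1 = home run).
# Illustrative values only, not Stanton's actual data.
outcomes = [0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0]


def hr_positions(outcomes):
    """Return the 1-based PA numbers at which a home run occurred."""
    return [i for i, y in enumerate(outcomes, start=1) if y == 1]


def gaps_between_hrs(outcomes):
    """Return the number of homerless PAs between consecutive home runs."""
    pos = hr_positions(outcomes)
    return [b - a - 1 for a, b in zip(pos, pos[1:])]


def text_rug(outcomes):
    """Draw a crude text rug: '|' marks a home run, '.' any other PA."""
    return "".join("|" if y == 1 else "." for y in outcomes)


print(hr_positions(outcomes))
print(gaps_between_hrs(outcomes))
print(text_rug(outcomes))
```

Long runs of dots correspond to the gaps in the plot, and tightly packed bars correspond to a hot stretch.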
Another way to visualize the streakiness is a moving average plot. Here I’m graphing the proportion of home runs hit in moving windows 30 PAs wide. Overall he is hitting home runs in about 8% of his PAs, but that percentage is approaching 20% in recent games. The plot also shows Stanton’s two home run slumps in the first half of the season.
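The moving-window calculation can be sketched in a few lines of Python. This is a simplified stand-in rather than the routine my package uses; the function name moving_hr_rate and the short test sequence are assumptions made for illustration, and the default width of 30 matches the window described above.

```python
def moving_hr_rate(outcomes, width=30):
    """Home run proportion in each window of `width` consecutive PAs.

    `outcomes` is a 0/1 sequence with one entry per plate appearance.
    Returns one rate per window, so the result has
    len(outcomes) - width + 1 entries (empty if the window does not fit).
    """
    if width <= 0 or width > len(outcomes):
        return []
    window_sum = sum(outcomes[:width])
    rates = [window_sum / width]
    # Slide the window one PA at a time: add the new PA, drop the oldest.
    for i in range(width, len(outcomes)):
        window_sum += outcomes[i] - outcomes[i - width]
        rates.append(window_sum / width)
    return rates


# Tiny hypothetical sequence with a window of 2 PAs, just to show the shape.
print(moving_hr_rate([0, 1, 0, 1, 1], width=2))
```

A wider window gives a smoother curve but reacts more slowly to a hot streak; a narrower one is noisier.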
How Many Home Runs Will Stanton Hit?
I have illustrated my prediction method before. First, I estimate the true home run rates of all hitters using a multilevel model. Then I simulate draws from the posterior predictive distribution of the number of home runs that Stanton will hit in the remainder of the season. I estimate his future number of at-bats from the number of games remaining and the number of at-bats that he has had this season. My best guess at the future number of HRs is about 12 (a total of 56 HR), but there is a reasonable chance that he will exceed Maris’ mark.
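To make the simulation step concrete, here is a simplified Python sketch. It is not the multilevel model itself: as a stand-in, it assumes a single Beta prior on Stanton’s home run rate, draws a rate from the resulting Beta posterior, and then draws future home runs from a binomial distribution. The current total of 44 HR follows from the numbers above (56 minus 12), but the at-bat counts, the prior parameters and the function names simulate_future_hr and prob_at_least are hypothetical values chosen for illustration.

```python
import random


def simulate_future_hr(hr, ab, future_ab, prior_a, prior_b,
                       n_sims=2000, seed=1):
    """Simulate posterior predictive draws of future home runs.

    Simplifying assumption: a Beta(prior_a, prior_b) prior on the HR rate
    replaces the multilevel model described in the text.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sims):
        # Draw a plausible true HR rate from the Beta posterior.
        p = rng.betavariate(prior_a + hr, prior_b + ab - hr)
        # Draw the future HR count as a sum of Bernoulli(p) at-bats.
        draws.append(sum(rng.random() < p for _ in range(future_ab)))
    return draws


def prob_at_least(draws, k):
    """Fraction of simulated seasons with at least k future home runs."""
    return sum(d >= k for d in draws) / len(draws)


current_hr = 44      # implied by the text: 56 total minus 12 predicted
current_ab = 440     # hypothetical at-bats so far
future_ab = 130      # hypothetical at-bats remaining
draws = simulate_future_hr(current_hr, current_ab, future_ab,
                           prior_a=5, prior_b=95)  # hypothetical prior

needed = 62 - current_hr  # future HR needed to exceed Maris' 61
print(sum(draws) / len(draws))
print(prob_at_least(draws, needed))
```

The spread of the simulated counts reflects both uncertainty about his true rate and the binomial luck in the remaining at-bats, which is why the chance of passing 61 is not negligible even when the best guess falls short of it.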
To do this work, I use the following packages:
baseballr: Bill Petti’s package for scraping the Statcast data
BayesTestStreak: my package for working with binary sequence data and looking for streaky patterns
BApredict: my package for collecting current hitting data from SI and implementing the prediction method
TeachBayes: a new package of mine with graphics to illustrate Bayesian thinking. I use its bar_plot function to produce the last graph.