In the last post, I described a general Bayesian methodology for learning about Daniel Murphy’s “true” home run rate and making predictions about his future home run performance. In one sense those were not true predictions, since I already knew how many opportunities (AB) Murphy would have in each series.
Let’s focus here on predicting the number of home runs hit by Murphy in the World Series that starts shortly. Following my work in the last blog post, I think a reasonable prior for Murphy’s home run probability p is a beta density with shape parameters 15.3 and 459.8. The prior mean is about 0.032; this is higher than his rate of hitting home runs in the 2015 regular season, but not that much higher. (The reader is encouraged to try other priors for p, especially if you believe Murphy will be HOT this week.)
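As a quick check on the prior, here is a minimal Python sketch (the R work for this study is separate) that computes the mean and standard deviation of a beta(15.3, 459.8) density from the standard closed-form formulas. The variable names are my own choices for illustration, and the standard deviation is an extra summary that is not quoted above.

```python
import math

# Shape parameters of the beta prior stated in the post.
a, b = 15.3, 459.8

# Closed-form moments of a beta(a, b) density.
prior_mean = a / (a + b)
prior_sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))

print(f"prior mean = {prior_mean:.3f}")  # about 0.032, as stated in the post
print(f"prior sd   = {prior_sd:.4f}")
```

Trying a different prior only requires changing a and b; a larger a relative to b shifts the mean upward, and a larger a + b makes the prior more concentrated.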
Here’s a picture of my prior:
What Else is Unknown?
What makes this prediction more interesting is that we don’t know the length (number of games) of the 2015 World Series, and we also don’t know how many at-bats Murphy will get in the individual games.
But we have some knowledge about the lengths of past World Series and also some information about the typical number of at-bats for Murphy in a game, and we can use this knowledge to formulate priors for the number of games and the number of AB for Murphy in this series.
Number of Games
My opinion is that the Royals and the Mets are pretty evenly matched, so it is reasonable to assume that each team is equally likely to win any given game. With this assumption, and assuming independence of game outcomes, we can get a probability distribution for the length of the series. (I don’t believe this distribution will be much different with other assumptions about the teams’ relative strengths.)
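The series-length distribution follows from a short counting argument: a best-of-seven series ends in exactly n games when the eventual winner takes game n and exactly three of the first n - 1 games. The Python sketch below works this out; the function name series_length_probs and its argument p_win (the chance one particular team wins a single game) are hypothetical names introduced here for illustration.

```python
from math import comb

def series_length_probs(p_win=0.5):
    """Probability that a best-of-seven series lasts 4, 5, 6, or 7 games,
    assuming independent games and a fixed single-game win probability."""
    q = 1 - p_win
    return {
        n: comb(n - 1, 3) * (p_win**4 * q**(n - 4) + q**4 * p_win**(n - 4))
        for n in range(4, 8)
    }

even = series_length_probs(0.5)
for n, prob in even.items():
    print(n, prob)
# Evenly matched teams: 4 -> 0.125, 5 -> 0.25, 6 -> 0.3125, 7 -> 0.3125
```

Calling series_length_probs with a value such as 0.6 is one way to see how sensitive the distribution is to the even-match assumption.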
Number of At-Bats
To learn about the number of AB of Murphy for a single game, we look at the number of at-bats for Murphy during the 2015 season when he was a regular. Using this data, I estimate that, in a single game, Murphy will get 3, 4, or 5 at-bats with respective probabilities .14, .69, .17.
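To make the at-bat distribution concrete, here is a small Python sketch that computes the expected number of at-bats per game from the probabilities above and draws at-bats for a hypothetical six-game series. The names ab_values, ab_probs, and the seed are illustrative choices of mine, not taken from the original analysis.

```python
import random

# Estimated single-game at-bat distribution from the post.
ab_values = [3, 4, 5]
ab_probs = [0.14, 0.69, 0.17]

# Expected at-bats in one game: 3(0.14) + 4(0.69) + 5(0.17) = 4.03.
expected_ab = sum(v * p for v, p in zip(ab_values, ab_probs))
print(f"expected AB per game = {expected_ab:.2f}")

# Draw at-bats for each game of a hypothetical six-game series.
rng = random.Random(1)  # arbitrary seed, only for reproducibility
series_ab = rng.choices(ab_values, weights=ab_probs, k=6)
print(series_ab, "total:", sum(series_ab))
```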
One Predictive Simulation
To predict the number of Murphy home runs for a single World Series by simulation, we
- Simulate a number of games
- Simulate the number of at-bats for each of the games; let the total number of at-bats be n
- Choose a true home run rate p at random from my beta(15.3, 459.8) prior
- Finally, simulate the number of home runs from a binomial distribution with size n and probability of success p
We repeat this process 10,000 times, obtaining a predictive distribution for the total number of home runs Murphy hits during the series.
Number of Home Runs   0      1      2      3      4      5
Probability           0.473  0.343  0.137  0.038  0.008  0.001
So I’m 95 percent confident that Daniel Murphy will hit 2 or fewer home runs during the 2015 World Series. Of course, this statement is based on my priors on Murphy’s true home run ability, the length of the series, and the number of AB in each game.
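The simulation steps described above can be sketched in Python using only the standard library. This is an illustrative reimplementation under the stated assumptions (evenly matched teams, independent games, the 3/4/5 at-bat probabilities, and the beta(15.3, 459.8) prior), not the original R code; the function names and the seed are hypothetical. Because it is a simulation, the output will be close to, but not exactly equal to, the probabilities in the table.

```python
import random
from collections import Counter
from math import comb

A, B = 15.3, 459.8                                   # beta prior shape parameters
AB_VALUES, AB_PROBS = [3, 4, 5], [0.14, 0.69, 0.17]  # at-bats per game
GAMES = [4, 5, 6, 7]
# Series-length probabilities for evenly matched teams.
GAME_PROBS = [2 * comb(n - 1, 3) * 0.5**n for n in GAMES]

def simulate_series(rng):
    """One draw from the predictive distribution of home runs in a series."""
    n_games = rng.choices(GAMES, weights=GAME_PROBS)[0]
    n_ab = sum(rng.choices(AB_VALUES, weights=AB_PROBS, k=n_games))
    p = rng.betavariate(A, B)  # true home run rate drawn from the prior
    # Binomial(n_ab, p) draw, built as a sum of Bernoulli trials.
    return sum(rng.random() < p for _ in range(n_ab))

def predictive_distribution(n_sims=10_000, seed=2015):
    """Relative frequency of each home run count over n_sims simulated series."""
    rng = random.Random(seed)
    counts = Counter(simulate_series(rng) for _ in range(n_sims))
    return {k: counts[k] / n_sims for k in sorted(counts)}

dist = predictive_distribution()
for k, prob in dist.items():
    print(k, round(prob, 3))
```

The binomial draw is written as a sum of Bernoulli trials so the sketch runs on older Python versions that lack a built-in binomial sampler.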
Shiny App to Experiment with Different Priors
If you have a different opinion about Murphy’s home run talent during the 2015 World Series, I’ve built a Shiny app where you can play with different beta(a, b) priors for p and see the impact of these priors on the predictive distribution of home runs for Murphy. Just click here to see my Shiny app.
Also, all of the R work for Parts I and II of this Murphy study can be found at my gist site.