Predictive Checking of a Streaky Model

Introduction

This is part 3 of a series of posts on Extreme Ofers and Predictive Checking. In part 1 of this series, I explored some extreme ofer (streaks of consecutive outs) values in recent MLB history. In part 2 of this series, I tried to interpret the meaning of a “0 for 33” slump by introducing a coin-flipping model and illustrating how Bayesian predictive checking indicates that this extreme slump is unusual for a coin-flipping model. In this last post, I’ll describe an attractive alternative streaky model for baseball hitting and show how predictive checking can be used for this model.

A Streaky Model for Hitting

It has been established that the probability of a hit on a single AB for a particular hitter likely changes during a season. There is a special Markov switching model that describes how these hit probabilities change. Let’s suppose that during a game, this particular hitter is either “hot” with a hit probability of pH or “cold” with a hit probability of pC. Moreover, if this hitter is hot (or cold) for a particular game, he is likely, say with probability 0.9, to remain in the same state (and with probability 0.1 to switch to the other state) for the following game. This describes a discrete Markov chain with states {hot, cold} and probability transition matrix (rows are the current game's state, columns are the next game's state, both ordered hot, cold)

$$P = \begin{pmatrix} 0.9 & 0.1 \\ 0.1 & 0.9 \end{pmatrix}$$

Once we’ve decided if the player is hot or cold for a particular game, then the AB outcomes are independent Bernoulli outcomes with the corresponding hit probability (pH or pC).
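Two properties of this chain follow directly from the staying probability and help in reading the simulations later on. The short derivation below is a sketch; the symbols $\rho$ (the staying probability, 0.9 here), $\pi$ (the long-run share of games in each state), and $L$ (the number of consecutive games spent in a state once it is entered) are notation introduced only for this illustration.

```latex
% Long-run (stationary) distribution of the symmetric two-state chain
P = \begin{pmatrix} \rho & 1-\rho \\ 1-\rho & \rho \end{pmatrix},
\qquad \pi P = \pi
\;\Longrightarrow\;
\pi = \left(\tfrac{1}{2}, \tfrac{1}{2}\right)

% Length of a hot or cold spell is geometric: P(L = k) = \rho^{k-1}(1-\rho)
E[L] = \frac{1}{1-\rho} = \frac{1}{0.1} = 10 \text{ games}
```

So in the long run the hitter is hot in about half of his games, and a hot or cold spell lasts about 10 games on average. If at-bats are spread roughly evenly across games, the long-run hit probability is then close to (pH + pC)/2.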

Priors

To complete this model, we need to assign priors to the hot and cold hitting probabilities pH and pC. If we think that overall, this batter is a .250 hitter, then I suppose one might assign pH (the hot probability) a beta prior centered about some larger value, say .400, and pC (the cold probability) a beta prior centered about a small value like .150.

Predictive Checking

Once we have defined this Markov switching model (including priors), then it is easy to implement the predictive checking method described in part 2 of these posts. Here are the steps:

  • Simulate replicated data from the model. You first simulate values of the hot and cold probabilities from the beta priors and use the Markov chain to simulate hit probabilities for all games in the season. Then you simulate hit/out data using independent Bernoulli distributions.
  • Compute a checking function from the simulated data. One checking function would be the maximum length of an ofer.
  • Repeat the first two steps many times, collecting values of the checking function. If we use the maximum ofer as the checking function, this represents the predictive distribution of the maximum length of an ofer.

Now you compare the observed value of the checking function with this predictive distribution. If the observed value is in the middle of this distribution, then the model is predicting values similar to what was observed — the model is performing well. Otherwise, if the observed value is in the tail of the predictive distribution, then there is some issue with the model — it is not predicting what you observed.
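The steps above can be sketched in a few lines of code. This is a minimal Python illustration, not the code behind the Shiny app (which is written in R). The function names, the beta shape parameters, the equal chance of starting hot or cold, and the even split of at-bats across games are all assumptions made for the example; the app itself would use the player's actual at-bats per game.

```python
import random

def simulate_max_ofer(rng, ab_per_game, prior_hot, prior_cold, rho=0.9):
    """Simulate one replicated season and return the longest run of outs."""
    # Draw the hot and cold hit probabilities from their beta priors.
    p_hot = rng.betavariate(*prior_hot)
    p_cold = rng.betavariate(*prior_cold)
    # Assumption: the first game is hot or cold with equal probability.
    hot = rng.random() < 0.5
    longest = current = 0
    for n_ab in ab_per_game:
        p = p_hot if hot else p_cold
        for _ in range(n_ab):
            if rng.random() < p:   # a hit ends the current ofer
                current = 0
            else:                  # an out extends it
                current += 1
                longest = max(longest, current)
        # Stay in the same state with probability rho, otherwise switch.
        if rng.random() >= rho:
            hot = not hot
    return longest

def tail_probability(observed, ab_per_game, prior_hot, prior_cold,
                     rho=0.9, n_sims=2000, seed=1):
    """Share of simulated seasons whose max ofer is at least `observed`."""
    rng = random.Random(seed)
    sims = [simulate_max_ofer(rng, ab_per_game, prior_hot, prior_cold, rho)
            for _ in range(n_sims)]
    return sum(s >= observed for s in sims) / n_sims, sims

# Hypothetical inputs: 570 at-bats spread as evenly as possible over 158
# games, and beta priors chosen only for illustration.
games, at_bats = 158, 570
ab_per_game = [at_bats // games + (g < at_bats % games) for g in range(games)]
prob, sims = tail_probability(26, ab_per_game, (43, 57), (15, 85))
print(f"Estimated P(max ofer >= 26): {prob:.3f}")
```

A tail probability near 0 or 1 would place the observed value in a tail of the predictive distribution; a value in between says the model predicts ofers like the one observed.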

Illustrating Using a Shiny App

As the reader might suspect, I will illustrate this predictive approach by means of a Shiny app.

One inputs the name of a 2019 player — here I chose Rhys Hoskins. The model assumes that Hoskins for a particular game is either a hot hitter with a success probability pH or a cold hitter with success probability pC. We assume Hoskins moves between hot and cold states in games by means of a Markov chain with staying probability rho that we are setting to 0.9.

Note the priors that I am placing on the hot and cold probabilities. When Hoskins is hot, I am assuming (with probability .9) that pH is between .405 and .455, and when he is cold, pC is between .12 and .17 with probability .9.
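One way to turn a statement like "pH is between .405 and .455 with probability .9" into beta shape parameters is a normal approximation: put the mean at the midpoint and set the standard deviation so that the interval spans about 1.645 standard deviations on each side. The sketch below does this in Python and checks the result by simulation. It is an assumption about how such a prior could be matched, not a description of how the app does it, and the function names are made up for the example.

```python
import random

def beta_from_interval(lo, hi, z=1.645):
    """Approximate Beta(a, b) whose central 90% interval is (lo, hi).

    Normal approximation: mean at the midpoint, hi - lo = 2 * z * sd.
    """
    mean = (lo + hi) / 2
    sd = (hi - lo) / (2 * z)
    # For Beta(a, b): variance = mean * (1 - mean) / (a + b + 1).
    total = mean * (1 - mean) / sd**2 - 1
    return mean * total, (1 - mean) * total

def empirical_interval(a, b, n=20000, seed=1):
    """Monte Carlo estimate of the 5th and 95th percentiles of Beta(a, b)."""
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(a, b) for _ in range(n))
    return draws[int(0.05 * n)], draws[int(0.95 * n)]

a_hot, b_hot = beta_from_interval(0.405, 0.455)
a_cold, b_cold = beta_from_interval(0.12, 0.17)
print("hot: ", round(a_hot, 1), round(b_hot, 1), empirical_interval(a_hot, b_hot))
print("cold:", round(a_cold, 1), round(b_cold, 1), empirical_interval(a_cold, b_cold))
```

The approximation is slightly off for the cold prior because a beta distribution with a small mean is a little skewed, so the simulated interval is worth checking before relying on it.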

The app implements this predictive simulation approach. It simulates values of pH and pC, simulates hit probabilities for all games, simulates hit/out data for all at-bats, and finds the length of the longest ofer.

Here is an illustration of how this simulation works. The simulated values of the two hitting probabilities (from my priors) are pH = 0.453 and pC = 0.148. We simulate game hitting probabilities from a Markov chain; here is a graph of the hitting probabilities over the 158 games that Hoskins played during the 2019 season. We see the strong streaky behavior in the sequence of probabilities.

Using these probabilities, we simulate hitting outcomes for Hoskins’ 570 at-bats during this season. Here are the simulated values (1 corresponds to a hit, 0 to an out).

  [1] 1 0 0 1 1 0 1 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 1
 [30] 1 0 0 0 0 1 0 0 0 0 1 1 0 0 1 0 0 0 0 0 1 0 1 0 1 1 0 0 0
 [59] 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0
 [88] 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 0 0 1 0 1 1 1 0 0 0 1 0 1 1
[117] 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
[146] 0 1 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
[175] 1 0 1 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 0 0 0 1
[204] 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
[233] 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 0 0
[262] 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0
[291] 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 0 0 1 1
[320] 0 1 1 1 1 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 1 0 1 0 0 0 0 1 1
[349] 1 1 1 0 1 1 1 0 1 0 1 0 0 1 0 1 1 1 0 0 0 1 1 1 0 0 1 0 1
[378] 1 0 1 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0
[407] 1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0
[436] 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0
[465] 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0
[494] 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 1 0 0 0 0 0 0 1 0 0
[523] 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0 0 0
[552] 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0

From this simulated hitting data, we find all ofers — here the maximum ofer length is equal to 19.
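Finding the ofers amounts to collecting the lengths of the runs of 0s in the hit/out sequence. A small Python sketch of that step follows; the function names are hypothetical, and the example input is the first row of 29 simulated at-bats from the output above.

```python
from itertools import groupby

def ofer_lengths(outcomes):
    """Lengths of every run of consecutive outs (0s) in a hit/out sequence."""
    return [len(list(run)) for value, run in groupby(outcomes) if value == 0]

def max_ofer(outcomes):
    """Longest run of consecutive outs; 0 if there are no outs at all."""
    return max(ofer_lengths(outcomes), default=0)

# First 29 simulated at-bats from the output above (1 = hit, 0 = out).
first_row = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             1, 0, 0, 1, 0, 1, 0, 1]
print(ofer_lengths(first_row))  # [2, 1, 3, 9, 2, 1, 1]
print(max_ofer(first_row))      # 9
```

Applied to all 570 simulated at-bats, the same function would return the maximum ofer length of 19 reported here.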

A snapshot of my Shiny app follows. The histogram shows the predictive distribution of the max ofer length over repeated simulations. In 2019, Hoskins’ longest ofer length was 26, which is represented by the vertical bar. Here the tail probability, the chance that a simulated maximum ofer length is at least as large as 26, is 0.396. The takeaway is that Hoskins’ 0 for 26 is pretty consistent with predictions from this Markov switching model. In other words, this streaky model can predict long ofers.

By using this app, you can experiment with different players or choices of the priors on the hot and cold probabilities. By the way, if you place the same prior on pH and pC, this is similar to assuming that the player has only a single hitting probability during the season.

Closing Comments

  • Ease of Predictive Checking. I thought this was a neat example since it is so easy to implement. It is straightforward to simulate parameters from the streaky model, and then simulate hitting data given the parameters.
  • Fitting the Markov Switching Model. A different problem is to fit this streaky model to a sequence of hit/out data for a player in a season. This is more challenging since the likelihood function is more complicated, but there are attractive MCMC fitting algorithms available.
  • Other Uses of this Model? Rob Arthur and Greg Matthews applied this same Markov switching model in a 538.com article to look at sequences of fastball speeds of MLB pitchers. They used this model fit to claim that particular pitchers are streaky. I was not enthused with this particular application. In a separate post, I explained my objections and gave Greg the opportunity to respond. Rob and I also appeared in a Baseball Prospectus podcast on the 538 article.
  • Other Streaky Models? This Markov switching model is just one way to think about streaky ability. Perhaps it would be more realistic to think that there are three or more possible states, for example.
  • R Code? This Shiny app is included in my ShinyBaseball package — the package contains the Shiny code and also the associated Retrosheet 2019 batting data.