In the last post, I devised a simple model for run scoring in a half-inning. Based on the probabilities of the different PA events and runner advancement probabilities, I wrote a function to simulate the number of runs scored, and the patterns produced by this model seem to resemble real run scoring in baseball.
I want to use my work to address the following problem. Suppose a team scores 10 runs in a game. I would like to measure the so-called “cluster luck” of this run scoring. That is, how many of these 10 runs can be attributed to the fact that the team was able to cluster its on-base events? (By the way, this problem has been addressed in other ways; in particular, this post uses a measure which compares the actual runs scored with a runs-created formula.)
Here is my approach. As an example, let’s look at the June 12, 2015 game between the Red Sox and the Blue Jays where the Red Sox scored 10 runs. Using Retrosheet play-by-play data, it was straightforward to extract all of the outcomes (in sequence) of the PA’s during this game.
 [1] "OUT" "BB"  "2B"  "1B"  "OUT" "HR"  "HR"  "1B"  "1B"  "BB"  "OUT"
[12] "OUT" "1B"  "OUT" "OUT" "OUT" "1B"  "BB"  "HR"  "OUT" "OUT" "OUT"
[23] "OUT" "OUT" "OUT" "OUT" "1B"  "OUT" "OUT" "OUT" "OUT" "OUT" "OUT"
[34] "OUT" "BB"  "OUT" "OUT" "BB"  "1B"  "BB"  "OUT" "OUT" "BB"  "OUT"
[45] "2B"  "OUT"
If one tabulates these outcomes …

events
 1B  2B  BB  HR OUT
  7   2   7   3  27

we see that the Red Sox had 12 hits (7 singles, 2 doubles, and 3 home runs), 7 walks, and 27 outs for the nine innings they batted in this game. Looking at the linescore, it seems that the Red Sox’s scoring exhibited some clustering as they scored 8 of their 10 runs in the 1st and 3rd innings.
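The tabulation above can be reproduced with R's table() function. This is a minimal sketch: the outcomes are typed in by hand from the listing above rather than extracted from Retrosheet, and the variable names events and tab are my own choice for illustration.

```r
# PA outcomes for the Red Sox, in game order (copied from the listing above)
events <- c(
  "OUT", "BB", "2B", "1B", "OUT", "HR", "HR", "1B", "1B", "BB", "OUT",
  "OUT", "1B", "OUT", "OUT", "OUT", "1B", "BB", "HR", "OUT", "OUT", "OUT",
  "OUT", "OUT", "OUT", "OUT", "1B", "OUT", "OUT", "OUT", "OUT", "OUT", "OUT",
  "OUT", "BB", "OUT", "OUT", "BB", "1B", "BB", "OUT", "OUT", "BB", "OUT",
  "2B", "OUT"
)

# Count how often each outcome occurs
tab <- table(events)
print(tab)
```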
Suppose there is really no true clustering in the arrangement of the 46 PA outcomes above. That would suggest that all possible arrangements of these symbols are equally likely. (Actually, not all arrangements are possible since we know the last inning must end with an OUT.)
Here’s my method.
1. Using the sample function in R, I randomly mix up these 46 symbols. Here is one of these random permutations:

 [1] "BB"  "OUT" "OUT" "1B"  "HR"  "OUT" "OUT" "OUT" "OUT" "1B"  "OUT"
[12] "1B"  "OUT" "OUT" "OUT" "1B"  "HR"  "OUT" "OUT" "OUT" "BB"  "1B"
[23] "1B"  "OUT" "2B"  "1B"  "OUT" "OUT" "HR"  "OUT" "2B"  "OUT" "BB"
[34] "OUT" "OUT" "BB"  "OUT" "OUT" "OUT" "BB"  "OUT" "OUT" "BB"  "OUT"
[45] "BB"  "OUT"

Based on the outs, I can partition the mixed-up PA’s into nine innings.

2. Using my run-scoring algorithm (described in the last post) for each inning, I simulate the runs scored for this mixed-up game given this random arrangement of symbols. (I say “simulate”, since there is randomness in the runner advancement in my scoring algorithm.)

3. I repeat steps 1 and 2 a large number of times, obtaining a simulated distribution of runs scored in the game if there is no true clustering. This represents the game run scoring that one would predict if these 7 singles, 2 doubles, 7 walks, 3 home runs, and 27 outs just occurred in some random fashion during the game.

4. I compare the actual run scoring with this simulated distribution. If the actual runs scored is “large” relative to the simulated distribution, then that would indicate some “cluster luck”.
(There is a quibble here. Since my run-scoring model ignores run-contributing events such as steals, sacrifices, and wild pitches, it seems only fair in my comparison to also apply the same run-scoring algorithm to the actual sequence of PA events. So in the graph below, the “simulated” histogram refers to the simulation assuming the “all possible arrangements are equally likely” model, and the “observed” histogram refers to model simulations based on the observed PA sequence.)
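The method can be sketched in R as follows. This is only an illustration under stated assumptions, not the code behind the results in this post. The function names (runs_in_inning, split_innings, permute_game, game_runs) are hypothetical. The scoring function is a simplified stand-in with fixed runner advancement, whereas the algorithm from the last post advances runners at random. Pinning the final OUT in place is my way of respecting the "last inning must end with an OUT" restriction, and the seed and the 1000 repetitions are arbitrary.

```r
events <- c(
  "OUT", "BB", "2B", "1B", "OUT", "HR", "HR", "1B", "1B", "BB", "OUT",
  "OUT", "1B", "OUT", "OUT", "OUT", "1B", "BB", "HR", "OUT", "OUT", "OUT",
  "OUT", "OUT", "OUT", "OUT", "1B", "OUT", "OUT", "OUT", "OUT", "OUT", "OUT",
  "OUT", "BB", "OUT", "OUT", "BB", "1B", "BB", "OUT", "OUT", "BB", "OUT",
  "2B", "OUT"
)

# ASSUMPTION: deterministic stand-in for the run-scoring algorithm.
# Singles move every runner one base, doubles two bases, walks only force.
runs_in_inning <- function(pa) {
  bases <- c(FALSE, FALSE, FALSE)  # first, second, third
  runs <- 0
  for (e in pa) {
    if (e == "HR") {
      runs <- runs + sum(bases) + 1
      bases <- c(FALSE, FALSE, FALSE)
    } else if (e == "2B") {
      runs <- runs + sum(bases[2:3])
      bases <- c(FALSE, TRUE, bases[1])
    } else if (e == "1B") {
      runs <- runs + bases[3]
      bases <- c(TRUE, bases[1], bases[2])
    } else if (e == "BB") {
      runs <- runs + all(bases)  # a run scores only with the bases loaded
      bases <- c(TRUE, bases[1] || bases[2], bases[3] || (bases[1] && bases[2]))
    }
    # "OUT": runners hold in this simplified model
  }
  runs
}

# Assign each PA to an inning using the number of outs recorded before it
split_innings <- function(pa) {
  outs_before <- cumsum(pa == "OUT") - (pa == "OUT")
  split(pa, outs_before %/% 3 + 1)
}

# Random arrangement of the symbols, keeping an OUT as the last PA
permute_game <- function(pa) {
  last_out <- max(which(pa == "OUT"))
  c(sample(pa[-last_out]), "OUT")
}

# Total runs in a game: score each inning and add up
game_runs <- function(pa) sum(sapply(split_innings(pa), runs_in_inning))

set.seed(2015)
sim_runs <- replicate(1000, game_runs(permute_game(events)))
obs_runs <- game_runs(events)  # same scoring rule applied to the actual order

print(table(sim_runs))
print(obs_runs)
print(mean(sim_runs >= obs_runs))  # share of random arrangements scoring at least as many runs
```

Because this stand-in has no randomness in the runner advancement, the observed sequence yields a single number rather than a distribution. Swapping in a random-advancement scoring function for runs_in_inning would give histograms for both the observed and the permuted sequences.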
I show two histograms below for this particular Red Sox pattern of PA’s. The top one represents the runs scored using the actual sequence of PA outcomes, and the bottom histogram represents what one would predict using random arrangements of the PA symbols.
In this case, the results are a bit surprising. Using the actual PA sequence, the Red Sox would score 9-10 runs in this game; this tends to understate the actual runs scored (10) a little, since my run-scoring model ignores stealing, sacrifices, etc. In the random arrangement model, the Red Sox would also score 9 runs on average, although the simulated runs ranged between 5 and 13. For this particular game I would say that the cluster luck effect in run scoring is pretty small.
Although this work represents a first look at this run clustering issue, it looks promising. One should be able to use this method to compare teams in their abilities to cluster on-base events to score runs. Obviously this is an important issue for a baseball manager. For example, the Phillies manager currently plays with the batting order of his hitters. Using this method, can one offer some guidance on an optimal batting order to make the most efficient use of the run clustering effect?