With all the excitement surrounding playoff baseball this time of year, lots of people are talking about our favorite game. If your Twitter feed is like mine, then you’re seeing a lot of talk about what is going on in these games, but also a lot of criticism of the narratives promulgated by the mainstream media. Today, I’d like to explore one of the most popular narratives that comes up during playoff time: the notion that “good pitching beats good hitting in the playoffs.”

This argument was particularly germane to this year’s American League Championship Series, which pitted the offensively-challenged Kansas City Royals against the slugging Baltimore Orioles. As has been noted by many, the Royals ranked last in the AL in home runs this year, while the Orioles ranked first. Even by more sabermetric standards, the Orioles (4th in OPS+) were clearly a superior offensive team to the Royals (last in OPS+).

But of course, the real story is a bit more mixed, since there are ways to score runs that aren’t captured by home runs and OPS. The Royals led the AL in stolen bases, while the Orioles ranked last. And so the narrative of a “study in contrasts” was easy to latch onto. But the bottom line is that the Orioles ranked 6th in the AL in runs per game, while the Royals ranked 9th. Both teams had good pitching and defense, ranking 3rd and 4th in runs allowed per game.

In any case, I’m going to choose to interpret the phrase: “good pitching beats good hitting in the playoffs” to mean that “In the playoffs, teams that excel at run prevention during the regular season will fare better than teams that excel at run production during the regular season.” In this exercise we will use logistic regression to address the veracity of this claim.

#### Normalization

We can get all the data we need from the venerable `Lahman` database, packaged for R. I’ll be using functionality included in the `mosaic` package – most notably `dplyr`, Hadley Wickham‘s latest improvement to the R data wrangling diaspora.

```r
require(Lahman)
require(mosaic)
```

Baseball expanded in 1969, and for the first time began to include multiple playoff rounds, so we’re going to restrict our study to seasons from 1969 through 2013, excluding the 1994 season, during which there were no playoffs.

```r
ds = filter(Teams, yearID >= 1969 & yearID != 1994)
```

Next, we want a measurement of each team’s offensive and defensive prowess. Runs are the bottom line in baseball, and so we can just focus on **runs scored** and **runs allowed**. However, we should contextualize these numbers by league and season, since we know that the run-scoring environment has changed dramatically over time. Since runs are approximately normally distributed, let’s do this by computing a z-score for each team’s runs scored and runs allowed during the regular season. To do this, we need to know the mean and standard deviation of runs scored and allowed in each league and each season. `dplyr`‘s `group_by` construction makes this easy.

```r
lg.avg = summarise(group_by(Teams, yearID, lgID),
                   RS.bar = mean(R), RA.bar = mean(RA),
                   RS.sigma = sd(R), RA.sigma = sd(RA))
head(lg.avg)
```

```
## Source: local data frame [6 x 6]
## Groups: yearID
##
##   yearID lgID   RS.bar   RA.bar  RS.sigma  RA.sigma
## 1   1871   NA 295.4444 295.4444  80.95541  41.20107
## 2   1872   NA 308.1818 308.1818 203.28247 111.47181
## 3   1873   NA 397.7778 397.7778 235.12379 115.94048
## 4   1874   NA 433.6250 433.6250 151.90781  54.11083
## 5   1875   NA 325.6923 325.6923 258.17448 117.45381
## 6   1876   NL 383.2500 383.2500 128.02762 132.31105
```

Notice that the mean `RS` and `RA` are necessarily the same for all years before interleague play, since within a closed league every run scored by one team is a run allowed by another.

```r
favstats(yearID ~ RS.bar == RA.bar, data=lg.avg)
```

```
##   .group  min      Q1 median      Q3  max     mean       sd   n missing
## 1  FALSE 1997 2001.50 2005.5 2009.25 2013 2005.250  5.02253  32       0
## 2   TRUE 1871 1909.25 1938.0 1967.75 2001 1937.592 35.24307 238       0
```

However, by coincidence, the total `RS` and `RA` in each league in 2001 were exactly the same, despite interleague play!

```r
filter(lg.avg, yearID == 2001)
```

```
## Source: local data frame [2 x 6]
## Groups: yearID
##
##   yearID lgID   RS.bar   RA.bar RS.sigma RA.sigma
## 1   2001   AL 786.6429 786.6429 84.48086 94.68783
## 2   2001   NL 761.6250 761.6250 72.18945 73.46825
```

In order to compute the z-scores, we’ll have to `merge` the league averages into our team data.

```r
df = merge(x=ds, y=lg.avg)
```

We can now compute the z-scores themselves. While we’re at it, let’s also compute each team’s regular season winning percentage, and create an indicator variable for whether it made the playoffs.
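For the record, the normalization we are about to apply is just the usual z-score, computed within each league-season:

$$ RS.z = \frac{R - \overline{RS}}{\sigma_{RS}}, \qquad RA.z = \frac{RA - \overline{RA}}{\sigma_{RA}} $$

so that, for example, $RS.z = 1$ means a team scored one standard deviation more runs than its league-season average, while a negative $RA.z$ means better-than-average run prevention.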

```r
df <- mutate(df, RS.z = (R - RS.bar)/RS.sigma,
                 RA.z = (RA - RA.bar)/RA.sigma)
df <- mutate(df, WPct = W / (W + L))
# %in% treats the NAs (e.g. WCWin in pre-wild-card seasons) as FALSE
df <- mutate(df, madePlayoffs = LgWin %in% "Y" | DivWin %in% "Y" | WCWin %in% "Y")
```

#### Exploratory Data Analysis

We know from the success of Bill James’ Pythagorean model for expected winning percentage that you can do a pretty decent job of predicting winning percentage using just runs scored and runs allowed. The following scatterplot highlights the teams that made the playoffs, in terms of their normalized `RS` and `RA`.
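As a reminder, the classical form of the Pythagorean model estimates winning percentage from runs scored and runs allowed alone (later refinements replace the exponent 2 with values closer to 1.83):

$$ \widehat{WPct} = \frac{R^2}{R^2 + RA^2} $$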

```r
require(latticeExtra)
p1 <- xyplot(RS.z ~ RA.z, groups = madePlayoffs, data=df, pch=19, alpha=0.5,
             xlab="Normalized Runs Allowed (fewer is better)",
             ylab="Normalized Runs Scored (more is better)")
p1 <- p1 + layer(panel.abline(v=0, col="darkgray"))
p1 <- p1 + layer(panel.abline(h=0, col="darkgray"))
p1
```

Note that only one team has made the playoffs with below-average run scoring (`RS.z < 0`) **and** above-average runs allowed (`RA.z > 0`): the 1987 Minnesota Twins, who incidentally won the World Series. Go figure.

```r
filter(df, RS.z < 0 & RA.z > 0 & madePlayoffs == TRUE)
```

```
##   yearID lgID teamID franchID divID Rank   G Ghome  W  L DivWin WCWin
## 1   1987   AL    MIN      MIN     W    1 162    81 85 77      Y  <NA>
##   LgWin WSWin   R   AB    H X2B X3B  HR  BB  SO  SB CS HBP SF  RA  ER  ERA
## 1     Y     Y 786 5441 1422 258  35 196 523 898 113 65  NA NA 806 734 4.63
##   CG SHO SV IPouts   HA HRA BBA SOA  E  DP   FP            name
## 1 16   4 39   4281 1465 210 564 990 98 147 0.98 Minnesota Twins
##                         park attendance BPF PPF teamIDBR teamIDlahman45
## 1 Hubert H Humphrey Metrodome    2081976 103 103      MIN            MIN
##   teamIDretro   RS.bar   RA.bar RS.sigma RA.sigma       RS.z     RA.z
## 1         MIN 793.7143 793.7143 54.09028 76.35386 -0.1426187 0.160905
##        WPct madePlayoffs
## 1 0.5246914         TRUE
```

#### Modeling

Anyway, it does appear from the plot that there are more playoff teams with better-than-average defenses and below-average offenses than the converse. But this does not really address our question. We want to know whether, *among playoff teams*, those with good pitching fared better than those with good hitting.

One way to do this is to fit a logistic regression model. Since there has always been a league championship winner, and I hope we can all agree that winning a pennant, even if you do not win the World Series, represents postseason success (ahem…sorry, Oakland), we can use that as our binary response variable. The explanatory variables for the model will be the normalized runs scored and runs allowed.
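In symbols, the model we are about to fit is

$$ \log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 \cdot RS.z + \beta_2 \cdot (-RA.z), $$

where $p$ is the probability of winning the pennant. The sign of $RA.z$ is flipped so that larger values of both predictors are better, which makes the two coefficients directly comparable.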

```r
mod <- glm(LgWin == "Y" ~ RS.z + I(-RA.z), family=binomial,
           data=filter(df, madePlayoffs == TRUE))
# summary(mod)
exp(coef(mod))
```

```
## (Intercept)        RS.z    I(-RA.z)
##   0.1501652   1.7348711   2.3617084
```

```r
exp(confint(mod))
```

```
##                  2.5 %    97.5 %
## (Intercept) 0.07260267 0.2941838
## RS.z        1.18067222 2.5923080
## I(-RA.z)    1.49411083 3.8474414
```

The odds ratios shown above are revealing. This model suggests that a playoff team with a one standard deviation greater ability to score runs multiplies its odds of winning the pennant by about 1.73, on average, after controlling for its ability to prevent runs, with a 95% confidence interval for this odds ratio of [1.18, 2.59]. Keep in mind that a one standard deviation difference in run scoring is a pretty large amount, but in that event, the odds of winning the pennant nearly double. However, a one standard deviation decrease in runs allowed corresponds to a 2.36-fold increase in the odds of winning the pennant, on average, after controlling for the team’s ability to score runs, with a 95% confidence interval of [1.49, 3.85]. Thus, the returns to run prevention, in terms of winning the pennant, are significantly better than the returns to run scoring. In other words, if you had to choose between being better on defense and being better on offense, you would choose the former.
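Because the model is additive on the log-odds scale, these odds ratios multiply. For example, a playoff team that is one standard deviation better than average on *both* sides of the ball multiplies its odds of winning the pennant, relative to an average playoff team, by roughly

$$ 1.7349 \times 2.3617 \approx 4.10. $$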

Note: we haven’t controlled for winning percentage here, but I don’t think we really need to, since we already know that `RS` and `RA` provide a pretty good model for `WPct`. I ran the model with a `WPct` term as well, and while the coefficients are obviously different, their ratio was similar.

In the plot below, we show how the estimated probability of winning the pennant changes as a function of a team’s normalized `RS`, for five different values of their normalized `RA`. The thickest curve in the middle of the plot represents an average defensive team. The less thick lines above (below) that curve correspond to teams with run prevention abilities one standard deviation better (worse) than average, respectively. Similarly, the thinnest lines show the model estimates for teams that are two standard deviations above or below average in terms of run prevention.

```r
xyplot(LgWin == "Y" ~ RS.z, data=filter(df, madePlayoffs), alpha=0.5, pch=19,
       xlim=c(-3.5, 3.5), xlab="Normalized Runs Scored",
       ylab="Probability of Winning the Pennant")
fmod = makeFun(mod)
plotFun(fmod(RS.z = x, RA.z = 2) ~ x, add=TRUE, col="darkgray", lwd=1)
plotFun(fmod(RS.z = x, RA.z = 1) ~ x, add=TRUE, col="darkgray", lwd=2)
plotFun(fmod(RS.z = x, RA.z = 0) ~ x, add=TRUE, col="darkgray", lwd=3)
plotFun(fmod(RS.z = x, RA.z = -1) ~ x, add=TRUE, col="darkgray", lwd=2)
plotFun(fmod(RS.z = x, RA.z = -2) ~ x, add=TRUE, col="darkgray", lwd=1)
ladd(panel.text(0, 0.5, "Better Than Average Run Prevention", adj=0))
ladd(panel.text(0, 0.2, "Average Run Prevention", adj=0))
```

In the corresponding plot in terms of `RA`, we can see how the probability of winning the pennant does not increase as rapidly.

```r
xyplot(LgWin == "Y" ~ RA.z, data=filter(df, madePlayoffs), alpha=0.5, pch=19,
       xlim=c(-3.5, 3.5), xlab="Normalized Runs Allowed",
       ylab="Probability of Winning the Pennant")
plotFun(fmod(RA.z = x, RS.z = 2) ~ x, add=TRUE, col="darkgray", lwd=1)
plotFun(fmod(RA.z = x, RS.z = 1) ~ x, add=TRUE, col="darkgray", lwd=2)
plotFun(fmod(RA.z = x, RS.z = 0) ~ x, add=TRUE, col="darkgray", lwd=3)
plotFun(fmod(RA.z = x, RS.z = -1) ~ x, add=TRUE, col="darkgray", lwd=2)
plotFun(fmod(RA.z = x, RS.z = -2) ~ x, add=TRUE, col="darkgray", lwd=1)
ladd(panel.text(-1.5, 0.5, "Better Than Average Run Scoring", adj=0))
ladd(panel.text(-1, 0.2, "Average Run Scoring", adj=0))
```

So maybe there is something to this notion of “good pitching beats good hitting in the playoffs” after all…