Graphically compare pitchers to contemporaries
Last time we provided code to display seasonal ERA for a pitcher compared to that of his contemporaries.
In this post we will turn that code into a function, so that by simply passing the pitcher name to the function, the plot will be displayed.
While we are at it, let’s make some improvements to the code and the resulting plot:
- We allow stats other than ERA to be chosen for the comparison.
- We add a legend, as suggested in the final paragraph of the previous post.
Before writing the function, let’s load the relevant data (the code is the same as one week ago).
options(stringsAsFactors=F)
#set working dir
setwd("your/directory/containing/Lahman/DB")
#read data
master = read.csv("Master.csv")
pitching = read.csv("Pitching.csv")
And now, on with the function, code first, explanations later.
hofChart = function(pitcher, stat){
  require(doBy)
  require(ggplot2)

  # season totals by pitcher
  pitching = summaryBy(ER + IPouts + SO + BB ~ playerID + yearID
                       , data=pitching, FUN=sum, keep.names=T)

  # calculate stats (you can add your own too, e.g.: HR/9, FIP, ...)
  pitching$ERA = pitching$ER * 27 / pitching$IPouts
  pitching$K9 = pitching$SO * 27 / pitching$IPouts
  pitching$W9 = pitching$BB * 27 / pitching$IPouts

  # get selected pitcher's data
  pitID = subset(master, paste(nameFirst, nameLast)==pitcher)$playerID
  pitData = subset(pitching, playerID==pitID)

  # get contemporaries of selected pitcher (qualifying only)
  contemporaries = subset(pitching
                          , yearID >= min(pitData$yearID)
                          & yearID <= max(pitData$yearID)
                          & IPouts >= 162*3)

  # compare stat with contemporaries
  ggplot(data=pitData, aes_string(x="factor(yearID)", y=stat)) +
    geom_boxplot(data=contemporaries, aes_string(x="factor(yearID)", y=stat)) +
    geom_point(data=contemporaries
               , aes_string(x="factor(yearID)", y=stat
                            , col="'oth'", shape="'oth'", size="'oth'")
               , position=position_jitter(width = 0.15), alpha=.6) +
    geom_point(aes(col="sel", shape="sel", size="sel")) +
    xlab("season") +
    ggtitle(paste(pitcher, " vs his contemporaries (", stat, ")", sep="")) +
    scale_color_manual(values=c("oth"="black", "sel"="blue")
                       , labels=c("oth"="contemporaries", "sel"=pitcher)
                       , name="") +
    scale_shape_manual(values=c("oth"=1, "sel"=19)
                       , labels=c("oth"="contemporaries", "sel"=pitcher)
                       , name="") +
    scale_size_manual(values=c("oth"=2, "sel"=5)
                      , labels=c("oth"="contemporaries", "sel"=pitcher)
                      , name="")
}
Let’s see changes and tweaks since the previous incarnation of the code.
First, we have added code for computing a couple of stats other than ERA, namely strikeouts per nine (K9) and walks per nine (W9). Just add formulas there for other stats you want to visualize (HR per nine, FIP, …).
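As a rough sketch of what adding your own stat could look like, here is home runs per nine computed on a tiny made-up data frame (the toy table and the HR9 name are just for illustration). Inside hofChart you would presumably also need to add HR to the left-hand side of the summaryBy formula, assuming your Pitching.csv has an HR column.

```r
# Toy season totals (made-up numbers, for illustration only)
toy = data.frame(playerID = c("a", "b"),
                 yearID   = c(2000, 2000),
                 HR       = c(20, 10),
                 IPouts   = c(600, 540))

# same pattern as ERA, K9 and W9: events times 27 outs, divided by outs recorded
toy$HR9 = toy$HR * 27 / toy$IPouts
```

Once such a column exists in the pitching data frame, its name can be passed as the stat argument just like "ERA" or "K9".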
Then we have changed some of the aes calls to aes_string inside the code for building the ggplot.
The difference between aes and aes_string is that the former requires expressions as arguments for the aesthetics, while the latter accepts strings. The advantage of using aes_string inside a function is that it makes it easy to pass aesthetics as arguments of the function.
Thus, in our case, we can use stat as a function argument, to which we pass (as a character string) the pitching stat we’d like to have visualized.
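To make the difference concrete, here is a minimal sketch on a made-up data frame (the toy data and the plotStat helper are hypothetical, not part of the function above). Note that recent ggplot2 releases mark aes_string as deprecated in favour of aes with the .data pronoun; the sketch sticks with aes_string to match the code in this post.

```r
library(ggplot2)

# Toy data (made-up numbers, for illustration only)
toy = data.frame(yearID = c(2000, 2001, 2002),
                 ERA    = c(3.1, 2.8, 4.0),
                 K9     = c(7.5, 8.2, 6.9))

# aes() takes unquoted expressions, so the column is fixed in the code
p1 = ggplot(toy, aes(x = factor(yearID), y = ERA)) + geom_point()

# aes_string() takes strings, so the column can come from a variable
plotStat = function(stat) {
  ggplot(toy, aes_string(x = "factor(yearID)", y = stat)) + geom_point()
}
p2 = plotStat("K9")
```

Printing p2 draws K9 by season; calling plotStat("ERA") draws ERA instead, without touching the plotting code.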
Finally, we have added code for generating a legend. This has been achieved by adding color, shape and size aesthetics to the geom_point calls. For a more detailed explanation, see the ggplot2 tips post.
Note that, by setting the same name and the same set of values for the three scale_..._manual calls, a single legend is added to the plot.
Now, simply call the hofChart function passing the pitcher and the stat of your choice and… voilà!
hofChart("Roger Clemens", "ERA")
hofChart("Roger Clemens", "K9")
hofChart("Roger Clemens", "W9")
The Rocket’s ERA was among the best 25% in several seasons, and he recorded a couple of exceptional seasons at the tail end of his career.
Except for his final season, he was among the elite pitchers at striking out opponents. On the other hand, despite posting good numbers in some years, he was not one of those pitchers who earned his money by avoiding walks.
His profile is definitely one of a power pitcher, a Hall-of-Fame-bound one if not for some extracurricular activities that tainted his legacy.
hofChart("Greg Maddux", "ERA")
hofChart("Greg Maddux", "K9")
hofChart("Greg Maddux", "W9")
And here’s another way to get strong credentials for a place in Cooperstown. The Mad Dog had an exceptional run of ridiculously low ERAs from 1992 to 2002. But while Clemens built his success by racking up strikeouts, Maddux was rarely better than average in that regard: instead, he was uncanny at avoiding bases on balls year after year until his retirement.