In my last post, we found that the length of a baseball game is strongly related to the number of pitches, and that each pitch adds, on average, 36 seconds to the game. Here we use PitchFX data to get a better understanding of the times between pitches in a single game. There is a nice package, pitchRx, authored by Carson Sievert, that makes it easy to download PitchFX data and explore data on every pitch.
Here we load the package and use the scrapeFX function to download the pitches for all games played on September 5, 2013. The plyr::join function then merges the pitch-level and at-bat-level tables into a single data frame, matching rows on the num and url variables:
library(pitchRx)
dat <- scrapeFX(start = "2013-09-05", end = "2013-09-05")
pitches <- plyr::join(dat$pitch, dat$atbat, by = c("num", "url"), type = "inner")
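The PitchFX system records the time of each pitch, which is stored in the variable sv_id. Judging from the character positions used below, sv_id is a timestamp string of the form YYMMDD_HHMMSS. Using the substr function, we pull out the hours, minutes, and seconds and create a new variable time equal to the number of seconds past midnight.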
pitches$hours <- as.numeric(substr(pitches$sv_id, 8, 9))
pitches$minutes <- as.numeric(substr(pitches$sv_id, 10, 11))
pitches$seconds <- as.numeric(substr(pitches$sv_id, 12, 13))
pitches$time <- with(pitches, 3600 * hours + 60 * minutes + seconds)
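The same conversion can be wrapped in a small function that returns NA for malformed stamps instead of a silently wrong number. This is only a sketch: sv_to_seconds is a hypothetical helper name, not part of pitchRx, and it assumes the YYMMDD_HHMMSS layout implied by the substr positions above.

```r
# Hypothetical helper (not part of pitchRx): convert sv_id stamps to
# seconds past midnight, assuming the layout "YYMMDD_HHMMSS".
sv_to_seconds <- function(sv_id) {
  sv_id <- as.character(sv_id)
  # Only stamps that match the assumed layout are converted; others become NA.
  ok <- grepl("^[0-9]{6}_[0-9]{6}$", sv_id)
  hms <- ifelse(ok, substr(sv_id, 8, 13), NA_character_)
  hours <- as.numeric(substr(hms, 1, 2))
  minutes <- as.numeric(substr(hms, 3, 4))
  seconds <- as.numeric(substr(hms, 5, 6))
  3600 * hours + 60 * minutes + seconds
}

# Made-up stamps for illustration; the empty string yields NA.
sv_to_seconds(c("130905_191522", "130905_191547", ""))
```

With this helper, the four assignments above would reduce to pitches$time <- sv_to_seconds(pitches$sv_id).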
Let’s look at the pitch times of the game played between Arizona and San Francisco on September 5, 2013. (See the box score for this game on Baseball-Reference.) By extracting a portion of the url variable, we create a new variable game.id and use the subset function to extract the pitches for this particular game. We then sort the pitches by time:
pitches$game.id <- substr(pitches$url, 66, 95)
pitches1 <- subset(pitches, game.id == "gid_2013_09_05_arimlb_sfnmlb_1")
pitches1 <- pitches1[order(pitches1$time), ]
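The fixed positions 66 to 95 work only as long as every URL has the same prefix length and every game id has the same number of characters. A pattern match is one possible alternative. The sketch below uses a hypothetical helper, extract_game_id, and an illustrative URL whose exact form is an assumption; the only thing it relies on is that the id starts with gid_ and runs to the next slash.

```r
# Hypothetical helper: pull the "gid_..." path segment out of each URL,
# returning NA where no such segment is found.
extract_game_id <- function(url) {
  url <- as.character(url)
  m <- regexpr("gid_[^/]+", url)
  hit <- !is.na(m) & m > 0
  out <- rep(NA_character_, length(url))
  out[hit] <- regmatches(url, m)
  out
}

# Illustrative URL (assumed shape, for demonstration only).
example_url <- paste0(
  "http://gd2.mlb.com/components/game/mlb/year_2013/month_09/day_05/",
  "gid_2013_09_05_arimlb_sfnmlb_1/inning/inning_all.xml"
)
extract_game_id(example_url)
```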
Since we are interested in the times between pitches, we create a new data frame time.data containing three variables: Time, the time in seconds between consecutive pitches; Index, the number of the pitch; and Inning, the inning when the pitch occurred.
time.data <- data.frame(Time = diff(pitches1$time),
                        Index = 1:(length(pitches1$time) - 1),
                        Inning = pitches1$inning[-1])
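Because diff returns one fewer value than its input, each gap is paired with the inning of the later pitch, which is why the first inning value is dropped. A toy example with four made-up pitch times shows the bookkeeping (the numbers are invented for illustration, not taken from the game):

```r
# Four made-up pitch times (seconds past midnight) and their innings.
toy <- data.frame(time = c(69322, 69347, 69365, 69540),
                  inning = c(1, 1, 1, 2))

# Three gaps for four pitches; each gap carries the inning of the
# pitch that ends it.
toy.gaps <- data.frame(Time = diff(toy$time),
                       Index = seq_len(nrow(toy) - 1),
                       Inning = toy$inning[-1])
toy.gaps
```

Here the last gap, which spans the change from inning 1 to inning 2, is labeled with inning 2.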
The ggplot2 package is used to graph the time between pitches against the pitch number. (We give many illustrations of the ggplot2 package in our book.) In the graph, the plotting symbol is the inning number, and we add horizontal lines at 1, 2, and 3 minutes to make the vertical scale easier to read.
library(ggplot2)
ggplot(time.data, aes(Index, Time, label = Inning)) +
  geom_text(size = 6, color = "blue") +
  # reference lines at 1, 2, and 3 minutes
  geom_hline(yintercept = c(60, 120, 180)) +
  # annotate() draws each label once rather than once per data row
  annotate("text", x = 25, y = c(65, 125, 185),
           label = c("1 MINUTE", "2 MINUTES", "3 MINUTES"), size = 8) +
  labs(title = "Times Between Pitches in a Baseball Game") +
  ylab("Time (Seconds)") +
  theme(plot.title = element_text(size = rel(2)),
        axis.title = element_text(size = rel(2)),
        axis.text = element_text(size = rel(2)))
What do we learn from this graph?
- In a typical inning, the time between pitches is between 15 and 30 seconds.
- It is pretty common for the time between pitches to fall between 30 and 60 seconds. This could be due to balls in play, a pickoff move, time outs, and other factors. It would be interesting to relate the times with the actual plays as recorded in Baseball-Reference.
- There are a number of significant breaks of between 2 1/2 and 3 1/2 minutes. Many of these are simply the breaks between half-innings; for example, the one 1 symbol, the two 2's, and the two 3's are just the inning breaks. Some of the long breaks that one sees toward the end of the game likely correspond to pitching changes.
- It is pretty clear that the game slows down towards the end, judging by the large number of long breaks in the 8th and 9th innings.
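One way to put numbers on these visual impressions is to bin the gaps and count them. The sketch below uses a hypothetical helper, classify_gaps, with cutoffs at 30, 60, and 150 seconds. Those cutoffs and labels are my own choice, loosely following the ranges in the list above, and the example gaps are made up.

```r
# Hypothetical helper: assign each gap (in seconds) to a rough category.
# Intervals are closed on the left, so a 30-second gap falls in "30-60 s".
classify_gaps <- function(gap_seconds) {
  cut(gap_seconds,
      breaks = c(0, 30, 60, 150, Inf),
      labels = c("under 30 s", "30-60 s", "1-2.5 min", "over 2.5 min"),
      right = FALSE)
}

# Made-up gaps for illustration.
example.gaps <- c(18, 22, 45, 27, 190, 16, 95)
table(classify_gaps(example.gaps))
```

Applied to the real data, table(classify_gaps(time.data$Time), time.data$Inning) would cross-tabulate the categories by inning, which is one way to check whether the long breaks really do pile up in the 8th and 9th.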
This is an illustration of the time breakdown for a typical MLB game in 2013 that lasted 3 hours and 11 minutes. By looking at this time data over many games, I think one would get a better understanding of the time patterns of long games, and that might help MLB devise ways to make the games shorter.
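As a starting point for that kind of multi-game comparison, the gaps can be computed within each game and then summarized. This is a rough sketch, not a finished analysis: gap_summary_by_game is a hypothetical helper, it assumes a data frame with the game.id and time columns built earlier, and the 150-second threshold for a "long break" is an arbitrary choice. The toy data are made up.

```r
# Hypothetical helper: one row per game with the number of gaps, the
# median gap, and a count of long breaks (150 seconds or more).
gap_summary_by_game <- function(pitch_df, long_break = 150) {
  by_game <- split(pitch_df$time, pitch_df$game.id)
  rows <- lapply(names(by_game), function(id) {
    gaps <- diff(sort(by_game[[id]]))
    data.frame(game.id = id,
               n_gaps = length(gaps),
               median_gap = median(gaps),
               long_breaks = sum(gaps >= long_break))
  })
  do.call(rbind, rows)
}

# Toy data: two made-up games.
toy.games <- data.frame(game.id = c("A", "A", "A", "A", "B", "B", "B"),
                        time = c(100, 120, 150, 400, 50, 70, 95))
gap_summary_by_game(toy.games)
```

Calling gap_summary_by_game(pitches) on the full day's data frame would give one row per game played on September 5, 2013.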