Over the years I have illustrated the use of several R packages to facilitate the collection and exploration of baseball data. Two notable ones are the Lahman package for obtaining season-to-season data for all of MLB’s history, and the baseballr package for scraping other data sources such as the Statcast data from Baseball Savant. Thanks to a tweet from Tom Tango, I recently became aware of a new package, sportyR, written by Ross Drucker, that uses ggplot2 syntax to draw playing surfaces for a number of sports including football, basketball, soccer and baseball. I have already discussed constructing spray charts in several older posts. Here I will illustrate using the sportyR package to enhance these baseball spray charts.
The sportyR package is very easy to use. Once you have installed the package from CRAN, the following script loads the package and constructs a regulation MLB playing surface.
library(sportyR)
geom_baseball(league = "MLB")
Before one can construct a spray chart, one needs to understand the coordinate system used by this package. The unit is feet and the home plate location is the origin (0, 0). I’ve labeled the coordinates for the four bases below. For example, the coordinates for 2nd base are (0, 126) which indicates that 2nd base is 126 feet from home plate.
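As a rough geometric check on this coordinate system, here is a minimal sketch that computes where the bases would fall under a simplifying assumption: the bases sit at the corners of a 90-foot square rotated 45 degrees, with home plate at the origin and first base on the positive x side. The data frame name `bases` is only for illustration, and the package's own drawing may place the bases slightly differently.

```r
# Assumption: 90-foot basepaths forming a square rotated 45 degrees,
# home plate at (0, 0), first base on the positive-x side.
base_path <- 90
d <- base_path / sqrt(2)  # x (or y) offset of first and third base

bases <- data.frame(
  base = c("home", "first", "second", "third"),
  x = c(0, d, 0, -d),
  y = c(0, d, 2 * d, d)
)

# Coordinates in feet, rounded for display
print(data.frame(base = bases$base, round(bases[, c("x", "y")], 1)))
```

Under this idealized geometry, second base lands a little over 127 feet from home plate, within a couple of feet of the value labeled above.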
Plotting Statcast Batted Ball Data
I’d like to use this playing field as a background for a spray chart of a sample of batted balls. I’ve talked about constructing spray charts using R in several posts — this post provides an introductory discussion and this post describes the construction of an improved spray chart where one shows the “pull” side.
For balls put into play, Statcast provides the location variables hc_x and hc_y, but some reexpression is needed:
mutate(location_x = hc_x - 125.42, location_y = 198.27 - hc_y)
This reexpression flips the points around and makes home plate the origin. These (location_x, location_y) measurements are not in feet, so an additional scaling value is needed to convert them to feet. After some trial and error, a scaling value of 2.5 seems to work, providing reasonable-looking spray charts.
mutate(location_x = 2.5 * (hc_x - 125.42), location_y = 2.5 * (198.27 - hc_y))
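The same transformation can be sketched as a small base R function, which makes it easy to check a couple of points by hand. The function name `statcast_to_feet` and its `scale` argument are hypothetical names chosen for this illustration; the constants 125.42, 198.27 and 2.5 are the ones used above, and the 2.5 factor is the trial-and-error value rather than an official conversion.

```r
# Convert Statcast hit coordinates (hc_x, hc_y) to approximate feet,
# with home plate at the origin and the y axis pointing to center field.
# `scale = 2.5` is the trial-and-error scaling value from the text.
statcast_to_feet <- function(hc_x, hc_y, scale = 2.5) {
  data.frame(
    location_x = scale * (hc_x - 125.42),
    location_y = scale * (198.27 - hc_y)
  )
}

# Two hand-picked inputs: the home plate location itself, and a point
# 40 units to the right of and 160 units above it in Statcast coordinates.
example <- statcast_to_feet(hc_x = c(125.42, 165.42),
                            hc_y = c(198.27, 38.27))
print(example)
```

The first point maps to the origin and the second to roughly (100, 400), that is, about 100 feet toward the first-base side and 400 feet out from home plate.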
Once one has figured out the coordinate system, it is straightforward to plot points on this playing field background. The geom_baseball() function creates a ggplot2 object, and one can add other plot or text layers with the + operator in ggplot2. As an example, I collected a data frame ff containing information on the ground balls hit by Freddie Freeman during the 2019 season. The input variables are the batted ball location variables location_x and location_y and a character variable H indicating whether the ball in play (BIP) resulted in a hit or an out. Since we are plotting points on a dark background, one needs to choose plotting colors that are easy to see. Using the scale_colour_manual() function, I let “yellow” correspond to an out and “red” correspond to a hit. I use ggtitle() to add a descriptive title.
The complete code for constructing this spray chart is shown below. Freddie Freeman is a left-handed hitter and we see from the graph that most of his ground balls are hit to the pull side. I would think that most teams would employ a defensive infield shift when Freeman is at bat. (Checking with Statcast, I found that 68% of these Freeman ground balls during the 2019 season were hit against an infield shift, 15% were hit against a “strategic” shift, and only 17% were hit against a standard infield fielding alignment.)
geom_baseball(league = "MLB") +
  geom_point(data = ff, aes(location_x, location_y, color = H)) +
  scale_colour_manual(values = c("yellow", "red")) +
  ggtitle("Freddie Freeman Ground Balls - 2019")
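For readers without the ff data frame, here is a self-contained sketch of the same layering idea. The data frame `toy` is made-up illustrative data, not Freeman's actual batted balls, and the labels "Out" and "Hit" are assumed values for H. One design choice differs from the code above: the colors are passed as a named vector, so the color-to-outcome mapping does not depend on the ordering of the values of H.

```r
library(ggplot2)
library(sportyR)

# Hypothetical toy data for illustration only (locations in feet,
# home plate at the origin); NOT real Statcast measurements.
toy <- data.frame(
  location_x = c(40, 75, -20, 55),
  location_y = c(95, 80, 110, 60),
  H = c("Out", "Hit", "Out", "Out")
)

# Draw the MLB field, then layer the points, colors and title on top.
p <- geom_baseball(league = "MLB") +
  geom_point(data = toy, aes(location_x, location_y, color = H)) +
  scale_colour_manual(values = c(Out = "yellow", Hit = "red")) +
  ggtitle("Toy Ground Balls (illustrative data)")

print(p)
```

With a named vector, an out is drawn in yellow and a hit in red regardless of how the labels sort alphabetically.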
- The sportyR package provides visuals of the playing surfaces for many sports such as basketball, football, soccer and hockey. Location data is either available or will shortly be available for these other sports, so these backgrounds will be useful for constructing location graphs.
- I have several Shiny apps, including the SprayCompare() function, for constructing spray charts in my ShinyBaseball package. I have already revised these apps to use the playing field visual from the sportyR package. For example, here’s a snapshot of the use of the SprayCompare() function to compare the fly ball locations of Mike Trout and Rhys Hoskins for the 2019 season. It appears that Trout is more likely than Hoskins to hit fly balls to the opposite field.