One of the interesting graphs used by Baseball Savant’s Illustrator tool is a Radial Chart, which uses a polar coordinate system to display the launch angles and exit velocities of balls put into play. This appears to be a useful way to visualize these launch variable measurements. I thought it would be a good exercise to show how radial charts can be constructed from Statcast data using the ggplot2 package. As we did in Chapter 3 of Analyzing Baseball Data with R, I’ll provide step-by-step instructions for creating this graph. Then I will use the graph to see the behavior of Zack Wheeler’s balls in play for his first start in the 2021 season.
We start with a relatively small group of Statcast data, say for a starting pitcher for a particular game. The only variables I need are the launch_speed and launch_angle variables for this dataset.
We are plotting the (launch angle, launch speed) values in polar coordinates, where the launch angle (converted from degrees to radians) is the angle from the x axis, and the launch speed, divided by 120, is the distance from the origin. Converting back to Cartesian coordinates, the points we graph follow the formula
mutate(Xcoord = launch_speed / 120 * cos(launch_angle * pi / 180), Ycoord = launch_speed / 120 * sin(launch_angle * pi / 180))
Here a point on the unit circle represents a batted ball hit 120 mph. (It is rare to observe a point outside of this circle.)
Also since we want to color the points by the type of batted ball, we define a new variable
mutate(BB_Type = ifelse(launch_angle > 50, "pop up", ifelse(launch_angle > 25, "fly ball", ifelse(launch_angle > 10, "line drive", "ground ball"))))
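To see the two mutate() steps working together, here is a minimal sketch applied to a tiny data frame. The four launch values are made up for illustration and are not real Statcast data; the name sc_ip is taken from the plotting code later in the post, and loading dplyr is assumed.

```r
library(dplyr)

# Made-up launch values for illustration only (not real Statcast data)
sc_ip <- data.frame(
  launch_speed = c(100, 60, 95, 80),  # exit velocity in mph
  launch_angle = c(30, -10, 15, 60)   # launch angle in degrees
)

sc_ip <- sc_ip %>%
  mutate(
    # Divide speed by 120 so a 120 mph ball lands on the unit circle,
    # and convert the angle from degrees to radians before cos() / sin()
    Xcoord = launch_speed / 120 * cos(launch_angle * pi / 180),
    Ycoord = launch_speed / 120 * sin(launch_angle * pi / 180),
    # Classify each ball by launch angle, using the same cutoffs as above
    BB_Type = ifelse(launch_angle > 50, "pop up",
              ifelse(launch_angle > 25, "fly ball",
              ifelse(launch_angle > 10, "line drive", "ground ball")))
  )

sc_ip
```

For the first row, a ball hit 100 mph at 30 degrees sits at distance 100 / 120 from the origin, so Ycoord is (100 / 120) * sin(30 degrees) = 100 / 240, and the 30 degree angle falls in the "fly ball" class.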
Drawing the Background Layers
The first step in the construction of the radial chart is to draw the filled semicircles that provide the background for the plotted points. We define a vector of angles from -π/2 to π/2 and define a data frame df_new2 containing the points of the semicircle. Two applications of the geom_polygon() function are used to draw the semicircles. Note that I am saving this ggplot2 code in the variable plot1; I’ll be adding additional layers to this code below.
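The post does not show how df_new2 is built, so here is one way it could be constructed. This is a sketch under assumptions: the column names x and y are chosen to match the aes(x, y) call in the plotting code, while the name theta and the choice of 200 points are mine.

```r
# Angles from -pi/2 to pi/2 trace the right-hand half of the unit circle
theta <- seq(-pi / 2, pi / 2, length.out = 200)

# Each angle gives one point on the circle of radius 1
df_new2 <- data.frame(
  x = cos(theta),
  y = sin(theta)
)

head(df_new2)
```

Because geom_polygon() joins the last point back to the first, the straight edge along the y axis is closed automatically and the shape is filled as a half disc. Dividing x and y by 2 inside aes() reuses the same points for the inner, 60 mph semicircle.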
(plot1 <- ggplot() + coord_equal() + geom_polygon(data = df_new2, aes(x, y), fill = "cadetblue1") + geom_polygon(data = df_new2, aes(x / 2, y / 2), fill = "cadetblue3"))
Adding Several Guide Lines
Next we add code for several guide lines that display the 45 degree and 0 degree angles on the graph.
(plot2 <- plot1 + geom_segment(aes(x = 0, y = 0, xend = cos(pi / 4), yend = sin(pi / 4)), color = "grey") + geom_segment(aes(x = 0, y = 0, xend = 1, yend = 0), color = "grey"))
Add the Data
Now we are ready to add the launch variable points layer to the graph. Note that by using the color = BB_Type argument in the aes() function, we are coloring the points by the batted ball type.
(plot3 <- plot2 + geom_point(data = sc_ip, aes(Xcoord, Ycoord, color = BB_Type), size = 4))
Adding Some Annotation Text
We next add a text layer to display some key values of the launch angle and exit velocity measurements. (The TeX() function used for the degree labels comes from the latex2exp package.)
(plot4 <- plot3 + annotate(geom = "text", x = 0.75, y = 0.75, label = TeX("45^o"), color = "red") + annotate(geom = "text", x = 1.05, y = 0, label = TeX("0^o"), color = "red") + annotate(geom = "text", x = 0, y = 1.05, label = TeX("90^o"), color = "red") + annotate(geom = "text", x = 0, y = -1.05, label = TeX("-90^o"), color = "red") + annotate(geom = "text", x = 0.57, y = 0.91, label = "120 mph", color = "blue") + annotate(geom = "text", x = 0.2, y = 0.45, label = "60 mph", color = "white"))
Cleaning Up and Adding an Image
To complete the graphic, we want to remove the gray background and the axes. Also, following the Baseball Savant display, I add a small graphics image of a ballplayer to the left of the rectangle. (This is facilitated using the png package to read in the png image, and the patchwork package to place the image on the graph.) I also add a title to the graph.
plot4 + theme(panel.grid = element_blank(), axis.title = element_blank(), axis.text = element_blank(), axis.ticks = element_blank(), panel.background = element_blank()) + xlim(0, 1.1) + ylim(-1.1, 1.1) + ggtitle("Radial Plot of Balls in Play") + theme(plot.title = element_text(colour = "red", size = 18, hjust = 0.5, vjust = 0.8, angle = 0)) + inset_element(p = player, left = 0.3 - 0.52, bottom = 0.425, right = 0.46 - 0.42, top = 0.565)
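The player object passed to inset_element() has to be read in before this step. Here is one way that might be done with the png package. The helper name read_player_image() is hypothetical, and the file name player.png is the one mentioned later in this post; returning NULL for a missing file is my own choice for the sketch.

```r
# Hypothetical helper: return the image as a native raster,
# or NULL when the file is not present
read_player_image <- function(path = "player.png") {
  if (!file.exists(path)) {
    return(NULL)
  }
  # native = TRUE returns a nativeRaster object, one of the
  # object types that patchwork::inset_element() accepts
  png::readPNG(path, native = TRUE)
}

player <- read_player_image("player.png")
```

With this approach, the inset_element() layer would be added only when player is not NULL, so the chart can still be drawn without the image.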
To illustrate the use of my radial_plot() function, I look at the launch variables for the balls in play for Zack Wheeler’s first start in the 2021 season. Wheeler throws many of his pitches low in the zone hoping to induce a good number of ground balls. Let’s see how he did in this particular game. Looking at this graph, note that all of the points are in the outer ring, which means that all batted balls were hit with launch speeds between 60 and 120 mph. I see quite a few ground balls; there were only two line drives, two fly balls, and a single pop up. Wheeler had a very strong performance in this game, only allowing one hit and striking out 10 in seven innings.
All 2021 Batted Balls
I wrote a second function that is designed to construct a radial chart for a large sample of batted balls. Here is a radial plot of the launch variables for all batted balls for the first 9 games of the 2021 season. Since I am graphing over 5000 points, I use a small plotting point (size = 0.5) and set alpha = 0.5 to make the points somewhat transparent. Also I have colored the points using the expected batting average (xBA). Note that a large portion of these batted balls have launch angles between 0 and 45 degrees and exit velocities between 60 and 120 mph. The upper-right red region in the plot corresponds to the home runs with high exit velocities, and the lower red region corresponds to the softer hit balls with a good launch angle.
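The point layer for a large sample might look like the following sketch. It is not the second function itself. The data are simulated purely so the code runs (uniform random values, not real Statcast data), the data frame name sc_all and the blue-to-red color scale are my assumptions, and the xBA values are assumed to sit in the estimated_ba_using_speedangle column as they do in Statcast downloads.

```r
library(ggplot2)

# Simulated values for illustration only (not real Statcast data)
set.seed(1)
n <- 5000
sc_all <- data.frame(
  launch_speed = runif(n, 40, 115),          # mph
  launch_angle = runif(n, -60, 70),          # degrees
  estimated_ba_using_speedangle = runif(n)   # stand-in for xBA
)

# Same polar-to-Cartesian conversion used for the single-game chart
sc_all$Xcoord <- sc_all$launch_speed / 120 * cos(sc_all$launch_angle * pi / 180)
sc_all$Ycoord <- sc_all$launch_speed / 120 * sin(sc_all$launch_angle * pi / 180)

# Small, semi-transparent points keep 5000 observations readable
p_all <- ggplot(sc_all, aes(Xcoord, Ycoord,
                            color = estimated_ba_using_speedangle)) +
  coord_equal() +
  geom_point(size = 0.5, alpha = 0.5) +
  scale_color_gradient(low = "blue", high = "red", name = "xBA")

p_all
```

The background semicircles, guide lines, and annotations from the earlier steps can be layered in the same way as before.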
- This example illustrates how ggplot2 can be used to create new types of graphs. Hopefully the step-by-step approach is helpful for understanding the layered approach in the ggplot2 graphics system.
- The R function radial_plot() is available on my Github gist site. The function reads in a graphics file player.png that is stored in the same folder as the function. (The function will still work if the graphics file is missing.) The only inputs to the function are a Statcast data frame with the variables launch_speed and launch_angle, and an optional title to add to the display. I encourage the reader to try it out or to improve the display in different ways.
- If you look at Baseball Savant and different blog sites, you’ll see many applications of radial charts. One could look at a specific hitter or a specific pitcher. One could color the points by the expected wOBA value of the outcome. Many of the Baseball Savant radial charts show additional regions such as the barrel region.
This is very interesting, but I am having a bit of trouble replicating it. Specifically with this part: We define a vector of angles from – \pi /2 to \pi / 2 and define a data frame df_new2 containing the points of the semi-circle.
There doesn’t seem to be an example of what df_new2 <- ??? should be. So I get a df_new2 not found error.
Any advice? I'm new to R and learning it through your and Max's book.
Appreciate any assistance.
Even, thanks for your interest. I’d suggest looking at the code for that particular Shiny app on Github. See https://github.com/bayesball/ShinyBaseball/blob/main/inst/shiny-examples/RadialChart/app.R
By golly, I got it to work. You have no idea how much I have learned from your book Analyzing Baseball Data with R and this blog, Jim. After 44 years of teaching (I’m in year 18 & looking to pivot into a career in data analysis) you deserve every bit of your retirement. But if you ever teach a class online, please let me know. I’ll be at the front of the line to sign up.