Shiny App to Brush Batting Averages

Introduction

I have always enjoyed graphical displays that let the reader interact with the graph. Reading Chapter 7 of Hadley Wickham’s Mastering Shiny text, I learned that Shiny (R’s package for developing web apps) can respond to pointer events such as clicking and brushing. Brushing is a dynamic graphical method in which the observer selects a region of a plot, typically by dragging a rectangle over it, and the display updates immediately. These dynamic graphical methods have been popular for a number of years; here is a well-known paper by Becker and Cleveland from 1987 on brushing.

Here we illustrate this special interactivity feature in Shiny in the context of baseball. We know that a player’s batting average on balls in play depends on the location of the pitch. We start with a scatterplot of the zone locations for all balls put in play where the color of the point corresponds to the in-play outcome (Hit or Out). In this post, I describe writing a short Shiny app that allows one to display a batter’s batting average for brushed regions of the zone. Here’s a snapshot of this app where the brushed region is the light blue region in the lower-right part of the graph. The output indicates that Bryce Harper (a left-handed hitter) is pretty good (BABIP = 0.515) in hitting balls low and inside.

If you are interested, here is a filled contour graph of a smoothed version of Harper’s BABIP for the same season using the CalledStrike package. One issue with this contour graph is there is a bit of overfitting — for example, the contour graph suggests that Harper’s BABIP is over 0.500 in the upper-outside (that is, upper-left) region of the zone, but looking at the scatterplot there is little data to support this BABIP estimate.

Take the App for a Ride

Here’s a live version of my brushed batting average Shiny app for you to try. Enter the name of a specific hitter and you’ll see a scatterplot of the locations of all balls put in play. Click and drag over the display to select a rectangle, and the app will display the BABIP for the points inside it. By dragging the rectangle across the scatterplot, you’ll see interesting variation in the BABIP values.
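
To make the calculation behind the brush concrete, here is a minimal base R sketch of the BABIP for points inside a rectangle. The toy data frame and the names plate_x, plate_z, H and babip_in_rect() are my assumptions for illustration, not the columns or code of the actual app.

```r
# Toy balls-in-play data; the column names (plate_x, plate_z, H) are assumptions.
bip <- data.frame(
  plate_x = c(-0.8, -0.3, 0.1, 0.4, 0.6, 0.9),
  plate_z = c( 3.2,  2.9, 2.0, 1.8, 1.6, 1.9),
  H       = c(   0,    1,   0,   1,   1,   0)   # 1 = hit, 0 = out
)

# Hypothetical helper: BABIP for the points inside a rectangle,
# which is the set of points a rectangular brush selects.
babip_in_rect <- function(d, xmin, xmax, ymin, ymax) {
  inside <- d$plate_x >= xmin & d$plate_x <= xmax &
            d$plate_z >= ymin & d$plate_z <= ymax
  data.frame(BIP   = sum(inside),
             H     = sum(d$H[inside]),
             BABIP = if (any(inside)) mean(d$H[inside]) else NA_real_)
}

low_right <- babip_in_rect(bip, xmin = 0, xmax = 1, ymin = 1.5, ymax = 2.5)
print(low_right)
```

Moving the rectangle only changes the four limits, so dragging the brush amounts to recomputing this summary for a new set of points.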

Basic structure of a Shiny app

This particular Shiny app is relatively short, so I can describe it in some detail. All of the code for this Shiny app is contained in a single R script file called app.R that you can find on my GitHub Gist site. There are two basic components in any Shiny app file.

  • The user interface object ui sets up how the user interacts with the app and how the output will be displayed.
  • The function server() performs the computation and constructs the graphs.

I’ll describe the particulars of each component for this application.

The header

At the beginning of the file app.R, the few lines displayed below load the necessary R packages and import the data. One always needs the shiny package, and the ggplot2 and dplyr packages are needed for the graphing and data manipulation. Note that I’m reading in the Statcast balls in play data from the 2019 season from my GitHub site.

library(shiny)
library(ggplot2)
library(dplyr)
sc2019_ip <- read.table("https://raw.githubusercontent.com/bayesball/CalledStrike/master/data/sc2019_ip.txt", header = TRUE)

The user interface

The script below contains the entire user interface definition. Here is a description of the individual lines.

  • The fluidPage() function sets up the page layout for the app.
  • The theme line describes a special theme (font size, colors, etc.) currently available using the bslib package. You don’t need this line if you are happy with the default appearance of the app.
  • The h2() line (standard html code) displays an app title.
  • The textInput() function allows one to enter the player name.
  • The plotOutput() function sets up a graphics window. Note the inclusion of the brush argument which enables the brushing behavior for the scatterplot.
  • The tableOutput() function sets up a window for displaying a table.

ui <- fluidPage(
      theme = bslib::bs_theme(version = 4,
                        bootswatch = "darkly"),
      h2("Brushing In-Play Batting Averages"),
      textInput("name", "Batter Name (example: Bryce Harper):", value = ""),
      plotOutput("plot", brush = "plot_brush"),
      tableOutput("data")
)

The server function

To make this portion of the script easier to read, I’ll just describe the basic structure of the server() function. (See the GitHub gist script for the entire function.) Notice there are two functions within server(): the function renderPlot() contains the code for constructing the graph and the function renderTable() displays a data table of variables from the brushed region. Note that the outputs of these functions are named output$plot and output$data. The names “plot” and “data” correspond to the labels used in the plotOutput() and tableOutput() functions in the user interface. I’ve displayed one line using brushedPoints(); this function extracts the portion of the data frame inside the brushed rectangle.

server <- function(input, output, session) {
  output$plot <- renderPlot({
    # <ggplot2 code for producing the graph>
  }, res = 96)
  output$data <- renderTable({
    # rows for the selected batter that fall inside the brushed rectangle
    sc1 <- brushedPoints(filter(sc2019_ip,
                                player_name == input$name),
                         input$plot_brush)
    # <code to display a data table>
  }, digits = 3, width = '75%')
}
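
For readers who want something that runs without the Statcast file, here is a small self-contained sketch with the same structure. It is not the code from my gist: the simulated data, the column names plate_x, plate_z and H, and the helper summarize_brushed() are all assumptions made for illustration.

```r
library(shiny)
library(ggplot2)

# Simulated balls in play (NOT Statcast data); column names are assumptions.
set.seed(1)
toy_bip <- data.frame(
  plate_x = rnorm(200, 0, 0.7),
  plate_z = rnorm(200, 2.5, 0.7),
  H       = rbinom(200, 1, 0.3)   # 1 = hit, 0 = out
)

# Hypothetical helper: collapse the brushed rows to a one-row summary.
summarize_brushed <- function(d) {
  data.frame(BIP   = nrow(d),
             H     = sum(d$H),
             BABIP = if (nrow(d) > 0) mean(d$H) else NA_real_)
}

ui <- fluidPage(
  plotOutput("plot", brush = "plot_brush"),
  tableOutput("data")
)

server <- function(input, output, session) {
  output$plot <- renderPlot({
    ggplot(toy_bip, aes(plate_x, plate_z, color = factor(H))) +
      geom_point() +
      labs(color = "Hit")
  }, res = 96)
  output$data <- renderTable({
    req(input$plot_brush)   # show nothing until a region is brushed
    summarize_brushed(brushedPoints(toy_bip, input$plot_brush))
  }, digits = 3)
}

# Launch only in an interactive session so the script can also be sourced.
if (interactive()) shinyApp(ui = ui, server = server)
```

Keeping the summary in its own small function means the BABIP calculation can be checked on any data frame without starting the app.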

The shinyApp() function

At the bottom of the app.R file, one adds the shinyApp() function, which creates the Shiny app from the user interface and the server function.

shinyApp(ui = ui, server = server)

Running the app

Once you have the file app.R in your current working directory, you run the Shiny app with the runApp() function:

runApp()

What Do We Learn About Bryce Harper’s BABIP?

I played with the app for Bryce Harper. By adjusting and moving the brushing rectangle, I get some interesting insights.

  • Overall (by selecting all points) Harper had a BABIP of 149 / 395 = 0.377.
  • Selecting a region of points low and outside, Harper was 54 / 159 = 0.340
  • Selecting a region of points low and inside, Harper was 62 / 138 = 0.449
  • I did find one region where Harper was 51 / 101 = 0.505
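
As a quick check on the arithmetic, the ratios in the list above can be reproduced from the hit and ball-in-play counts. The region labels in this sketch are my own shorthand.

```r
# Hits and balls in play for each brushed region, copied from the list above.
harper <- data.frame(
  region = c("all points", "low and outside", "low and inside", "best region found"),
  H      = c(149, 54, 62, 51),
  BIP    = c(395, 159, 138, 101)
)

# BABIP for each region, rounded to three decimals as in the app's table.
harper$BABIP <- round(harper$H / harper$BIP, 3)
print(harper)
```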

Overall, it is pretty clear that Harper prefers balls low and inside, followed by balls high and outside.

Miscellaneous Comments

  • The goal here was to present a simple Shiny app that is potentially useful in baseball work. Once one understands the Shiny fundamentals, I encourage the reader to use these interactive graphic features for other types of exploration. One could use this type of app to explore swing, contact, and home run tendencies.
  • Brushing is used as a selection tool in Baseball Savant. For example, in the Statcast Pitch Highlighter, one can see all of the pitches thrown to a given batter and choose a subset of pitches by a brushing tool. (Thanks to Tom Tango who showed me what was available in Baseball Savant.)
  • As I mentioned earlier, the entire Shiny app is available on my GitHub Gist site. The app (and the data) are also available as the function BrushInPlay() in the CalledStrike package.
  • There are many ways to use brushing in a Shiny app. For example, one can have a linked table to show the rows of the data frame that are being selected, and/or another scatterplot with new variables where the brushed points are shown with a new color. Here’s another of my Shiny creations displayed below. I start with the 2020 FanGraphs batting leader board and choose four variables to explore. Here I am exploring the relationship of SLG and OBP in one scatterplot, and walk and strikeout rates in a second scatterplot. I can brush over points in either scatterplot. The brushed cases are colored red in the other scatterplot and displayed in the table on the left. We see that the players with the six highest BB rates had varying degrees of success on the basis of their OBP and SLG values.
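
To give a feel for linked brushing, here is a rough sketch that is not the FanGraphs app itself. It uses R’s built-in mtcars data as a stand-in, brushes in only one direction (from the first plot to the second), and the output names plot1, plot2, brush1 and selected are my own choices. It relies on the allRows = TRUE option of brushedPoints(), which keeps every row and adds a logical column selected_.

```r
library(shiny)
library(ggplot2)

ui <- fluidPage(
  fluidRow(
    column(6, plotOutput("plot1", brush = "brush1")),
    column(6, plotOutput("plot2"))
  ),
  tableOutput("selected")
)

server <- function(input, output, session) {
  # allRows = TRUE returns the whole data frame with a logical column
  # selected_ that marks the rows inside the brush on plot1.
  flagged <- reactive(brushedPoints(mtcars, input$brush1, allRows = TRUE))

  # The plot that is brushed.
  output$plot1 <- renderPlot({
    ggplot(mtcars, aes(wt, mpg)) + geom_point()
  }, res = 96)

  # A second plot of different variables; brushed cases are drawn in red.
  output$plot2 <- renderPlot({
    ggplot(flagged(), aes(hp, qsec, color = selected_)) +
      geom_point() +
      scale_color_manual(values = c("FALSE" = "grey50", "TRUE" = "red"))
  }, res = 96)

  # Linked table listing only the brushed rows.
  output$selected <- renderTable({
    subset(flagged(), selected_, select = c(wt, mpg, hp, qsec))
  }, rownames = TRUE)
}

# Launch only in an interactive session so the script can also be sourced.
if (interactive()) shinyApp(ui, server)
```

Making the brushing work in both directions would take a brush argument on the second plot as well, plus some logic to decide which selection to show.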
