A Shiny App to Explore Home Run Rates
As many baseball fans know, home run hitting hit a peak during the 2019 season when 1.39 (on average) home runs were hit by each team per game. Home run hitting in the 2021 season dropped to 1.22 home runs per team/game, and dropped again in 2022 to 1.07 HR per team/game. (I am reading from this Baseball Reference page.) What is going to happen in the current 2023 season?
There are a number of variables that influence home run hitting but I think one of the most significant is the ball construction. The ball was relatively lively in the 2019 season with small drag coefficients (large carry). In contrast, the ball was pretty dead in 2022 with high drag coefficients (small carry).
I have been collecting home run data (launch variables and occurrence of home runs for all balls in play) daily during the current 2023 season. I wanted to construct an easy-to-use Shiny app that would allow one to explore home run rates quickly over the (launch angle, exit velocity) space. In addition, to give these home run rates some meaning, I wanted to compare these rates with predicted rates from the carry properties of baseballs from previous seasons.
Here’s a description of a Shiny app, HomeRunLaunchVariables(), which is currently live at https://bayesball.shinyapps.io/HomeRunLaunchVariables/. Using this app, we’ll see how April home run hitting in 2023 compares to the previous three full seasons.
Using the App
When you launch this app, you see the following screen. You choose a range of dates of interest; here I choose the April 2023 dates through the games on Friday night (April 28, 2023). (You can choose any dates in the Statcast era from 2015 through 2023.) By using the sliders, you choose an interval of values of launch angle and exit velocity of interest. Here I choose launch angle values between 14 and 45 degrees and exit velocity values between 94 and 115 mph. The app displays a scatterplot of the (launch angle, exit velocity) pairs for all balls in play during the time period. The points are colored by HR (yes or no).
I am interested in the home run rate, the fraction of balls in play in a region of the launch variable space that are home runs. To brush the scatterplot, I move the cursor over the scatterplot and select a rectangular region. In this example I select the region where the launch angle is between 22 and 34 degrees and the exit velocity is in (98, 109.6) mph. The Shiny output tells me that during this April period, there were 1196 balls in play in this region with 596 home runs for a HR rate of 596 / 1196 = 0.498.
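The brushing step can be sketched in a few lines of Shiny. This is a minimal sketch under stated assumptions, not the code in the actual app: the data frame `bip`, its column names, and the helper `hr_rate_summary()` are hypothetical, and the balls in play are simulated. It uses Shiny's `brushedPoints()` to keep the rows inside the selected rectangle and then reports the count of balls in play, the count of home runs, and the rate.

```r
library(shiny)

# Hypothetical stand-in for the balls-in-play data (simulated values);
# the real app's data source and column names are not assumed here.
set.seed(1)
bip <- data.frame(
  launch_angle = runif(500, 14, 45),
  launch_speed = runif(500, 94, 115)
)
bip$HR <- rbinom(500, 1, plogis((bip$launch_speed - 103) / 3))

# Summarise a set of balls in play: count, home runs, and HR rate.
hr_rate_summary <- function(d) {
  n <- nrow(d)
  hr <- sum(d$HR)
  data.frame(BIP = n, HR = hr,
             Rate = if (n > 0) round(hr / n, 3) else NA_real_)
}

ui <- fluidPage(
  plotOutput("scatter", brush = brushOpts(id = "plot_brush")),
  tableOutput("rate")
)

server <- function(input, output, session) {
  output$scatter <- renderPlot({
    plot(bip$launch_angle, bip$launch_speed,
         col = ifelse(bip$HR == 1, "red", "grey50"), pch = 19,
         xlab = "Launch angle (degrees)", ylab = "Exit velocity (mph)")
  })
  output$rate <- renderTable({
    req(input$plot_brush)  # wait until a region has been brushed
    # brushedPoints() keeps only the rows inside the brushed rectangle
    inside <- brushedPoints(bip, input$plot_brush,
                            xvar = "launch_angle", yvar = "launch_speed")
    hr_rate_summary(inside)
  })
}

app <- shinyApp(ui, server)  # launch interactively with runApp(app)
```

Keeping the rate calculation in a small function separate from the server logic makes it easy to check outside of Shiny; for example, `hr_rate_summary()` applied to 1196 balls in play with 596 home runs gives a rate of 0.498.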
Compare with Previous Seasons
In the “Select Month” input section, I select the month “April”. This indicates that I will be comparing the current HR rate with rates predicted in April in each of the previous three complete seasons (2019, 2021, 2022). (Due to the temperature impact on home run hitting, it is fair to compare the current home run rates with predicted rates from the same month of the season.) You can select other months to see the impact of temperature on the home run rates.
The comparisons are displayed at the bottom of the display (also see below). I observe a HR rate of 0.498 in this launch variable region in 2023. I have previously fit models predicting the HR probability from the launch variables from April data from the 2019, 2021 and 2022 seasons. Using these fitted models, I predict the rate of home runs in April 2023. The Z Score is the Pearson residual comparing the observed and predicted rates. A value of the Z Score outside of (-2, 2) indicates a significant difference between observed and predicted.
| Month/Season | Observed | Predicted | Z_Score |
|---|---|---|---|
| April 2019 | 0.498 | 0.597 | -4.008 |
| April 2021 | 0.498 | 0.495 | 0.156 |
| April 2022 | 0.498 | 0.426 | 3.853 |
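As a rough illustration of what a Pearson residual measures, here is one textbook form for a count of successes out of n trials: the difference between the observed and predicted rates divided by the binomial standard error of the predicted rate. This is an assumption about the general form only. The function name `pearson_z()` and the inputs are hypothetical, and the app's own variance calculation may differ, so this sketch is not expected to reproduce the Z scores in the table.

```r
# Pearson-type residual comparing an observed rate with a predicted rate,
# assuming the count behaves like a binomial with n trials (an assumption;
# the app may estimate the variance differently).
pearson_z <- function(observed_rate, predicted_rate, n) {
  se <- sqrt(predicted_rate * (1 - predicted_rate) / n)
  (observed_rate - predicted_rate) / se
}

# Toy inputs (not taken from the app): 400 balls in play,
# observed rate 0.55 against a predicted rate of 0.50.
z <- pearson_z(0.55, 0.50, 400)
z
```

With these toy inputs the standard error is 0.025, so the residual is about 2, right at the edge of the (-2, 2) band.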
What do we see in this comparison?
- The 2019 model (that assumes the use of the 2019 ball) predicts a much higher HR rate of 0.597 with a Z score of -4 — clearly we’re not using the lively 2019 ball in the 2023 season.
- In contrast, the 2022 model (using a 2022 ball) predicts a much lower HR rate of 0.426 with a Z score of 3.85 — we’re not currently using the “dead” 2022 ball.
- Interestingly, the 2021 model predicts a similar HR rate of 0.495 (Z score close to 0) — this indicates that the carry properties of the 2023 ball are similar to the 2021 ball.
This Shiny app is included in the ShinyBaseball package. The app is included in this folder on my GitHub repository. Several comments:
- The app.R file contains the Shiny app. It is a relatively short script: the data.work() function in the script collects the data from my GitHub site and the make_plot() function constructs the graph. Since this is a short script, it might serve as a useful example for those who are learning about Shiny.
- The allfits.Rdata file is an Rdata file containing a list of the model fits. I have a total of 18 fits: I previously fit the generalized additive model predicting the probability of a home run as a function of the launch variables for each of the 6 months for each of the 2019, 2021 and 2022 seasons.
- To run this app on your own laptop, put the files (including allfits.Rdata) in a separate folder. To launch the app, open the app.R file and press the Run App button in RStudio.
- I have an additional file, fit_gam_models.R, in the folder. This is an R function that I previously ran to collect the model fits. You don’t need to execute this function, but it is included so one can see the syntax for the GAM model fits.
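For readers who have not fit this kind of model before, here is a minimal sketch of a generalized additive model for the home run probability, using the `gam()` function from the mgcv package. It is not the contents of fit_gam_models.R: the formula, the simulated data frame `sim`, and its column names are assumptions chosen for illustration. The sketch fits a single model and then averages the predicted probabilities over a launch variable region to get a predicted HR rate.

```r
library(mgcv)

# Simulated balls in play (hypothetical data, not Statcast).
set.seed(123)
n <- 2000
sim <- data.frame(
  launch_angle = runif(n, 10, 50),
  launch_speed = runif(n, 90, 115)
)
# Simulated outcome: home runs are more likely for hard-hit balls
# with a launch angle near 28 degrees.
eta <- -1 + 0.2 * (sim$launch_speed - 100) -
  0.01 * (sim$launch_angle - 28)^2
sim$HR <- rbinom(n, 1, plogis(eta))

# Logistic GAM with a smooth surface over the two launch variables
# (an assumed model form).
fit <- gam(HR ~ s(launch_angle, launch_speed),
           family = binomial, data = sim)

# Predicted HR rate in a rectangular region: average the fitted
# probabilities for the balls in play that fall in the region.
region <- subset(sim, launch_angle >= 22 & launch_angle <= 34 &
                   launch_speed >= 98 & launch_speed <= 109.6)
predicted_rate <- mean(predict(fit, newdata = region, type = "response"))
observed_rate <- mean(region$HR)
c(observed = observed_rate, predicted = predicted_rate)
```

In this sketch the same simulated data are used for fitting and prediction. To compare seasons, one would instead fit the model to one season's data and predict for balls in play from another.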
Try it Out
- The live version of the Shiny app https://bayesball.shinyapps.io/HomeRunLaunchVariables/ will be active for the remainder of the 2023 season.
- Using the Shiny app with the April data, it appears that the home run rates in 2023 are similar to those in the 2021 season. At this point in time, I would predict a similar total home run count in 2023 as in 2021. I will continue to collect Statcast data as the current season progresses.
- This app is helpful for exploring temperature effects by changing the prediction month.
- By exploring the home run rates for future months and comparing these rates with predictions from the model fits for those months, we can see if the 2023 rates remain similar to the 2021 rates.