Locations of All In-Play Events in One Day of Baseball

I am currently at Datafest, hosted by Miami University.  We have two BGSU teams competing, and this is crunch time as the teams get ready to present their work on the “big data” problem they were given.  This is a great opportunity for our students, who also get to meet data scientists from industry.

Anyway, I have had some time to R-play during the weekend and I had the following visualization task.  I was interested in showing the location of all in-play events during a single day of baseball.  Using the openWAR package, it was easy to download all of the plays for all games on April 29. The relevant variables in the data frame are

  • event – the description of the result of the plate appearance
  • our.x, our.y – the (x, y) location of the in-play event
  • stand – the side of the batter (right or left)
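
As a rough illustration of how these variables might be pulled out, here is a minimal sketch on a made-up data frame. The rows below are invented, and the assumption that plate appearances without a ball in play (strikeouts, walks) carry missing locations is mine, not something checked against the real download.

```r
# Toy stand-in for the play-by-play data frame (all values are made up).
plays <- data.frame(
  event = c("Single", "Home Run", "Strikeout", "Groundout", "Double", "Walk"),
  our.x = c(-60, 250, NA, 40, -180, NA),
  our.y = c(150, 330, NA, 95, 260, NA),
  stand = c("L", "L", "R", "R", "R", "L"),
  stringsAsFactors = FALSE
)

# Assumption: events with no ball in play have missing (NA) locations,
# so keeping rows with a recorded (x, y) leaves the in-play events.
in_play <- subset(plays, !is.na(our.x) & !is.na(our.y),
                  select = c(event, our.x, our.y, stand))

# Count the in-play events by type and by batter side.
counts <- table(in_play$event, in_play$stand)
print(counts)
```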

Using ggplot2 it was straightforward to create the following graph showing the locations. I have colored the points by the type of hit or out (for example, the red points are home runs), and I have added some guide lines showing the infield diamond, foul lines, and a representative outfield fence.
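
A graph of this kind could be sketched along the following lines. This is not the code from my gist: the data points are made up, and the guide lines rest on assumptions (home plate at the origin, distances in feet, 90-foot base paths, and a fence that runs from 330 feet down the lines to 400 feet in center field).

```r
library(ggplot2)

# Made-up in-play events; swap in the real data frame.
in_play <- data.frame(
  event = c("Single", "Home Run", "Groundout", "Double", "Flyout"),
  our.x = c(-60, 250, 40, -180, 10),
  our.y = c(150, 330, 95, 260, 300),
  stand = c("L", "L", "R", "R", "R")
)

# Infield diamond: home, first, second, third, back to home.
# Assumes home plate at (0, 0) and 90 ft between bases.
b <- 90 / sqrt(2)
diamond <- data.frame(x = c(0, b, 0, -b, 0), y = c(0, b, 2 * b, b, 0))

# Foul lines at 45 degrees from home plate; the 330 ft length is illustrative.
f <- 330 / sqrt(2)
foul <- data.frame(x = c(-f, 0, f), y = c(f, 0, f))

# Representative fence: an arc from 330 ft at the lines to 400 ft in center.
theta <- seq(pi / 4, 3 * pi / 4, length.out = 50)
r <- 330 + 70 * sin(2 * (theta - pi / 4))
fence <- data.frame(x = r * cos(theta), y = r * sin(theta))

# Guide lines first, then the points colored by event type.
p <- ggplot(in_play, aes(our.x, our.y)) +
  geom_path(data = diamond, aes(x, y), colour = "grey40") +
  geom_path(data = foul, aes(x, y), colour = "grey40") +
  geom_path(data = fence, aes(x, y), colour = "grey40") +
  geom_point(aes(colour = event), size = 2) +
  coord_equal() +
  labs(x = "our.x", y = "our.y", colour = "Event")
print(p)
```

Using coord_equal() keeps one foot on the x axis the same length as one foot on the y axis, so the diamond is drawn as a square rather than a stretched rhombus.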

[Figure: locations]

Some interesting observations:

  • One sees clearly where most plays are made in baseball — this is related to the positioning of the fielders.
  • Clearly one sees the bat-handedness effect.  Most of the home runs hit by lefties are in right field and most of the righty homers are in left field.  If there are foul-outs on the infield side, they tend to be on the same side as the batter.
  • Interesting to see the locations of the doubles (the green dots) — they tend to be down the lines or in the gaps in the outfield.
  • I was wondering if I would see any “shifting” effects.  I do see some singles by lefties on the left side of the infield — are these batters trying to beat the shift?
  • It would be interesting to see how these in-play spatial patterns vary by team or how they have changed given the current emphasis on defensive shifting.
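
One simple way to put a number on the bat-handedness effect is to tabulate which side of the field the home runs land on for each batter side. The sketch below uses invented locations and assumes that negative our.x values point toward left field; that sign convention should be checked against the real data before trusting the table.

```r
# Made-up home run locations. Assumption: home plate sits at (0, 0)
# and negative our.x is toward left field.
hr <- data.frame(
  our.x = c(240, 200, -150, -260, -220, 180),
  stand = c("L", "L", "L", "R", "R", "R")
)

# Label each home run by the side of the field where it landed.
hr$field <- ifelse(hr$our.x < 0, "left field", "right field")

# Row proportions: for each batter side, the share of homers to each field.
tab <- table(stand = hr$stand, field = hr$field)
prop <- prop.table(tab, margin = 1)
print(round(prop, 2))
```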

R Notes

All of the R code is available on my gist site. Assuming you have the openWAR, ggplot2, and devtools packages installed, you can reproduce this graph by typing the following in the R console.

library(devtools)
source_gist("423e41ed31deef57e863df33b969b8cf")
