# Who Hit Best Against Roger Clemens?

### Introduction

A few years ago, I wrote a post exploring batter-pitcher matchups: how batters perform against a specific pitcher and how pitchers perform against a specific batter. As an example in that earlier post, I displayed a plot showing the batting averages of all hitters who faced Jamie Moyer during his career. The plot showed the large variability (or, as Bill James would say, dense fog) in batting measures for small samples. I illustrated the use of a multilevel (random effects) model to smooth these raw batting averages. The smoothed measures are straightforward to compute, and one can use them to make meaningful comparisons of hitters.

Since Tom Tango doesn’t want to talk about batting averages, I thought I’d generalize the earlier post by considering a better measure of batting performance such as wOBA. Also, I’ll use this post to illustrate a Shiny app that displays the smoothed estimates of wOBA once one selects a single opposing pitcher or a single batter of interest. Using these smoothed estimates, one can actually answer the question: Who hit best and worst against Roger Clemens?

### Raw wOBA Values Are Variable

Here is an illustration of the issue. Below I am graphing the wOBA measure against the number of plate appearances (PA) for all batters who faced Roger Clemens during his career. Note the high variability for small PA values: some hitters had wOBA values as low as 0 or as high as 1.4 against Clemens. But these values don’t reflect their wOBA abilities against Clemens. It is desirable to smooth or adjust these extreme wOBA values towards an average to get better predictions of future performance. We certainly can’t say that the hitter with a 1.4 wOBA on a few PA is the best hitter against Clemens.
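To see why small PA counts produce such extreme values, here is a small simulation sketch. It is not the code behind the post: the outcome probabilities and wOBA weights are illustrative assumptions, and every simulated hitter is given the same true ability, so all of the spread in raw wOBA comes from chance alone.

```python
import random
import statistics

# Assumed (illustrative) per-PA outcomes: (name, wOBA weight, probability).
# These are not the weights or rates used in the post.
OUTCOMES = [
    ("out", 0.00, 0.68),
    ("walk", 0.69, 0.09),
    ("single", 0.89, 0.15),
    ("double", 1.27, 0.045),
    ("triple", 1.62, 0.005),
    ("home_run", 2.10, 0.03),
]


def simulate_raw_woba(pa, rng):
    """Simulate `pa` plate appearances and return the mean wOBA weight."""
    weights = [w for _, w, _ in OUTCOMES]
    probs = [p for _, _, p in OUTCOMES]
    draws = rng.choices(weights, weights=probs, k=pa)
    return sum(draws) / pa


def spread_of_raw_woba(pa, n_hitters=500, seed=1):
    """Standard deviation of raw wOBA across identical simulated hitters."""
    rng = random.Random(seed)
    values = [simulate_raw_woba(pa, rng) for _ in range(n_hitters)]
    return statistics.pstdev(values)


if __name__ == "__main__":
    for pa in (5, 25, 100):
        print(pa, round(spread_of_raw_woba(pa), 3))
```

The spread shrinks roughly like one over the square root of PA, which is the pattern of the funnel shape in the scatterplot.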

### Multilevel Estimates of wOBA

To compute the wOBA measure, one assigns a weight to each of the PA outcomes and computes the mean of these weights. A hitter’s wOBA is basically a mean weight that is assumed to be normally distributed with mean $M$ and standard deviation $\sigma / \sqrt{PA}$, where $M$ is his “true” wOBA value and $\sigma$ is a measure of the spread of the wOBA weights. There were a total of 1253 players who batted against Clemens during his career, so we are interested in estimating 1253 true wOBAs that we call $M_1, \ldots, M_{1253}$. A multilevel model assumes that these true wOBAs are normally distributed with mean mu and standard deviation tau. The model is completed by assigning weakly informative priors to mu, tau and $\sigma$.
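As a minimal sketch of the "mean of weights" idea, the function below averages a weight over a list of PA outcomes. The weights in `WOBA_WEIGHTS` are assumed, illustrative values (actual wOBA weights vary by season), and the outcome labels are hypothetical names chosen for this example.

```python
# Assumed, illustrative wOBA weights; real weights differ by season.
WOBA_WEIGHTS = {
    "out": 0.00,
    "walk": 0.69,
    "hit_by_pitch": 0.72,
    "single": 0.89,
    "double": 1.27,
    "triple": 1.62,
    "home_run": 2.10,
}


def woba(outcomes):
    """Return the mean wOBA weight over a list of PA outcome labels."""
    if not outcomes:
        raise ValueError("need at least one plate appearance")
    return sum(WOBA_WEIGHTS[o] for o in outcomes) / len(outcomes)


if __name__ == "__main__":
    # Two PA against a pitcher: a home run and a walk.
    print(woba(["home_run", "walk"]))
    # Ten PA with a single, a double and eight outs.
    print(woba(["single", "double"] + ["out"] * 8))
```

Under these assumed weights, a hitter with only a home run and a walk in two PA has a raw wOBA of about 1.395, which shows how a value near 1.4 can arise from a handful of plate appearances.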

I use a Laplace-type approximation (available in the LearnBayes package) to quickly fit this multilevel model. This fit provides estimates of mu, tau and $\sigma$. Then we can estimate the true wOBA for any hitter who faced Clemens. Without getting into the technical formulas, this estimate has the general form

(observed wOBA) × weight + (average wOBA) × (1 − weight),

where weight is a fraction between 0 and 1. Basically, the estimate of a player’s true wOBA is a weighted average of his observed wOBA and the overall wOBA against Clemens. Here is a graph comparing the raw and multilevel estimates of wOBA. We see strong shrinkage towards the average, especially for hitters with small PA values.
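The weighted-average form can be sketched in a few lines. This is an illustration under assumptions, not the fitted model from the post: it uses the standard normal-normal result that, with the hyperparameters treated as known, the weight is tau² / (tau² + σ²/PA). The values mu = 0.274 and tau = 0.054 are the estimates the app reports for Clemens; the value `sigma = 0.5` is an assumed placeholder, since the post does not report its estimate of $\sigma$.

```python
def shrinkage_weight(pa, tau, sigma):
    """Weight on the observed wOBA: tau^2 / (tau^2 + sigma^2 / PA)."""
    prior_var = tau ** 2
    sampling_var = sigma ** 2 / pa
    return prior_var / (prior_var + sampling_var)


def multilevel_estimate(observed, pa, mu, tau, sigma):
    """Weighted average of the observed wOBA and the overall average mu."""
    w = shrinkage_weight(pa, tau, sigma)
    return observed * w + mu * (1 - w)


if __name__ == "__main__":
    MU, TAU = 0.274, 0.054   # estimates reported in the post for Clemens
    SIGMA = 0.5              # assumed value, not taken from the post
    for observed, pa in [(1.4, 3), (0.0, 3), (0.45, 100)]:
        est = multilevel_estimate(observed, pa, MU, TAU, SIGMA)
        print(observed, pa, round(est, 3))
```

With few PA the weight is close to 0 and the estimate sits near mu, while a hitter with many PA keeps much more of his observed wOBA.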

### The Shiny App

I wrote a Shiny app wOBA_Matchups to explore the smoothed wOBAs.

• First decide on the Matchup Type — is one interested in matchups against a specific batter or matchups against a specific pitcher? I chose Pitcher in the example below.
• Next Select a Pitcher from the dropdown menu containing the leading pitchers (in terms of PAs) during the 1960-2021 era. I chose Roger Clemens from this list.
• Decide on the Plot Type. Here I chose Multilevel. (If one chooses Comparison as the Plot Type, one sees the plot above comparing the two wOBA estimates.)

We see a scatterplot of the wOBA estimates against the plate appearances. The table on the left shows the multilevel estimates — here mu est = 0.274 and tau est = 0.054. The estimate of mu represents an average wOBA against Clemens and the estimate of tau controls the size of the adjustment of the observed wOBA towards the average wOBA.

Several takeaways from looking at this graph:

• Note that for small PA the observed wOBA values are adjusted strongly towards the average value of 0.274. This makes sense — we don’t have much information about these batters with small PA values and so our prediction is close to the average wOBA.
• In contrast, the adjustment of the observed wOBAs towards the mean is smaller for batters with large PAs against Clemens.
• One compares players by comparing their wOBA estimates. Who batted best against Clemens? The app allows one to use a brushing rectangle to identify interesting points, and measurements for the selected players are displayed at the bottom. In the screenshot, using a blue rectangle, I have identified six players (including Ken Griffey, Rafael Palmeiro, Jim Thome, Alan Trammell and Lou Whitaker) with large wOBA estimates and a high number of PAs.
• One player does stand out as best: the HOF player Jim Thome had an adjusted wOBA of 0.412 against Clemens, which is 40 points higher than that of any other hitter. (By the way, Baseball Roundtable in its “Who’s Your Daddy” series identified Thome as a hitter unusually successful against Clemens.)
• It is interesting that the players with large PAs tended to do better than average against Clemens. (I guess it is reasonable that batters with more experience against Clemens would hit better than average.)

### Matchups Against a Specific Batter

Using this app, we can instead choose Batter as the Matchup Type and see the wOBA performances of all pitchers against a batter of interest. For example, here is a graph of the wOBA estimates for the 1362 pitchers who faced Derek Jeter during his career. There is one noteworthy low wOBA value. When we select that one point, we see that Jeter struggled against the HOF pitcher Roy Halladay. Jeter’s observed and estimated wOBA values against Halladay were 0.257 and 0.318, respectively.