Introduction
I was very sorry to hear of the passing of Tim McCarver yesterday. McCarver was well known in baseball as a great defensive catcher. After his playing career, he became a popular Hall of Fame broadcaster. I read in an article that McCarver called 23 World Series games and 20 All-Star games for Fox. As a Phillies fan, I remember that McCarver was the favorite catcher of the HOF pitcher Steve Carlton. He also caught many games for the HOF pitcher Bob Gibson when they played for the Cardinals. That raises several questions of interest.
- How many games (or specifically plate appearances) did Tim McCarver catch for Steve Carlton and Bob Gibson?
- Are the Carlton-McCarver and Gibson-McCarver batteries among the most popular batteries during the McCarver era (between the 1959 and 1979 seasons)?
- What were the most popular batteries in the McCarver era?
In this post, I illustrate how one can address these McCarver questions by straightforward applications of the filter(), group_by(), summarize() and arrange() functions from the dplyr package.
Retrosheet Data
From retrosheet.org, I collect the Retrosheet play-by-play data for the 1959 through 1979 seasons. Assume that the Retrosheet data is contained in the single data frame pbp.59.79. Each row of this data frame represents the outcome of a plate appearance or a stolen base, passed ball, wild pitch or balk event that changes the configuration of runners on base. Using the filter() function in the dplyr package, I focus on the batting events, where the BAT_EVENT_FL variable is TRUE. The key variables to consider are the Retrosheet catcher id variable POS2_FLD_ID and the pitcher id variable PIT_ID. One collects frequencies (number of plate appearances) of all catcher-pitcher pairs with the group_by() and summarize() functions. By ordering the output by the number of PA with the arrange() function, we obtain the most common batterymates. Here’s the R syntax for this entire filtering, grouping, counting and arranging exercise.
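library(dplyr)

# Count plate appearances for each catcher-pitcher pair
pbp.59.79 %>%
  filter(BAT_EVENT_FL == TRUE) %>%        # keep batting events only
  group_by(POS2_FLD_ID, PIT_ID) %>%       # catcher id, pitcher id
  summarize(PA = n(), .groups = "drop") %>%
  arrange(desc(PA)) -> S3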
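To see what each step of this pipeline does, here is a minimal sketch on a tiny made-up data frame. The object toy and its id values (c1, c2, p1, p2) are hypothetical and chosen only for illustration; only the column names match the Retrosheet variables used above.

```r
library(dplyr)

# Hypothetical play-by-play rows: two catchers (c1, c2) and two pitchers (p1, p2).
toy <- tibble(
  POS2_FLD_ID  = c("c1", "c1", "c1", "c2", "c2", "c1"),
  PIT_ID       = c("p1", "p1", "p2", "p1", "p1", "p1"),
  BAT_EVENT_FL = c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE)
)

toy_counts <- toy %>%
  filter(BAT_EVENT_FL) %>%                  # drop the one non-batting event
  group_by(POS2_FLD_ID, PIT_ID) %>%         # one group per battery
  summarize(PA = n(), .groups = "drop") %>% # count rows in each group
  arrange(desc(PA))                         # most frequent battery first

toy_counts
```

The c1/p1 battery comes out on top with 3 plate appearances. As a shorthand, count(POS2_FLD_ID, PIT_ID, name = "PA", sort = TRUE) should give the same counts in one call.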
To make the output more readable, I use the inner_join() function from the dplyr package to merge this data frame with the player names from the People data frame in the Lahman package. I do this merging operation twice, once to add the catcher names and once to add the pitcher names.
library(Lahman)

# Add catcher names by matching the catcher id to retroID in People
inner_join(S3,
           select(People, retroID, nameFirst, nameLast),
           by = c("POS2_FLD_ID" = "retroID")) %>%
  ungroup() %>%
  mutate(Catcher = paste(nameFirst, nameLast)) %>%
  select(Catcher, PIT_ID, PA) -> S3

# Add pitcher names the same way, matching on the pitcher id
inner_join(S3,
           select(People, retroID, nameFirst, nameLast),
           by = c("PIT_ID" = "retroID")) %>%
  ungroup() %>%
  mutate(Pitcher = paste(nameFirst, nameLast)) %>%
  select(Catcher, Pitcher, PA) -> S3
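One thing to keep in mind is that inner_join() silently drops any row whose id has no match in the lookup table. Here is a minimal sketch of how one might check for that with anti_join(). The tables counts and lookup, and all ids and names in them, are made-up stand-ins for S3 and People.

```r
library(dplyr)

# Hypothetical stand-ins for the counts table and the People lookup.
counts <- tibble(
  POS2_FLD_ID = c("c1", "c2", "c3"),
  PIT_ID      = c("p1", "p1", "p2"),
  PA          = c(30L, 20L, 10L)
)
lookup <- tibble(
  retroID   = c("c1", "c2", "p1", "p2"),
  nameFirst = c("Al", "Bo", "Cy", "Di"),
  nameLast  = c("Alpha", "Beta", "Gamma", "Delta")
)

# Rows whose catcher id has no name in the lookup;
# inner_join() would drop exactly these rows.
unmatched <- anti_join(counts, lookup, by = c("POS2_FLD_ID" = "retroID"))

# The inner join keeps only the matched batteries.
named <- counts %>%
  inner_join(lookup, by = c("POS2_FLD_ID" = "retroID")) %>%
  mutate(Catcher = paste(nameFirst, nameLast)) %>%
  select(Catcher, PIT_ID, PA)
```

In this toy example the c3 row is unmatched, so named has two rows rather than three. If unmatched is empty on the real data, no batteries were lost in the merge.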
Most Popular Batterymates
By selecting the top 20 rows of the data frame S3, we display the batterymates with the 20 highest PA in the McCarver era from 1959 through 1979:
   Catcher          Pitcher           PA
   <chr>            <chr>          <int>
 1 Bill Freehan     Mickey Lolich   9701
 2 John Roseboro    Don Drysdale    7792
 3 Jerry Grote      Tom Seaver      7352
 4 Tim McCarver     Steve Carlton   6885
 5 Tim McCarver     Bob Gibson      6528
 6 Randy Hundley    Fergie Jenkins  6179
 7 John Roseboro    Sandy Koufax    5575
 8 Johnny Bench     Gary Nolan      5438
 9 Jerry Grote      Jerry Koosman   5225
10 Bill Freehan     Denny McLain    5180
11 Earl Battey      Jim Kaat        5072
12 Ed Herrmann      Wilbur Wood     5004
13 Carlton Fisk     Luis Tiant      4978
14 Elston Howard    Whitey Ford     4872
15 Earl Battey      Camilo Pascual  4601
16 Elrod Hendricks  Mike Cuellar    4587
17 Ted Simmons      Bob Gibson      4474
18 Manny Sanguillen Dock Ellis      4465
19 Randy Hundley    Bill Hands      4456
20 Bob Boone        Jim Lonborg     4293
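The first question from the introduction can be read directly off this table. As a small sketch, the code below re-enters the first six rows shown above by hand and pulls out McCarver's rows. The object top6 is introduced here only for illustration; with the full data one would start from S3, for example with slice_head(S3, n = 20) to get the top-20 display.

```r
library(dplyr)

# First six rows of the table above, entered by hand.
top6 <- tribble(
  ~Catcher,        ~Pitcher,         ~PA,
  "Bill Freehan",  "Mickey Lolich",  9701,
  "John Roseboro", "Don Drysdale",   7792,
  "Jerry Grote",   "Tom Seaver",     7352,
  "Tim McCarver",  "Steve Carlton",  6885,
  "Tim McCarver",  "Bob Gibson",     6528,
  "Randy Hundley", "Fergie Jenkins", 6179
)

# Keep only the batteries with McCarver behind the plate.
mccarver <- filter(top6, Catcher == "Tim McCarver")
mccarver
```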
Here are some interesting observations from this list.
- Tim McCarver appears 4th and 5th in the most popular battery list with the pitchers Steve Carlton and Bob Gibson.
- The Bill Freehan/Mickey Lolich battery was the most popular in this era with a remarkable 9701 plate appearances, followed by John Roseboro/Don Drysdale with 7792 and Jerry Grote/Tom Seaver with 7352.
Catchers with Multiple Pitchers
In this top-20 list of most common batteries, there are six catchers appearing with multiple pitchers:
- Bill Freehan (Mickey Lolich and Denny McLain)
- Earl Battey (Jim Kaat and Camilo Pascual)
- Jerry Grote (Tom Seaver and Jerry Koosman)
- John Roseboro (Don Drysdale and Sandy Koufax)
- Randy Hundley (Fergie Jenkins and Bill Hands)
- Tim McCarver (Steve Carlton and Bob Gibson)
Pitchers with Multiple Catchers
On the other hand, there is only one pitcher in this top-20 list, Bob Gibson, who appears with multiple catchers (Tim McCarver and Ted Simmons).
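Both of these lists can be obtained by counting how often each name appears in the top 20. The sketch below re-enters the 20 batteries from the table by hand so that it runs on its own. The object names top20, multi_catchers and multi_pitchers are introduced here for illustration; with the full data one would use the first 20 rows of S3 instead.

```r
library(dplyr)

# The 20 batteries from the table above, entered by hand.
top20 <- tribble(
  ~Catcher,           ~Pitcher,         ~PA,
  "Bill Freehan",     "Mickey Lolich",  9701,
  "John Roseboro",    "Don Drysdale",   7792,
  "Jerry Grote",      "Tom Seaver",     7352,
  "Tim McCarver",     "Steve Carlton",  6885,
  "Tim McCarver",     "Bob Gibson",     6528,
  "Randy Hundley",    "Fergie Jenkins", 6179,
  "John Roseboro",    "Sandy Koufax",   5575,
  "Johnny Bench",     "Gary Nolan",     5438,
  "Jerry Grote",      "Jerry Koosman",  5225,
  "Bill Freehan",     "Denny McLain",   5180,
  "Earl Battey",      "Jim Kaat",       5072,
  "Ed Herrmann",      "Wilbur Wood",    5004,
  "Carlton Fisk",     "Luis Tiant",     4978,
  "Elston Howard",    "Whitey Ford",    4872,
  "Earl Battey",      "Camilo Pascual", 4601,
  "Elrod Hendricks",  "Mike Cuellar",   4587,
  "Ted Simmons",      "Bob Gibson",     4474,
  "Manny Sanguillen", "Dock Ellis",     4465,
  "Randy Hundley",    "Bill Hands",     4456,
  "Bob Boone",        "Jim Lonborg",    4293
)

# Catchers who appear with more than one pitcher.
multi_catchers <- top20 %>%
  count(Catcher, name = "n_pitchers") %>%
  filter(n_pitchers > 1)

# Pitchers who appear with more than one catcher.
multi_pitchers <- top20 %>%
  count(Pitcher, name = "n_catchers") %>%
  filter(n_catchers > 1)
```

Here multi_catchers holds the six catchers listed above, and multi_pitchers holds the single row for Bob Gibson.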
Questions of Interest
This brief exploration suggests some questions for further study.
- Over MLB history, what were the most popular batteries? Given that it was more common for pitchers to pitch many innings in the past, perhaps the batteries with the largest PA values happened early in MLB history.
- In contrast, it would be interesting to explore batteries in the last 20 seasons of baseball. Given the large movement of pitchers and catchers across teams, it may be difficult to find particular batteries with large numbers of PA.
- What was the most successful battery in MLB history? I’d suggest focusing on a particular baseball era and considering only the batteries with a sufficient number of PA. Among these batteries in the era, which battery had the smallest wOBA? Which battery averaged the most strikeouts per nine innings? The fewest walks per nine innings?
- Pitchers like Steve Carlton preferred using particular catchers like Tim McCarver. Is there any evidence to suggest that Carlton performed better (using a reasonable measure) when he had McCarver behind the plate instead of another catcher?
- Generally, why do pitchers prefer to have specific catchers? Is there any way to measure the advantage of one catcher over another catcher for a specific pitcher using a measure of performance?
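As one possible starting point for the catcher comparison questions, here is a rough sketch that compares a single pitcher's strikeout and walk rates by catcher. Everything in it is an assumption for illustration: the data frame toy and the ids p1, c1 and c2 are made up, and it assumes the Retrosheet EVENT_CD coding in which 3 is a strikeout and 14 is a walk. It uses rates per plate appearance rather than per nine innings, since innings would require counting outs as well.

```r
library(dplyr)

# Hypothetical batting events for one pitcher ("p1") with two catchers.
# Assumed Retrosheet coding: EVENT_CD 3 = strikeout, 14 = walk, 2 = other out.
toy <- tibble(
  PIT_ID      = "p1",
  POS2_FLD_ID = c("c1", "c1", "c1", "c1", "c2", "c2", "c2", "c2"),
  EVENT_CD    = c(3, 3, 14, 2, 3, 14, 14, 2)
)

by_catcher <- toy %>%
  filter(PIT_ID == "p1") %>%            # one pitcher at a time
  group_by(POS2_FLD_ID) %>%             # split by the catcher behind the plate
  summarize(PA      = n(),
            SO_rate = mean(EVENT_CD == 3),   # strikeouts per PA
            BB_rate = mean(EVENT_CD == 14),  # walks per PA
            .groups = "drop")

by_catcher
```

On real data one would want a minimum PA threshold for each catcher before reading anything into the difference in rates.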