Monthly Archives: October, 2021

Perfect-k Games

Introduction

Recently Gino Renzulli sent me an email suggesting a new pitching stat:

“The statistic is simple in nature. I am seeking to find a “perfect through 6 innings” statistic for pitchers. That is to say we would discover what pitchers have continued a perfect game through 6 innings the largest number of times. I chose 6 as the number of innings because it would indicate dominance twice through the order, but of course the data would be most interesting with a sliding scale of perfect through any ‘x’ number of innings.”

We are very familiar with a Perfect Game, where a starter pitches 9 innings with no batter reaching base. There have been only 23 perfect games in MLB history, the most recent thrown by Felix Hernandez on August 15, 2012. Since perfect games are so rare, it seems interesting to generalize the notion of a Perfect Game to a similar event that is more likely to happen. In particular, let’s consider games where a starter throws k consecutive “no batters reach base” innings starting from the 1st inning. We’ll call this a “Perfect-k Game”. (Using this terminology, a Perfect Game is equivalent to a Perfect-9 Game.) In this post, we’ll provide a historical perspective on Perfect-k Games and find the pitchers who have done well using this statistic during the 2000-2019 period.

Using Retrosheet Data

Perfect-k Games can be found using Retrosheet play-by-play data. No batter reaches base in a half-inning when the number of batters equals the number of outs (strikeouts or other outs). A starter therefore has, say, a Perfect-5 game if the number of batters faced over the complete innings 1 through 5 (15 batters) equals the number of outs recorded. I repeat this check using the Retrosheet play-by-play files for the 2000 through 2019 seasons.
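
The counting rule above can be sketched in a few lines of Python. This is only an illustration, not the R function used for this post: the record layout (one dict per plate appearance with the keys game_id, pitcher_id, is_starter, inning and outs_on_play) is a hypothetical simplification of the Retrosheet play-by-play files, and the name perfect_k_starts is made up for the example.

```python
from collections import defaultdict

def perfect_k_starts(plays, k):
    """Return sorted (game_id, pitcher_id) pairs for starters with a Perfect-k game.

    `plays` holds one dict per plate appearance. The keys used here
    (game_id, pitcher_id, is_starter, inning, outs_on_play) are assumed
    for illustration and are not actual Retrosheet field names.
    """
    batters = defaultdict(int)  # batters faced by the starter in innings 1..k
    outs = defaultdict(int)     # outs recorded by the starter in innings 1..k
    for p in plays:
        if p["is_starter"] and p["inning"] <= k:
            key = (p["game_id"], p["pitcher_id"])
            batters[key] += 1
            outs[key] += p["outs_on_play"]
    # Rule from the text: over k complete innings, batters == outs == 3k.
    return sorted(
        key for key in batters
        if batters[key] == 3 * k and outs[key] == 3 * k
    )

# Toy data: starter "p1" retires the first six batters,
# starter "p2" lets the first batter of the game reach base.
plays = []
for inning in (1, 2):
    for _ in range(3):
        plays.append({"game_id": "G1", "pitcher_id": "p1", "is_starter": True,
                      "inning": inning, "outs_on_play": 1})
for outs_on_play in (0, 1, 1, 1):
    plays.append({"game_id": "G1", "pitcher_id": "p2", "is_starter": True,
                  "inning": 1, "outs_on_play": outs_on_play})

print(perfect_k_starts(plays, 2))
```

In this toy game only "p1" qualifies for k = 1 and k = 2, and nobody qualifies for k = 3 because "p1" has faced only six batters.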

Counts of Perfect-k Games

The Perfect-9 games, which are the Perfect Games of baseball history, are already well documented, for example on this Wikipedia page. Here we are interested in counting the Perfect-1, Perfect-2, …, Perfect-8 games in each of the 20 seasons from 2000 through 2019.

Here are graphs of the counts of Perfect-1 through Perfect-6 games for these 20 seasons.

Some comments:

  • There are 2430 games during a complete baseball season. Since there are two starting pitchers, there are 2 x 2430 = 4860 opportunities to throw Perfect-k games. We see, for example, there tend to be about 1400 Perfect-1 games in a season which represents about 29% of all opportunities. This is surprising — I thought there would be a greater percentage of 1-2-3 first innings.
  • The following table shows the total count of Perfect-k games (over the 20 seasons) and the corresponding percentage of opportunities for values of k equal to 1 through 8. Reading the third row, a Perfect-3 game, where the entire lineup 1-9 fails to reach base in the first three innings, occurs in only 3.33% of the game opportunities. There were 17 Perfect-8 games in this period, which represents only 0.02% of the game opportunities. There were 7 Perfect Games in this twenty-season period, which (by subtraction) means that there were 10 games where the pitcher had a Perfect-8 game and either the starting pitcher was replaced or a batter reached base in the ninth inning.
    N_Innings Total Percentage
      <int> <int>      <dbl>
1         1 28288      29.1 
2         2  8661       8.91
3         3  3237       3.33
4         4   951       0.98
5         5   313       0.32
6         6   110       0.11
7         7    39       0.04
8         8    17       0.02
  • Looking at the six graphs, I see a general increase in the number of Perfect-k games starting with the 2010 season. Maybe this indicates that pitchers tend to be more dominant after 2010?
  • There are interesting outliers in these graphs. For example, the number of Perfect-k games seems unusually low in the 2000 season, and the counts of Perfect-3, Perfect-4 and Perfect-5 games seem high in the 2014 and 2015 seasons. More exploration is needed to understand what is happening in these outlier seasons.
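
The percentages in the table can be reproduced from the totals with a little arithmetic. The Python sketch below assumes, as in the first comment above, that every one of the 20 seasons had the full 2430 games. The totals and the count of 7 Perfect Games are copied from this post; the variable names are just for illustration.

```python
GAMES_PER_SEASON = 2430   # full schedule, as stated in the comments above
SEASONS = 20              # 2000 through 2019
PERFECT_GAMES = 7         # Perfect-9 games in this period, from the text

# Two starting pitchers per game, so two opportunities per game.
opportunities = 2 * GAMES_PER_SEASON * SEASONS

# Total Perfect-k counts over the 20 seasons, copied from the table.
totals = {1: 28288, 2: 8661, 3: 3237, 4: 951, 5: 313, 6: 110, 7: 39, 8: 17}

# Percentage of all starting-pitcher opportunities, rounded to 2 decimals.
percentages = {k: round(100 * n / opportunities, 2) for k, n in totals.items()}

for k, pct in percentages.items():
    print(f"Perfect-{k}: {totals[k]:6d} games, {pct:5.2f}% of opportunities")

# Perfect-8 games that did not become Perfect Games.
print("Perfect-8 but not Perfect-9:", totals[8] - PERFECT_GAMES)
```

Under this assumption there are 97200 opportunities in total, and the rounded percentages agree with the table.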

Great Perfect-k Pitchers

Gino was particularly interested in pitchers who threw Perfect-6 games — these are games where the starter went through the lineup twice without anyone getting on base. Are there pitchers who excel in Perfect-6 games?

Here is a leaderboard of Perfect-6 games during the period 2000-2019. We see Mark Buehrle stands out with 3 Perfect-6 games and 11 other pitchers had 2 Perfect-6 games.

   Name              Perfect_6
 1 Mark Buehrle              3
 2 Jake Arrieta              2
 3 Madison Bumgarner         2
 4 Bartolo Colon             2
 5 Yu Darvish                2
 6 Armando Galarraga         2
 7 Rich Hill                 2
 8 Clayton Kershaw           2
 9 Colby Lewis               2
10 Odalis Perez              2
11 Max Scherzer              2
12 Ben Sheets                2

Let’s try counts of Perfect-5 games. Here is a listing of the pitchers who had at least four Perfect-5 games — Mark Buehrle and Jon Lester are on top and I am happy that two former Phillies, Roy Halladay and Curt Schilling, are on this leaderboard.

  Name           Perfect_5
  <chr>              <int>
1 Mark Buehrle           6
2 Jon Lester             6
3 Yu Darvish             5
4 Max Scherzer           5
5 Roy Halladay           4
6 German Marquez         4
7 Curt Schilling         4

Since you might be wondering, here is the leaderboard for the number of Perfect-3 games, which includes some of the best starting pitchers during this twenty-season period.

  Name             Perfect_3
  <chr>                <int>
1 Justin Verlander        28
2 CC Sabathia             26
3 Max Scherzer            26
4 Clayton Kershaw         24
5 Tim Hudson              23
6 Bronson Arroyo          21

Closing Remarks

  • Do We Need Another Pitching Measure? We are interested in no-hitters and perfect games, and a starter needs to pass through these Perfect-1, Perfect-2, Perfect-3, … milestones on the way to a perfect game. So I think these Perfect-k measures are useful for understanding a pitcher’s dominance.
  • Randy Johnson? Gino thought that Randy Johnson would do well with respect to the Perfect-k measure. I checked. We do know that Johnson threw a perfect game against the Braves on May 18, 2004. Using my function, I find that Johnson had only two Perfect-6 games (May 18, 2004 and May 16, 1993), but he did have six Perfect-5 games in his career.
  • Career Leaderboard? That raises the question — which pitcher had the greatest number of Perfect-6 games in MLB history? It would not be difficult to use Retrosheet data to find a career leaderboard for counts of any Perfect-k of interest.
  • R Code? I wrote a short R function perfect() that finds all of the games where pitchers attain a Perfect-k distinction. You can see the function on my GitHub Gist site. One inputs the Retrosheet data frame for a particular season and the value of k. The output is a data frame containing the game id and the retro id of the starting pitcher for all Perfect-k games that season. By repeated use of the function perfect() for different Retrosheet season datasets and values of k, one obtains the results that are illustrated here.
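
Once the Perfect-k games from several seasons are pooled, a leaderboard is just a count per pitcher. The Python sketch below is not the R function perfect(); it only assumes that the pooled output is available as (game id, pitcher id) pairs. The helper name leaderboard and the ids in the example are made up for illustration.

```python
from collections import Counter

def leaderboard(perfect_games, min_count=1):
    """Count Perfect-k games per pitcher.

    `perfect_games` is an iterable of (game_id, pitcher_id) pairs, assumed
    to be pooled over all seasons of interest for one value of k.
    Returns (pitcher_id, count) rows, highest count first, ties by id.
    """
    counts = Counter(pitcher_id for _game_id, pitcher_id in perfect_games)
    rows = [(pitcher, n) for pitcher, n in counts.items() if n >= min_count]
    return sorted(rows, key=lambda row: (-row[1], row[0]))

# Made-up ids, for illustration only.
pooled = [("game1", "pitcherA"), ("game2", "pitcherB"), ("game3", "pitcherA")]
print(leaderboard(pooled))
print(leaderboard(pooled, min_count=2))
```

The min_count argument plays the role of the cutoffs used for the leaderboards above, such as "at least four Perfect-5 games".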