# Pitch Selection, Entropy, and Establishing the Fastball

There is an interesting article in Sports Illustrated this week on the increasing use of the curveball in baseball.  Of course, many pitchers throw fastballs in the mid-90s, but this article emphasized the use of off-speed pitches.  I thought it would be interesting to look more generally at variety in pitch selection: which pitchers are especially good or poor at mixing up their pitches?

### Entropy

There is a useful concept from probability for understanding uncertainty in pitch outcomes.  Suppose there are three possible outcomes for tomorrow’s weather (rain, sun, or clouds), and we represent the chances of the three outcomes by the vector (pR, pS, pC).  If any one of the probabilities is close to 1, say the vector is (0.9, 0.05, 0.05), we are pretty certain of the weather outcome.  In contrast, if the vector is (1/3, 1/3, 1/3), tomorrow’s weather is very uncertain.  A measure of uncertainty is the entropy E, defined as the negative of the sum of each probability multiplied by its log probability:

E = – pR log(pR) – pS log(pS) – pC log(pC)

In our weather example, using natural logarithms, the entropy for (0.9, 0.05, 0.05) is E = 0.39, and the entropy for (1/3, 1/3, 1/3) is E = 1.099.  Greater uncertainty results in larger values of the entropy E.
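As a quick check of these numbers, here is a minimal Python sketch of the entropy formula. The function name `entropy` and the choice to skip zero probabilities (treating 0 log 0 as 0) are my own assumptions for illustration, not part of the original calculation.

```python
import math

def entropy(probs):
    """Entropy (natural log) of a vector of probabilities."""
    # Skip zero probabilities: by convention 0 * log(0) counts as 0.
    return -sum(p * math.log(p) for p in probs if p > 0)

near_certain = entropy([0.9, 0.05, 0.05])   # one outcome dominates
uniform = entropy([1/3, 1/3, 1/3])          # all outcomes equally likely

print(round(near_certain, 2))  # 0.39
print(round(uniform, 3))       # 1.099
```

For three outcomes the entropy can never exceed log(3), which is the value for the equally likely case.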

### Entropy for Pitch Selection

Each pitcher throws a variety of pitch types, and one can summarize his pitch selection by a vector (p1, p2, p3, …) giving the proportions of fastballs, sliders, cut fastballs, curveballs, changeups, etc.  This data is tabulated on FanGraphs in the Pitch Type category for pitchers.  I collected this data from FanGraphs for the ten seasons 2007-2016 for both qualifying starters and relievers, and I used entropy to measure the uncertainty in each pitcher’s pitch selection.  A pitcher who primarily throws one pitch would have low entropy, and a pitcher who throws many different types of pitches would have high entropy.  I wanted to address several questions.

1. Does the variety in pitch selection differ between starters and relievers, and how has the typical entropy among pitchers changed over the seasons?
2. Which pitchers currently have low and high values of entropy, and is entropy related to the speed of the pitcher’s fastball?  (One might think that a pitcher would compensate for a slow fastball by throwing a greater variety of pitches.)

To address the first question, I graphed the average entropy against season for both starters and relievers.  I guess it is not surprising that starters tend to have more variety in pitch selection than relievers.  I don’t see much trend across seasons.  Perhaps relievers are showing less variety in pitch selection (on average) in recent seasons.
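The averaging step can be sketched in a few lines of Python. Everything in the `rows` list below is made-up toy data (hypothetical pitcher names and pitch mixes, not FanGraphs values), and the layout of each row is an assumption chosen only to show the computation: one entropy per pitcher, then a mean per season and role.

```python
import math
from collections import defaultdict

def entropy(props):
    """Entropy (natural log) of pitch-type proportions."""
    props = list(props)
    total = sum(props)  # renormalize in case proportions do not sum to exactly 1
    return -sum((p / total) * math.log(p / total) for p in props if p > 0)

# HYPOTHETICAL rows: (pitcher, season, role, {pitch type: proportion}).
rows = [
    ("Starter A", 2016, "starter", {"FB": 0.50, "SL": 0.20, "CB": 0.15, "CH": 0.15}),
    ("Starter B", 2016, "starter", {"FB": 0.25, "SL": 0.25, "CB": 0.25, "CH": 0.25}),
    ("Reliever C", 2016, "reliever", {"FB": 0.80, "SL": 0.20}),
    ("Reliever D", 2016, "reliever", {"FB": 1.00}),
]

# Collect each pitcher's entropy under his (season, role) group.
groups = defaultdict(list)
for name, season, role, mix in rows:
    groups[(season, role)].append(entropy(mix.values()))

# Average entropy within each group.
avg_entropy = {key: sum(vals) / len(vals) for key, vals in groups.items()}

for key in sorted(avg_entropy):
    print(key, round(avg_entropy[key], 3))
```

In this toy data the one-pitch reliever has entropy 0 and the starter with four equally used pitches has entropy log(4), so the starter average comes out higher by construction.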

Here is a graph of the entropy against the fastball velocity for all the qualifying starters in 2016.   I’ve labeled some interesting points that deviate from the general pattern (there is a negative association between entropy and fastball velocity).

We see …

1. Dickey and Colon are interesting in that they both have slow fastballs and low entropy (low variety in pitch selection).  Dickey is not surprising (he throws primarily knuckleballs), but I am a bit surprised by Colon.
2. Jered Weaver does appear to compensate for a slow fastball with a high variety of pitch selection.
3. Fiers, Tanaka, Iwakuma, Shields, and Samardzija are all high-entropy pitchers; in contrast, Hendricks, Quintana, Happ, Nelson, and Sanchez are low-entropy pitchers who likely rely on a small number of pitch types.

### Establishing the Fastball

I am currently reading an interesting book, Off Speed by Terry McDermott, in which he talks about how important it is for the pitcher to establish the fastball early in the game.  That suggests that a pitcher’s pitch selection is a bit different in the early innings: he tends to throw a higher percentage of fastballs.

This is easy to confirm using PITCHf/x data.  For all the 2016 starters, I found the percentage of fastballs (counting four-seam fastballs, two-seam fastballs, and cutters) for each pitcher in each inning.  For each pitcher and inning I computed the ratio

RATIO = (percentage of fastballs in the inning) / (overall percentage of fastballs)
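Here is a minimal Python sketch of this ratio for a single pitcher. The function name `fastball_ratio_by_inning`, the input layout of (inning, pitch type) pairs, and the pitch-type labels in `FASTBALLS` are assumptions for illustration, and the `toy` pitch log is made up rather than taken from real data.

```python
from collections import Counter

# Assumed pitch-type labels for four-seam, two-seam, and cutter.
FASTBALLS = {"FF", "FT", "FC"}

def fastball_ratio_by_inning(pitches):
    """pitches: iterable of (inning, pitch_type) pairs for one pitcher."""
    total = Counter()  # all pitches thrown in each inning
    fast = Counter()   # fastballs thrown in each inning
    for inning, pitch_type in pitches:
        total[inning] += 1
        if pitch_type in FASTBALLS:
            fast[inning] += 1
    # Overall fastball fraction across all innings (assumed to be nonzero).
    overall = sum(fast.values()) / sum(total.values())
    return {inn: (fast[inn] / total[inn]) / overall for inn in sorted(total)}

# HYPOTHETICAL pitch log: 10 pitches in each of three innings.
toy = (
    [(1, "FF")] * 8 + [(1, "CU")] * 2
    + [(2, "FF")] * 6 + [(2, "CH")] * 4
    + [(3, "FT")] * 4 + [(3, "SL")] * 6
)
ratios = fastball_ratio_by_inning(toy)
for inning, ratio in ratios.items():
    print(inning, round(ratio, 2))
```

A ratio above 1 means the pitcher throws fastballs more often in that inning than he does overall; in the toy log the first inning is 80% fastballs against 60% overall, so its ratio is about 1.33.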

In the following graph, I plot the ratio for all pitchers against the inning.  This basically confirms the general belief about establishing the fastball early.  Although the pattern varies among pitchers, on average it seems that pitchers throw 10% more fastballs in the first inning than their overall rate (a ratio of about 1.1).  Pitchers also tend to throw a higher fraction of fastballs in the 2nd inning.  The variation between pitchers is interesting: Fernandez and Hendricks really like to establish the fastball in the 1st inning, while Dickey (not unexpected) and Volquez actually use the fastball less than average in the 1st inning.