# The Cheating Scandal

#### Introduction

Last Saturday, I attended a meeting of the Cleveland chapter of SABR. I always enjoy these SABR meetings, and this one was more interesting than usual because of all the discussion about the sign-stealing scandal in baseball. Some people are saying that this is perhaps a bigger scandal than the steroids scandal of the Bonds, McGwire, Sosa, et al. era. I agree. The purpose of this post is to provide some insight into the batter’s advantage when he has some information about the type of pitch that will be delivered. I’ll demonstrate this using a pitch type study of the Astros’ ace Justin Verlander.

#### Justin Verlander

Justin Verlander is a future Hall of Famer who pitches for the Astros. Verlander is well known for his four-seam fastball, which he throws over half of the time. But the chance that Verlander uses his fastball depends on the count, as shown in the following graph. He is very likely to use his fastball when he is behind in the count: the 2-0, 3-0, and 3-1 situations. On the other hand, when he is ahead in the count he is much less likely to use the fastball. (By the way, all of these graphs are based on Verlander data for the 2017, 2018, and 2019 seasons.)

#### Entropy

Generally, a batter is uncertain whether or not he will see a Verlander fastball. One can measure this degree of uncertainty using the notion of entropy. If P is the chance of an event (on a 0 to 1 scale), then the entropy of P is defined by

E = – P log(P) – (1 – P) log(1 – P)

When the chance P is 0.5, the entropy takes its highest value: if Verlander throws a fastball with chance 0.5, the batter really doesn’t know what to expect. It is like trying to predict the flip of a fair coin. But as P moves toward 0 or 1, the batter’s uncertainty decreases. If the batter is able to steal the sign and knows the pitch type to come, then the uncertainty (entropy) in the pitch type is reduced to zero. Here is a graph of the entropy of a Verlander fastball for all counts. Note that the entropy is low when the pitcher is behind in the count, since a fastball is then very likely.
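To make the entropy formula concrete, here is a minimal Python sketch that evaluates it for a few fastball probabilities. The function name `binary_entropy` and the probabilities in the loop are illustrative assumptions, not Verlander's actual rates. The post does not state the base of the logarithm, so the natural log is assumed here. A different base only rescales the values and does not change where the maximum falls.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy E = -p*log(p) - (1-p)*log(1-p) of an event with chance p.

    Assumption: natural logarithm (the base is not stated in the post).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    # At p = 0 or p = 1 the outcome is certain; p*log(p) tends to 0 there.
    if p == 0.0 or p == 1.0:
        return 0.0
    return -p * math.log(p) - (1 - p) * math.log(1 - p)


# Hypothetical fastball probabilities, chosen only to show the shape of the curve.
for p in (0.5, 0.7, 0.9, 1.0):
    print(f"P = {p:.1f}  entropy = {binary_entropy(p):.3f}")
```

With these inputs the entropy is largest at P = 0.5 (about 0.693 with natural logs), shrinks as P approaches 1, and is exactly zero at P = 1, which corresponds to a batter who knows the pitch in advance.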