New Baseball-Reference Look and Importing Data into R via the Clipboard

I don’t know if you have noticed, but Baseball-Reference has come out with a new web design. It seems to be a significant improvement: it has a clean look and works better on devices like smartphones and tablets. I think this new design is a good occasion to explain how to easily import data from Baseball-Reference into R.

Suppose, for example, that we want to bring in the season-to-season pitching data for the Hall of Fame pitcher Sandy Koufax. We locate his page by searching for Koufax.

In the Standard Pitching table, we select the “Share and more” menu.

If we choose the “Get table as CSV (for Excel)” option, we see the pitching data as text with commas separating fields:

We select the data, omitting the last two lines (which contain summary information), and copy it to the Clipboard.

In R, an attractive way to read in csv data is the read_csv() function in the readr package. This is not to be confused with read.csv(), which is available in base R.

Essentially, after loading the package with library(readr), you paste the data from the Clipboard inside the quotes.

Koufax <- read_csv(" ")

I do this below.

Koufax <- read_csv("Year,Age,Tm,Lg,W,L,W-L%,ERA,G,GS,GF,CG,SHO,SV,IP,H,R,ER,HR,BB,IBB,SO,HBP,BK,WP,BF,ERA+,FIP,WHIP,H9,HR9,BB9,SO9,SO/W,Awards
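Since the pasted table is long, here is a minimal sketch of the same pattern on a much smaller scale. The rows are made up for illustration and are not real statistics, and the names csv_text and toy are hypothetical. The sketch assumes readr 2.0 or later, where wrapping the string in I() marks it explicitly as literal data rather than a file path.

```r
library(readr)

# Made-up rows for illustration only; these are NOT real pitching statistics.
csv_text <- "Year,Age,W,L,ERA
2001,21,3,5,4.50
2002,22,10,7,3.25
2003,23,15,4,2.10"

# I() tells read_csv() to treat the string as data, not as a path to a file.
toy <- read_csv(I(csv_text), show_col_types = FALSE)
toy
```

The result is a tibble with one row per line of pasted text, and the numeric columns are parsed as numbers, so they can be plotted directly.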

To check that the data look reasonable, I’ll graph Koufax’s ERA values against Age. Koufax has a unique trajectory: after struggling for a few years, he was a remarkable pitcher for five years and, due to injury, had to retire at the peak of his career.

library(ggplot2)
ggplot(Koufax, aes(Age, ERA)) +
  geom_point() + geom_smooth()

I think this Clipboard method is an attractive way of importing data, especially for the introductory R user who wants to bring sports data into R quickly.

Late Addition

I thought it was worth mentioning that there is a different way of importing Clipboard data on a Macintosh: the pipe() function with the “pbpaste” command. Here I illustrate this with the Koufax data (assuming the Baseball-Reference csv data has been placed on the Clipboard).

Koufax <- read.table(file = pipe("pbpaste"),
                    sep = ",", header=TRUE)
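One thing to watch for with read.table(): by default (check.names = TRUE) it rewrites column names that are not syntactically valid in R, and the Baseball-Reference header contains several of these (W-L%, ERA+, SO/W). Below is a small sketch with a single made-up row, not real statistics. It uses read.csv() with the text argument, which shares the same default, only so that the example runs without a clipboard; the object names are hypothetical.

```r
# One made-up data row; only the header matters for this illustration.
csv_text <- "Year,W-L%,ERA+,SO/W\n2001,0.375,95,2.10"

# Default behaviour: invalid characters in names are replaced by dots.
df_checked <- read.csv(text = csv_text)
names(df_checked)
# "Year" "W.L." "ERA." "SO.W"

# check.names = FALSE keeps the header exactly as Baseball-Reference wrote it.
df_raw <- read.csv(text = csv_text, check.names = FALSE)
names(df_raw)
# "Year" "W-L%" "ERA+" "SO/W"
```

With the original names kept, such columns have to be quoted with backticks when referenced, for example df_raw$`W-L%`.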
