Blog code in GitHub repository

The code shown in this blog is now collected in our GitHub repository, which also contains the code used throughout the book.



9 responses

  1. I’m working through the book and having trouble getting large files like all1998.csv and all2011.csv to download from the GitHub repository. The raw link gives the error “Error: blob is too big”. Is there another way I can get those files? Thanks!

  2. Appendix A.1 of the book describes how to download play-by-play Retrosheet data for a particular season. We provide a function, parse.retrosheet.pbp, that downloads the individual files and combines them into a single file such as all1998.csv.

  3. Thank you for responding!

    I am now trying to go through Appendix A to create the files, but I’m having a problem. I have followed the directions, but when I enter source("parse.retrosheet.pbp.R") I get the following error:

    Error in file(filename, "r", encoding = encoding) :
    cannot open the connection
    In addition: Warning message:
    In file(filename, "r", encoding = encoding) :
    cannot open file ‘parse.retrosheet.pbp.R’: No such file or directory

    I have tried changing my working directory to C:/Retrosheets, C:/Retrosheets/download.folder, C:/Retrosheets/download.folder/unzipped, C:/Retrosheets/download.folder/zipped, and so on; all give the same error.

    Am I supposed to download parse.retrosheet.pbp.R from somewhere?

    Thank you!
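
The "No such file or directory" warning above means R looked in its current working directory and did not find the script there. A minimal sketch of how one might check this before calling source() follows; the helper name check_script is made up for illustration, and the file name is the one used in this thread.

```r
# Sketch: check what R can see before calling source().
# check_script() is a hypothetical helper, not part of the book's code.
script <- "parse.retrosheet.pbp.R"

check_script <- function(path) {
  if (file.exists(path)) {
    message("Found ", path)
    TRUE
  } else {
    # Show where R is looking and what it finds there.
    message("No ", path, " in ", getwd(), "; files here: ",
            paste(list.files(), collapse = ", "))
    FALSE
  }
}

# Only source the script when it is actually present.
if (check_script(script)) source(script)
```

If the file is reported missing, either move it into the directory printed by getwd() or call setwd() on the folder that holds it.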

  4. Isaac, did you try downloading the full GitHub repository? Just click the Download ZIP button (on the right side), then extract the contents of the downloaded zip into a folder of your choice. Then start with the _setWorkingDir.R file in the script folder.
    After that, everything should work fine.

  5. I’d like to thank you guys for being so helpful. I was able to copy the code for the parse R file from the book into Notepad and load it into R. Now I can get the “all” files easily. Back to working through the book!

    I think only one of you guys is American, but Happy Thanksgiving to you both!


  6. I had the same problem as Isaac (Nov 27, 2:05 post). I downloaded and unzipped the zip file, modified the _setWorkingDir.R script, and sourced it in R. R still cannot find the parse.retrosheet.pbp.R script. I noticed Isaac resorted to keying in the code. What am I doing wrong? I’m enjoying the book. Thanks for your help.

  7. Hi Craig,

    I’m as new to this as you are, but maybe this will help. The following are instructions I wrote for myself when I was having trouble so I would remember in the future:

    1. Create a Retrosheets directory

    2. Create a new folder ~\Retrosheets\download.folder

    3. Create 2 new folders – ~\Retrosheets\download.folder\unzipped and ~\Retrosheets\download.folder\zipped

    4. Create parse.retrosheet.pbp.R file
    a. Open new text document in a text editor
    b. Copy the code (NOTE: this is at the end of this post)
    c. Save as ~\Retrosheets\parse.retrosheet.pbp.R

    5. Go to [link not preserved] and download the latest version to the ~zipped directory.

    6. Unzip contents to ~unzipped folder. There should be 5 .exe files, the important one being cwevent.exe

    7. Open R (RStudio is preferred)

    8. Set working directory (setwd) to the ~Retrosheets directory

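    9. Load the R script by entering source("parse.retrosheet.pbp.R") in R

    10. Run the script for the desired year by entering parse.retrosheet.pbp(season) in R, for example parse.retrosheet.pbp(1998)

    11. 2 new files will appear in the unzipped directory, each named with the season
    a. all<season>.csv (for example, all1998.csv)
    b. roster<season>.csv (for example, roster1998.csv)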
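
Steps 1 to 3 above can also be done from within R. The following is only a sketch under stated assumptions: make_retrosheet_dirs is a hypothetical helper name, and the root is placed under tempdir() so the example runs anywhere; swap in your own Retrosheets path.

```r
# Sketch of steps 1-3: build the folder layout under a root directory.
# make_retrosheet_dirs() is a hypothetical helper, not from the book.
make_retrosheet_dirs <- function(root) {
  dirs <- file.path(root, "download.folder", c("zipped", "unzipped"))
  # recursive = TRUE also creates root and download.folder if needed.
  for (d in dirs) dir.create(d, recursive = TRUE, showWarnings = FALSE)
  invisible(dirs)
}

root <- file.path(tempdir(), "Retrosheets")  # assumption: replace with your own path
make_retrosheet_dirs(root)

# Confirm the layout that the script expects.
list.files(root, recursive = TRUE, include.dirs = TRUE)
```

After this, setwd(root) would correspond to step 8, and the script file from step 4 would be saved directly inside root.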

    parse.retrosheet.pbp.R code:

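    parse.retrosheet.pbp <- function(season){
      # Download the season's zip file into download.folder/zipped.
      # The Retrosheet URL and the zip file name were stripped from this post;
      # the empty strings mark where they belong (see Appendix A.1 of the book).
      download.retrosheet <- function(season){
        download.file(
          url=paste("", season, "", sep=""),
          destfile=paste("download.folder", "/zipped/", season, "", sep=""))
      }

      # Unzip the downloaded file into download.folder/unzipped.
      unzip.retrosheet <- function(season){
        unzip(paste("download.folder", "/zipped/", season, "", sep=""),
              exdir=paste("download.folder", "/unzipped", sep=""))
      }

      # Write all<year>.csv. Most of this function was stripped from the post;
      # only the fragment in the comment below survives (see Appendix A.1).
      create.csv.file <- function(year){
        wd <- getwd()
        # ... all", year, ".csv", sep="")))
      }

      # Combine the team roster files (*.ROS) into roster<year>.csv.
      create.csv.roster <- function(year){
        filenames <- list.files(path = "download.folder/unzipped/")
        filenames.roster <-
          subset(filenames, substr(filenames, 4, 11) == paste(year, ".ROS", sep=""))
        read.csv2 <- function(file)
          read.csv(paste("download.folder/unzipped/", file, sep=""), header=FALSE)
        R <- do.call("rbind", lapply(filenames.roster, read.csv2))
        names(R)[1 : 6] <- c("Player.ID", "Last.Name", "First.Name",
                             "Bats", "Pitches", "Team")
        wd <- getwd()
        setwd("download.folder/unzipped")
        write.csv(R, file=paste("roster", year, ".csv", sep=""))
        setwd(wd)
      }

      # Delete the raw Retrosheet files (shell() is Windows-only).
      cleanup <- function(){
        wd <- getwd()
        setwd("download.folder/unzipped")
        shell("del *.EVN")
        shell("del *.EVA")
        shell("del *.ROS")
        shell("del TEAM*")
        setwd(wd)
      }

      download.retrosheet(season)
      unzip.retrosheet(season)
      create.csv.file(season)
      create.csv.roster(season)
      cleanup()
    }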


  8. This is so frustrating. I followed all of these instructions perfectly and now I get: Error: could not find function "parse.retrosheet.pbp". Can SOMEONE please post usable advice? Yes, I am in the correct working directory. I can SEE the function in the window.

    1. Eric, have you sourced the function into R before you use it? That is, type

      source("parse.retrosheet.pbp.R")

      This should work if that file is in the current working directory.
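
Seeing a script open in an editor window does not define its functions; they exist only after source() has run without error. If source() stops on a syntax error in the file, nothing in the file gets defined, which could also produce the "could not find function" message. A small sketch of the check, using a throwaway script and a made-up function name (demo.function) in place of parse.retrosheet.pbp.R:

```r
# Sketch: source a script, then confirm the function it defines is visible.
# demo.function and its script are stand-ins invented for this example.
script <- file.path(tempdir(), "demo.function.R")
writeLines("demo.function <- function(season) paste('season', season)", script)

exists("demo.function")   # is the function defined yet?
source(script)            # runs the file and defines demo.function
exists("demo.function")   # should now be TRUE
demo.function(1998)
```

The same exists("parse.retrosheet.pbp") check, run right after sourcing the real script, would show whether the function actually made it into the session.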
