A couple of days ago I was able to attend the Carnegie Mellon University Sports Analytics Conference in Pittsburgh. It was a great, well-organized meeting that included a variety of talks on different sports by people from CMU and by outside speakers. I thought I’d use this week’s post to highlight several presentations that I found particularly interesting.
Brian Burke’s Keynote Address: Brian Burke, a senior sports analyst at ESPN, is well known for his early use of win probabilities in football. He gave an interesting talk describing his career, from officer and aviator in the U.S. Navy to his current position at ESPN. He emphasized that the E in ESPN stands for entertainment, and that one always has to view one’s work in sports analytics from an entertainment perspective.
Since I’m mentioning win probabilities, I have to share (from Baseball-Reference’s game page) the win probability graph from last night’s wild Game 5 between the Astros and the Dodgers. There were five specific plays that each changed the win probability by more than 25%, and the walk-off single by Houston’s Alex Bregman changed the win probability by 39%!
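To make the idea of a "big swing" concrete, here is a minimal Python sketch of how one might flag such plays from a sequence of win probabilities. The function name `big_swings`, the 0.25 threshold argument, and the numbers in `example_wp` are all hypothetical and chosen for illustration; they are not the actual Game 5 values, and this is not how Baseball-Reference computes its graph.

```python
from typing import List, Tuple


def big_swings(win_probs: List[float], threshold: float = 0.25) -> List[Tuple[int, float]]:
    """Return (play_index, change) for each play whose absolute
    change in win probability exceeds the threshold.

    win_probs[0] is the win probability before the first play, and
    win_probs[i] is the win probability after play i.
    """
    swings = []
    for i in range(1, len(win_probs)):
        change = win_probs[i] - win_probs[i - 1]
        if abs(change) > threshold:
            swings.append((i, round(change, 2)))
    return swings


# Hypothetical win probabilities for one team; NOT real game data.
example_wp = [0.50, 0.45, 0.75, 0.70, 0.30, 0.61, 1.00]
swings = big_swings(example_wp)
for play, change in swings:
    print(f"Play {play}: win probability changed by {change:+.0%}")
```

In this made-up sequence the last play ends the game, so the win probability jumps from 0.61 to 1.00, a 39-point swing like the one on a walk-off hit.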
Ron Yurko’s Talk on nflWAR. Ron is a doctoral student in Statistics at CMU and has also been active in the Carnegie Mellon Sports Analytics Club. Ron gave a very interesting talk that essentially applies the baseball openWAR approach to football. I am especially interested in his use of multilevel models to measure the contributions of individual offensive players in the NFL. Football is more challenging than baseball from an analytics perspective, since offensive outcomes such as rushing and passing plays are clearly the result of the efforts of several players. In baseball, we have a similar problem in understanding how much of the runs allowed is due to the pitcher and how much is due to the fielders.
Luke Bornn’s Talk on Mapping NBA Strategies. Luke works for the Sacramento Kings and described interesting research on categorizing offensive plays in the NBA; he calls this machine learning algorithm “Possession Sketches”. In text mining, there are established methods for collecting, exploring, and categorizing collections of text, and Luke is trying to use these methods to categorize descriptions of offensive plays. He used video clips effectively to illustrate his methods. I think this project is potentially very useful for coaches in getting a better understanding of how players contribute to scoring plays.
Rob Engel’s Talk on StatCast. Rob works for MLB Advanced Media and has been involved in building MLBAM’s baseball data pipeline. He provided an overview of the types of data they collect, the equipment installed in all MLB parks, and how this data is used by teams, broadcasters, and the public (in applications such as MLB At Bat). I wondered whether MLBAM has reached the limit of what can be measured on a baseball field, and the answer is no. For example, they currently measure only the location of the middle of a player’s position, and they would like to know more about the location of the feet, which might be helpful in understanding how runners move around the bases.
A Panel Discussion. There was an interesting panel discussion with four people who do sports work in the Pittsburgh area: Bob Cook, a former pitcher who does analytics work for the Pirates; Buddy Clark, a professor of mechanical engineering and materials science at the University of Pittsburgh who works for Diamond Kinetics; Karim Kassam, who has a background in electrical engineering and computer science and does analytics work for the Steelers; and Sam Ventura, a recent PhD in Statistics who is the director of hockey research for the Penguins. Here are some highlights from their discussion:
Question to Panel: Advice for students wanting to work in the sports analytics field? Look for opportunities and just dive in and do it. It is helpful if you know a programming language. It is important to know SQL (the database query language). Read blogs and other articles so you are knowledgeable about the field. Produce your own analytics work so you can showcase what you can do. Teams are interested in people who have the passion and skills to do useful work.
Question to Panel: What is the hardest part of your job? Communicating with people who have different backgrounds. Putting data into different forms, and trying to make the process of working with data as efficient as possible. Prioritizing what to work on. Writing good, well-documented code for your work.
The impression I got from this panel discussion was that a sports analyst has to work with many different people on a team, including the front office, coaches, and players. There can be tension between coaches, players, and analysts, which creates challenges in doing one’s job. One’s ability to communicate, that is, to express statistical ideas in helpful ways, is very important. Teams (especially the ones outside of baseball) don’t have large analytics groups, so an analyst will typically have many responsibilities.
I could say more about the meeting, but generally I left being very impressed with the activity at CMU — they have a relatively large sports analytics club for students with support from the Statistics department. It would be great if Bowling Green could have a similar student sports analytics club. I believe this CMU meeting will be a yearly event and I am already interested in attending the 2018 meeting next fall.
Added Nov 4, 2017:
Related to the panel discussion, I just saw an interesting article about Getting (a job) into Sports Analytics which makes a lot of good points.