I'm a baseball fan living in New York City. In between long tirades about the New York Yankees and the national pastime in general, I'm a graphic designer.
With slow news days in the baseball world lately (the Yanks are spending money; an idiot ump is on the loose) and plenty of chaos at my j-o-b, I've been retreating to the serenity of my spreadsheets. It looks as though I'll be spending a bit more time there, as I've taken it upon myself to run all of the 2002 Major League pitching statistics through my DIPS 2.0 spreadsheet.
I don't take credit for the DIPS (Defense Independent Pitching Statistic) system. It was invented by a man named Voros McCracken, and he's presented DIPS numbers for the 1999-2001 seasons via his web site while explaining the system on Baseball Primer and Baseball Prospectus. The gist of it is that McCracken did some studies on pitching statistics involving balls in play and concluded that major-league pitchers do not differ greatly in their ability to prevent hits on those balls hit into play (that is, anything that's not a home run, a strikeout, a walk or a hit-by-pitch). The rate at which a pitcher allows hits on balls in play is due more to the defense playing behind him than to his own skill, and can vary greatly from year to year.
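For the spreadsheet-averse, here's a rough sketch in Python of what "hit rate on balls in play" means if you take the definition above literally: start with batters faced, subtract the homers, strikeouts, walks and hit-by-pitches, and see how many of the remaining plate appearances went for non-homer hits. The function names and the sample line are my own inventions for illustration, and this is not necessarily the exact bookkeeping McCracken's formulas use.

```python
def balls_in_play(bfp, hr, so, bb, hbp):
    """Batters faced minus the four outcomes the defense never touches.

    Assumes bb is total walks (intentional ones included).
    """
    return bfp - hr - so - bb - hbp


def hit_rate_on_balls_in_play(h, bfp, hr, so, bb, hbp):
    """Share of balls in play that fell for hits (homers excluded)."""
    bip = balls_in_play(bfp, hr, so, bb, hbp)
    if bip <= 0:
        raise ValueError("no balls in play to compute a rate from")
    return (h - hr) / bip


# A made-up pitcher line, not real 2002 data.
sample = dict(h=200, bfp=900, hr=20, so=180, bb=60, hbp=5)
rate = hit_rate_on_balls_in_play(**sample)
print(f"Hit rate on balls in play: {rate:.3f}")
```

McCracken's finding, in these terms, is that `rate` bounces around from year to year for the same pitcher far more than his strikeout, walk and home run rates do.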
This is somewhat counterintuitive, but it's also a very helpful way of looking at pitching stats. DIPS takes the elements of a pitcher's record that are not affected by the defense -- walks, strikeouts, hit-by-pitches, homers -- and places them in a neutral context for park, league and defense. The result is a translated line of Defense Independent Pitching Statistics, including a DIPS ERA; that is, an ERA based on defense-independent pitching performance. An important thing about this DIPS ERA, McCracken found, is that it correlates better with the following season's ERA than the pitcher's actual ERA does.
For one reason or another, Voros decided not to publish DIPS numbers this year, leaving a sizeable void in the sabermetric universe. But he's already published fairly coherent instructions on how to calculate DIPS (and he encouragingly answered questions about some of the less coherent aspects of it), so I built a spreadsheet that would do the job. I used it for a few pieces about the Yankee pitchers and this year's crop of relievers, figuring the sheet would give me a jump in the analysis department but that it was only a matter of time before somebody published complete DIPS for 2002, and more power to them.
Insert sound of crickets chirping.
Nobody's done so, including myself -- mainly because I was never able to get my hands on the raw data in a spreadsheet. But via a rather mundane Primer thread, I managed to find somebody ("mathteamcoach" is his handle) who had most of what I needed. We've joined forces to share the tedium of entering Intentional Base on Balls and Batters Faced Pitching data for EVERY SINGLE PITCHER in the service of this project. It's a dirty job but somebody's got to do it, and between the two of us we're about 2/3 done. The results should be finished later this week.