The Futility Infielder

A Baseball Journal by Jay Jaffe

I'm a baseball fan living in New York City. In between long tirades about the New York Yankees and the national pastime in general, I'm a graphic designer.

Monday, August 31, 2009

 

The Mad MVP Scientist

This past week, I toiled in my laboratory attempting to build an MVP predictor (ESPN Insider part 1 / part 2) based upon past results, one that might lend some insight into who would win this year.
As the recent scrum between supporters of the candidacies of Joe Mauer and Mark Teixeira reminds us, nearly every Most Valuable Player award is capable of producing controversy. Not only do the Baseball Writers' Association of America voters rarely elect the player who's worth the most wins to his team via some objective formula, they appear to shift the standards from year to year, instead constructing narratives to fit whatever loosely gathered facts are at hand. Particularly in recent years, defensive value is often minimized or entirely ignored in favor of heavy hitters with big Triple Crown stats, almost invariably from successful teams.

The question is whether the voters' behavior can be predicted. Towards that end, I was tasked with building an MVP predictor in the spirit of a system such as Bill James' Hall of Fame Monitor, one that awards points for various levels of achievement in an attempt to identify who will win, as opposed to who should win. My initial bursts of enthusiasm for the assignment were soon followed by endless hours of cowering in the fetal position before a massive spreadsheet, but in the end I emerged with a system — Jaffe's Ugly MVP Predictor (JUMP) — which correctly identified 14 of the 28 winners during the Wild Card era (1995 onward), and put 27 of those winners among the league's top three in its point totals.

I limited the scope of the system to that post-strike timeframe for three main reasons: none of the 28 winners were pitchers, only one (Alex Rodriguez in 2003) played for a team that finished below .500, and 22 of them played on teams that qualified for the expanded postseason — extremely strong tendencies that could help separate seemingly equal candidates. Instead of focusing on round-numbered benchmarks like James did (a .300 batting average, 100 RBI), I chose to dispense with actual stat totals and rates and focus on league rankings among batting title qualifiers (3.1 plate appearances per game) in 12 key offensive categories...
So anyway, I built a point system which rewarded top 10 placement in 12 categories (a few of which, OBP and hits among others, turned out to be insignificant in predicting voter behavior) and added a very strong team success component which could be worth more than two or three category leads. Then I gerrymandered the hell out of the thing to increase the number of successful hits and top threes, the latter a concession to the fact that at some point subjective elements take over for a number of voters. My maneuvers included positional bonuses for middle infielders, a penalty for being primarily a DH, a penalty for playing for the Rockies, and fractional weighting for a couple of categories, moves which through endless, tedious trial and error increased the system's accuracy bit by bit.
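
For readers who think better in code, here is a rough sketch of how a rank-based point system of this shape might be wired together. It is not the actual JUMP formula: every weight, bonus and penalty value below, the 10-down-to-1 point scale, the four categories shown, and the names `Candidate`, `category_points` and `jump_style_score` are assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical category weights. The real system ranks 12 categories and gives
# a couple of them fractional weights, but the actual values are not published here.
CATEGORY_WEIGHTS = {"HR": 1.0, "RBI": 1.0, "SLG": 1.0, "AVG": 0.5}

# Hypothetical bonus and penalty sizes, chosen only so that the team component
# outweighs two category leads (2 x 10 points).
PLAYOFF_BONUS = 25
WINNING_TEAM_BONUS = 10
MIDDLE_INFIELD_BONUS = 5
DH_PENALTY = 5
ROCKIES_PENALTY = 5


@dataclass
class Candidate:
    name: str
    ranks: dict                      # category -> league rank among qualifiers (1 = leader)
    position: str = "1B"
    team: str = ""
    team_made_playoffs: bool = False
    team_above_500: bool = False


def category_points(rank):
    """Assumed scale: 10 points for leading the league down to 1 for tenth place."""
    return 11 - rank if 1 <= rank <= 10 else 0


def jump_style_score(c):
    """Sum weighted top-10 points, then apply team, position and park adjustments."""
    score = sum(
        weight * category_points(c.ranks.get(category, 99))
        for category, weight in CATEGORY_WEIGHTS.items()
    )
    if c.team_made_playoffs:
        score += PLAYOFF_BONUS
    elif c.team_above_500:
        score += WINNING_TEAM_BONUS
    if c.position in {"2B", "SS"}:
        score += MIDDLE_INFIELD_BONUS
    if c.position == "DH":
        score -= DH_PENALTY
    if c.team == "COL":
        score -= ROCKIES_PENALTY
    return score


# Two made-up candidates: a slugger on a playoff team and a DH in Colorado.
slugger = Candidate("Player A", {"HR": 1, "RBI": 2, "SLG": 6}, team_made_playoffs=True)
coors_dh = Candidate("Player B", {"AVG": 1, "SLG": 3}, position="DH", team="COL")
for player in sorted([slugger, coors_dh], key=jump_style_score, reverse=True):
    print(player.name, jump_style_score(player))
```

Working from ranks rather than raw totals means a 40-homer season counts the same whether it came in a high-offense or low-offense year, which fits the decision to skip round-numbered benchmarks.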

Here's how the actual award winners fared in JUMP, along with the players it flagged as the likely winners in years where they differed from the voting:
Year  AL Winner          Rank  System Winner
1995  Mo Vaughn          3     Albert Belle
1996  Juan Gonzalez      2     Albert Belle
1997  Ken Griffey        1
1998  Juan Gonzalez      1
1999  Ivan Rodriguez     10    Manny Ramirez
2000  Jason Giambi       1
2001  Ichiro Suzuki      2     Bret Boone
2002  Miguel Tejada      2     Alfonso Soriano
2003  Alex Rodriguez     1
2004  Vladimir Guerrero  1
2005  Alex Rodriguez     1
2006  Justin Morneau     3     Derek Jeter
2007  Alex Rodriguez     1
2008  Dustin Pedroia     1

Year  NL Winner          Rank  System Winner
1995  Barry Larkin       3     Dante Bichette
1996  Ken Caminiti       1
1997  Larry Walker       2     Jeff Bagwell
1998  Sammy Sosa         1
1999  Chipper Jones      1
2000  Jeff Kent          3     Barry Bonds
2001  Barry Bonds        3     Sammy Sosa
2002  Barry Bonds        1
2003  Barry Bonds        1
2004  Barry Bonds        3     Albert Pujols
2005  Albert Pujols      1
2006  Ryan Howard        2     Albert Pujols
2007  Jimmy Rollins      3     Matt Holliday
2008  Albert Pujols      2     Ryan Howard
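
As a quick sanity check, the Rank columns in the two tables can be tallied to reproduce the hit rate quoted earlier. The snippet below is only a convenience sketch; the list and function names (`AL_RANKS`, `NL_RANKS`, `tally`) are mine, and the numbers are copied straight from the tables.

```python
# JUMP rank of each actual MVP winner, 1995 through 2008, copied from the tables.
AL_RANKS = [3, 2, 1, 1, 10, 1, 2, 2, 1, 1, 1, 3, 1, 1]
NL_RANKS = [3, 1, 2, 1, 1, 3, 3, 1, 1, 3, 1, 2, 3, 2]


def tally(ranks):
    """Return (winners ranked first, winners ranked in the top three)."""
    hits = sum(1 for r in ranks if r == 1)
    top_three = sum(1 for r in ranks if r <= 3)
    return hits, top_three


all_ranks = AL_RANKS + NL_RANKS
hits, top_three = tally(all_ranks)
print(f"{hits} of {len(all_ranks)} winners ranked first")                 # 14 of 28
print(f"{top_three} of {len(all_ranks)} winners ranked in the top three")  # 27 of 28
```

The system does a bit better in the AL (8 hits in 14 seasons) than in the NL (6 in 14), and the lone winner outside the top three is the 1999 AL pick.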
Ivan Rodriguez's 1999 victory, which still chafes my ass a decade on because Derek Jeter had a monster year (.349/.438/.552 with 24 homers, 134 runs and 102 RBI, all career highs), is the system's big outlier, and Rodriguez is also the only catcher who won during this era. That bodes poorly for Mauer, who as it is doesn't rank in the top 10 in any counting stat category and plays for a team unlikely to make the playoffs; he ranked just 28th when I ran the numbers on Sunday, and even his team's win to get right back to .500 only pushes him to 15th. Mind you, this isn't a prediction that Mauer would finish 15th in the voting, or that he deserves to; as Mae West famously said, "Goodness has nothing to do with it." Basically what JUMP is saying is that history tells us that unless Mauer scores in the league's top three in the system, he's got no chance of actually winning the award. Meanwhile, "Golden Boy" Teixeira leads the AL rankings thanks to running first in RBI, second in homers, and sixth in slugging while playing for a playoff-bound team.

In all, it was a fun and satisfying project. I've got a few ideas that might increase its accuracy a hair, and I'll revisit the topic if they turn out to be worthwhile.
