The Futility Infielder

A Baseball Journal by Jay Jaffe

I'm a baseball fan living in New York City. In between long tirades about the New York Yankees and the national pastime in general, I'm a graphic designer.

Tuesday, May 11, 2004

 

Productive Outs, Revisited

The day after I visited the Productive Outs issue, Larry Mahnken published his own debunking of this nouveau statistic at The Hardball Times. Mahnken did a good bit of research on the issue; most importantly, he highlighted the fact that ESPN's Buster Olney didn't even define the stat correctly, something which everybody else (myself included) missed.

Mahnken's piece starts by taking issue with the way Olney presented his figures in the article, and he gets in several good points along the way:
Olney provides very little data, period, and what data he does provide is presented in a manner which will make the non-skeptical reader believe it supports him. The rate of productive outs is given for only 12 teams this season, the top six and bottom six. The top includes some teams that have surprised thus far, the bottom includes teams that have disappointed. The implication being that making or not making productive outs is the cause of their success or failure.

The only "Productive Out Percentage" numbers given for past years are the POP numbers for the Florida Marlins and Anaheim Angels last season, both of whom ranked in the top five. The implication is, of course, that making productive outs is the reason these teams won the last two World Series (over teams that currently rank in the bottom five).

Ignored is the fact that Florida's POP during the regular season last year is not particularly relevant to their postseason success, and that Anaheim's POP last season, when they finished 77-85, is not even close to being relevant to their postseason success in 2002.

It's clear that Olney did very little research for his article, and what research he did do was data mining, trying to find stats that supported his claims.

Because the data is compiled by the Elias "You'll Know What We Want You To Know" Sports Bureau, productive out data is impossible to find, making an independent study of regular season productive outs almost impossible. However, for the sake of discovering and spreading truth, rather than dogma, I did an independent study of the past two postseasons using the game logs available at Retrosheet. The study was long and tedious, but I believe the results were worth it.
Recall that in Olney's recent article, he defined Productive Outs as when:

* A baserunner advances with the first out of an inning.
* A pitcher sacrifices with one out.
* A baserunner is driven home with the second out of an inning.

Productive Out Percentage was described as "productive outs divided by the total number of outs." But as Mahnken points out, this is incorrect. The numbers offered for POP aren't based on dividing by total outs, nor on dividing by productive out situations. Rather, it's productive outs divided by outs made in productive out opportunities. Commented Mahnken at the newly reconstituted Baseball Primer: "As for dividing by outs in opportunities, it's the only way the stat is similar to what Olney listed. If you divide it by all outs, then the Marlins had something like a .090 POP last postseason."

Bless the patient soul that sits somewhere under Mahnken's Yankee cap, because Larry and I swapped emails three times before the subtlety of the difference between "productive out situations" and "outs made in productive out situations" finally sank into my thick skull. Using the aforementioned definitions of a productive out as p1, p2, and p3, here is the formula, along with a very similar one to which Mahnken refers in his article, the rate of productive outs (RPO). Larry did an excellent job of spelling out the difference between the two, so I'll quote his email:
POP = [productive out (p1+p2+p3)]/[outs in productive out situations]

RPO = [productive out (p1+p2+p3)]/[productive out situations (p1+p2+p3)]

If there's a runner on base with no out, that's a productive out situation. If the batter makes an out and the runner doesn't advance, his POP is .000. If he makes an out and the runner advances, his POP is 1.000. If he doesn't make an out, his POP is .---, because he made no outs.

If the batter has ten productive out situations and makes four outs, two of them productive, his POP is .500 [2/4] , his RPO is .200 [2/10].
It should be pointed out that the hypothetical batter above would have a .600 OBP in those productive out situations, at which point no respectable analyst alive would give a rat's ass about his Productive Out Percentage. But the difference in the formula is important. POP is the answer to the question, "When this guy makes an out, how often is it a productive out?" but not, "How often does this guy make a productive out in a situation in which a productive out can be made?" Those are two different numbers. Looking back at it in plain English, I can see why that definition would be preferable to that of RPO, but the latter is the formula I'd assumed was being talked about when I first read the article. That Olney and whoever's editing him at ESPN couldn't even bother to define it correctly still galls me; that I didn't check it more closely for myself galls me only a bit less.
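For anyone who'd rather see the distinction as arithmetic, here is a minimal Python sketch of the two formulas as Larry's email states them. The function names `pop` and `rpo` and the use of `None` for the ".---" case are my own choices for illustration, not anything Elias or ESPN has published.

```python
from typing import Optional


def pop(productive_outs: int, outs_in_situations: int) -> Optional[float]:
    """Productive outs divided by outs MADE in productive out situations."""
    if outs_in_situations == 0:
        return None  # the ".---" case: no outs made, so POP is undefined
    return productive_outs / outs_in_situations


def rpo(productive_outs: int, situations: int) -> Optional[float]:
    """Productive outs divided by ALL productive out situations."""
    if situations == 0:
        return None  # no situations at all, so the rate is undefined
    return productive_outs / situations


# The hypothetical batter from the email: ten situations, four outs, two productive.
situations, outs, productive = 10, 4, 2

print(pop(productive, outs))        # 0.5
print(rpo(productive, situations))  # 0.2

# On-base percentage in those situations, treating every non-out as reaching base.
obp = (situations - outs) / situations
print(obp)                          # 0.6
```

Same batter, same ten trips to the plate, and two very different-looking numbers depending on which denominator you pick.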

In the Hardball Times article, Mahnken goes on to point out that while eventual champs Anaheim and Florida did well in POP in the postseasons in which they won, both ranked third among the eight playoff teams in their respective campaigns. Many teams with higher postseason POPs went home earlier than the champs. A much better indicator of team success for the past two postseasons (72 games in all) that Mahnken found was on-base percentage in productive out situations (in other words, getting a hit or a walk instead of making a productive out). OBP in those situations had a .750 correlation with winning percentage, compared to .463 for POP and .283 for RPO.

Furthermore, the overall correlation of POP to winning percentage in the postseason sample was very low compared to more familiar overall indicators (that is, not just in productive out situations):
OBP: .841
SLG: .855
OPS: .874
POP: .463
In the words of Eric Cartman, "Dude, that is f---ing weak."
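For readers unfamiliar with what a figure like .463 or .874 represents, here is a rough sketch of a correlation calculation. It assumes the numbers above are ordinary Pearson correlation coefficients, which is my assumption and not something stated in the material quoted here. The function name `pearson_r` is hypothetical, and the sample at the bottom is a toy sanity check, not Mahnken's postseason data.

```python
from math import sqrt
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples of at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-movement of the two samples around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample has no variance")
    return cov / sqrt(var_x * var_y)


# Toy sanity check only: a perfectly linear relationship correlates at 1.0.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
```

In a real study, one list would hold each team's stat (POP, OBP and so on) and the other its winning percentage, one pair per team.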

Mahnken concludes his article by trying to point out the fallacy of some offhand comments that Yankee announcers Jim Kaat and Paul O'Neill had been making about the current Yankee team. Olney wrote:
As club broadcasters Jim Kaat and Paul O'Neill noted last weekend, the team's offense is built much differently than in the championship years; in those seasons, the Yankees advanced runners, put runners in motion, bunted occasionally. While they didn't always have an overpowering offense -- the notable exception being the 125-win season of 1998 -- they had an efficient offense that provided the team's typically strong pitching enough runs to win.
Over the past two postseasons (one first-round loss, one trip to the World Series), the Yanks had a POP of .310, while the 1998-2000 teams (all of which featured O'Neill and ended in dogpiles on the pitcher's mound) had a postseason POP of .268. While at first glance this seems worthy of a smirk at Kaat, O'Neill and Olney's expense, Mahnken himself already reminded us that postseason POPs weren't especially relevant to regular-season POPs; in this case, the trio has been harping on some heretofore unreported high regular-season POPs of the Yankee teams of yesteryear and comparing them to the current Yankee lineup, and we've only got a tiny, now-outdated sample of this year's model to go on. Hey ESPN, when are you going to update that chart now that the Yanks have started winning?

As tempting as it is to declare total victory over Olney's ignorant piece, the sample size issue is still something of a fly in the ointment. Seventy-two postseason games is less than half of one team's full-season schedule. A full season's worth of data for thirty teams would yield much more substantial (if not necessarily more conclusive) results, as would a full study which included the 142 playoff series since 1969 (the sample of which was the basis of Olney's postseason postmortem last November). As somebody who's basked in the rainbow of tedium when it comes to baseball research, I can tell you that Mahnken has done an admirable job of slogging through the play-by-play results thus far, but a larger-scale approach is needed to debunk the stat further.

On to some other points I'd like to make on the issue...

One of the more interesting criticisms I received, both here and at Primer, comes from one Nod Narb, who wrote:
Lots of well deserved criticism here. I agree with it all. However, I can't help but think that you haven't looked at Wilkins' BP study with the same critical eye. I know a number of studies just like Wilkins' have shown that Ks aren't detrimental to run scoring, but it's a flawed analysis. Not to get into it too much here, but you can't look at post-hoc outcomes, you also need to consider the other possibilities of balls in play. While a ball in play may lead to a double play, it may just turn into a regular out, it may fall in for a hit, or it may be booted...

If you're going to be so critical of articles by people who oppose sabermetrics, at least treat sabermetric articles with the same critical perspective.
First of all, I chose to focus on what the writer refers to as "post-hoc outcomes" rather than a more game-theory oriented approach because my interest in the stat was whether it had any predictive value on a large scale with regard to scoring runs, not on a micro level trying to divine what the batter's intent may have been. I chose Wilkins' study on strikeouts primarily because of its immediate accessibility rather than its air-tightness. I don't have the data facility to replicate the BP study, but they do this kind of stuff routinely and have staked a small empire on their ability to do so accurately. I won't give them a free pass, but given the scrutiny which the group's work receives internally, I have less reason to suspect that they've erred on the level of Olney's incorrect definition.

Regardless, looking at the post-hoc correlation of strikeouts to runs scored is only one way of looking at the matter. Another way of looking at it is to compare the value of a strikeout to that of a non-strikeout. For that I've turned to Tangotiger's estimable work on run estimation (the "How Runs are Really Created" series), which is a bit tricky to find given Baseball Primer's transitional dust -- it's in the Google cache, minus the graphics. In the first installment, Tangotiger's computation based on Retrosheet data from 1974-1990 puts the marginal value of a strikeout at -.269 runs, that of any out at -.265 runs -- not a huge difference, but a slight disadvantage to strikeouts if we're trying to predict the total number of runs. Across the broad range of 24 base-out combinations, a strikeout does slightly lower your run expectancy. Grounding into a double play, of course, is much more detrimental; in the comments section of that article, Tangotiger notes that the value of a GIDP is "about -.45 runs". Why the inexactitude given his propensity for precision, I'm not sure.
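To put that gap in perspective, here is a small back-of-the-envelope sketch in Python using only the three run values quoted above. The helper name `extra_cost` and the figure of 100 strikeouts are hypothetical, chosen purely for illustration.

```python
# Marginal run values quoted above from Tangotiger's 1974-1990 computation.
RUNS_PER_STRIKEOUT = -0.269
RUNS_PER_OUT = -0.265   # any out
RUNS_PER_GIDP = -0.45   # "about -.45", per his comment, so treat as approximate


def extra_cost(event_value: float, count: int = 1,
               baseline: float = RUNS_PER_OUT) -> float:
    """Runs lost relative to making a generic out the same number of times."""
    return (event_value - baseline) * count


# One strikeout instead of a generic out.
print(round(extra_cost(RUNS_PER_STRIKEOUT), 3))             # -0.004

# A hypothetical 100 strikeouts instead of 100 generic outs.
print(round(extra_cost(RUNS_PER_STRIKEOUT, count=100), 1))  # -0.4

# One double play versus one generic out. A rough yardstick only,
# since a GIDP uses up two outs on a single play.
print(round(extra_cost(RUNS_PER_GIDP), 3))                  # -0.185
```

By these values, the strikeout penalty works out to four thousandths of a run apiece, which is why it takes a great many of them to register on the scoreboard.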

Elsewhere within that article is a chart which has some additional relevance to the situation at hand. As Earl Weaver's Fifth Law goes: "If you play for one run, that's all you'll get." Using the run expectancy matrix in my last piece, I showed that the total number of runs expected in particular base-out situations, such as moving a runner from first to second with the first out, goes down. But the chance of scoring a single run, according to the data supplied by Tangotiger, actually increases:
Chance of scoring, from each base/out state

        0 outs    1 out    2 outs
1B       .38       .25      .12
2B       .61       .41      .21
3B       .86       .68      .29
So the runner who moves from first to second with the first out has a slightly higher chance of scoring (41% as opposed to 38%), even while the total run expectancy for the inning drops from .953 runs to .725. The runner moving from second to third on the first out has increased his chance of scoring to 68% from 61% even while the total run expectancy for the inning goes from 1.189 to 0.983. Note that moving a runner from second to third with the second out drastically decreases his chance of scoring, from 41% to 29%. Still, as there are times when a one-run strategy may be preferable -- to tie or win a game in the bottom of the ninth, or perhaps to get an early run on the board against a stingy pitcher -- advancing the runner with the first out will increase his chances of scoring. One run you want, one run you may get.
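The tradeoff can be laid out side by side in a short Python sketch. The scoring chances are the nine values from the table, and the run expectancies are only the four figures quoted in the paragraph above; the dictionary layout and the function name `advance_on_out` are my own, and transitions without a quoted run expectancy simply come back as `None`.

```python
from typing import Optional, Tuple

# Chance that the runner scores, keyed by (base, outs), from the table above.
SCORE_CHANCE = {
    ("1B", 0): 0.38, ("1B", 1): 0.25, ("1B", 2): 0.12,
    ("2B", 0): 0.61, ("2B", 1): 0.41, ("2B", 2): 0.21,
    ("3B", 0): 0.86, ("3B", 1): 0.68, ("3B", 2): 0.29,
}

# Total run expectancy for the inning: only the four values quoted in the text.
RUN_EXPECTANCY = {
    ("1B", 0): 0.953, ("2B", 1): 0.725,
    ("2B", 0): 1.189, ("3B", 1): 0.983,
}


def advance_on_out(base_from: str, base_to: str,
                   outs_before: int) -> Tuple[float, Optional[float]]:
    """Change in (chance of scoring, run expectancy) when an out moves the runner."""
    before = (base_from, outs_before)
    after = (base_to, outs_before + 1)
    chance_delta = round(SCORE_CHANCE[after] - SCORE_CHANCE[before], 3)
    re_delta = None  # stays None when the text quotes no run expectancy
    if before in RUN_EXPECTANCY and after in RUN_EXPECTANCY:
        re_delta = round(RUN_EXPECTANCY[after] - RUN_EXPECTANCY[before], 3)
    return chance_delta, re_delta


print(advance_on_out("1B", "2B", 0))  # (0.03, -0.228)
print(advance_on_out("2B", "3B", 0))  # (0.07, -0.206)
print(advance_on_out("2B", "3B", 1))  # (-0.12, None)
```

The first two lines show the Weaver tradeoff in miniature: the individual runner's odds tick up while the inning's expected total drops. The third shows why the same advance on the second out is no bargain at all.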

Somewhere Earl Weaver is smiling.
