This season, the debate between those who like using wins to judge pitchers and those who want nothing more than to forget that the statistic exists has heated up. The movement has been heavily publicized by MLB Network’s Brian Kenny, who takes on “wins” on a daily basis.
The argument against using wins is simple. The way pitcher wins are determined does not reflect individual pitcher performance, and therefore is an improper judge of how well someone performed. There are countless examples, most clearly Cliff Lee last season and James Shields and Chris Sale this season. Last week, we took on some of the best seasons ever by pitchers who won 9 or fewer times in a season. So much of what leads to wins is completely out of the pitcher’s control and they shouldn’t be judged based on how many runs their team scored for them. Run support, even if we strip away defense, the opposing pitcher, and dumb luck, is a clear and important factor in how many wins you have.
Last week, I gave you this graph which showed that in the 8,000+ qualifying seasons since 1901, wins did very little to explain overall performance:
But those numbers just reflect single seasons. I started wondering about bigger samples. Pitchers can get really lucky or unlucky in a given start and clearly they can in given seasons, but what about in their careers? Can you fake your way through an entire career of wins? It turns out that you can. Let’s take a look.
Below is a graph of Wins per Start (to control for the difference between guys who made 400 starts and guys who made 250) against ERA-, which is simply ERA scaled to league average during that era and adjusted for park effects. Lower ERA- is better and 100 is league average, meaning an ERA- of 90 is 10% better than average. What you see here is that wins fare no better in career samples than in single-season ones (sample size of 2,155):
The trend line is clear in that the lower your ERA-, the more frequently you win, but there is significant variation at each point. For example, at a wins per start of 40%, some pitchers have ERA- of 80 and some have ERA- of 120. The adjusted R squared here is .3966, which means that only 40% of the variation in ERA- can be explained by Wins per Start. That’s less than half.
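For anyone who wants to see how the two axes are built, here is a rough Python sketch. The function names are mine, and the ERA- calculation is a deliberately simplified assumption (ERA divided by a park factor, then scaled to league ERA), not necessarily the exact formula behind the numbers in this post.

```python
def wins_per_start(wins: int, starts: int) -> float:
    """Share of a pitcher's starts that ended with a win credited to him."""
    if starts <= 0:
        raise ValueError("starts must be positive")
    return wins / starts


def era_minus_simplified(era: float, league_era: float, park_factor: float = 1.0) -> float:
    """Simplified ERA- (an assumption, not an official formula).

    park_factor is expressed as a ratio where 1.0 is a neutral park and
    values above 1.0 mean a hitter-friendly park. 100 is league average
    and lower is better.
    """
    if league_era <= 0 or park_factor <= 0:
        raise ValueError("league_era and park_factor must be positive")
    park_adjusted_era = era / park_factor
    return 100.0 * park_adjusted_era / league_era


if __name__ == "__main__":
    # Hypothetical pitcher, not real data: 100 wins in 250 starts,
    # a 3.60 ERA in a league with a 4.00 ERA and a neutral park.
    print(wins_per_start(100, 250))
    print(era_minus_simplified(3.60, 4.00))
```

With those made-up inputs the pitcher wins 40% of his starts and lands at an ERA- of roughly 90, which is the "10% better than average" case described above.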
If we use FIP-, which is the scaled version of Fielding Independent Pitching (FIP), the results are even more troubling for wins.
The adjusted R squared here is only .2131, meaning that only about 21% of the variation in FIP- can be explained by Wins per Start. You can win 50% of your starts as the best pitcher of all time or as one of the worst.
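If you want to run this kind of check yourself, the sketch below shows one way to get an adjusted R squared from a one-variable linear fit. It is a minimal illustration in plain Python with my own function names, and it assumes an ordinary least squares fit with a single predictor; I am not claiming it is the exact procedure used for the graphs here.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """R squared of an ordinary least squares line fit of y on x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length, at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals around the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2: float, n: int, predictors: int = 1) -> float:
    """Penalize R squared for the number of predictors used in the fit."""
    if n - predictors - 1 <= 0:
        raise ValueError("not enough observations")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - predictors - 1)


if __name__ == "__main__":
    # Toy values only, not real pitchers: wins per start and an ERA-like index.
    wins_rate = [0.30, 0.35, 0.40, 0.45, 0.50]
    index = [115.0, 98.0, 104.0, 90.0, 95.0]
    r2 = r_squared(wins_rate, index)
    print(r2, adjusted_r_squared(r2, n=len(wins_rate)))
```

With a sample of 2,155 careers and one predictor, the adjustment barely moves the number, so the adjusted and unadjusted values are nearly identical at this sample size.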
The takeaway here is very simple and very important. Your ability as a pitcher to keep the other team from scoring (as seen with ERA-) and your ability to prevent runs based on only that which you can control (FIP-) are not that heavily correlated with winning. You can’t use a pitcher’s wins to predict how good he is because you can win if you prevent runs like a superstar or if you prevent runs like a Triple-A long reliever. Even if you strip out defense and the quality of the other offense and give the pitcher credit for every single run he allows, there is still the issue of team run support, over which he has zero control.
Last week I provided simple, straightforward evidence for why wins don’t reflect performance over the course of the season, but here I’ve shown that wins don’t even tell you much over the course of an entire career. It’s the job of a starting pitcher to limit the runs they allow, but the ability to limit runs doesn’t correlate very well with how often you win because so much of that is out of your hands.
Wins are not a good measure of individual performance and we should stop using them as such. This isn’t because sabermetricians don’t understand the point of the game, which is to win, but rather because we understand that “wins” as a stat for pitchers tells us very little about how much they contributed to helping their team win. Pitchers try to prevent runs. That is only half of the game. They shouldn’t be praised or blamed for what happens on the other side.