Stat of the Week: Wins Above Replacement (WAR)


Let’s get this out of the way early because it’s going to come up in a couple weeks when SABR Toothed Tigers New English D hands out a very controversial MVP award to someone not named Miguel Cabrera.

This week’s Stat of the Week is Wins Above Replacement (WAR). It’s been in the news a bit lately, so let’s all get on the same page about it.

First, there is something you should know. There are two different versions of WAR. One belongs to Baseball-Reference, one belongs to FanGraphs. They are different primarily because they use different measures of defense (more on this later). I will always cite FanGraphs on this site, but only because I like how they present their data.

The concept behind WAR is the same for both sites. How many wins does a given player add to his team above what a replacement level player would? A replacement level player is defined as a widely available AAA type player. Think Mike Hessman, Jeff Larish types if you followed Tigers minor leaguers in the last decade.

This is a pretty simple idea. What is the difference in wins between Prince Fielder at 1B versus Jeff Larish at 1B if everything else were equal? That is WAR in the abstract.

More concretely, a team that only played replacement level players would win about 50 games per season. As we’ve mentioned a lot here, even terrible teams win sometimes.

So what does WAR look like? For position players, you want to post at least a 2.0 WAR in a season to be considered a “starter.” Below that and you’re a backup or a minor leaguer. 2.0-4.0 WAR is considered solid, 4.0-6.0 is pushing All-Star to superstar levels, and 6.0+ is MVP type guys. You can roughly use the same scale for starting pitchers. Relievers are much different because they play so much less. Better than 1.0 WAR for a reliever is good, 2.0 is great, and 3.0 is excellent.
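To make the scale concrete, here is a minimal Python sketch of those tiers. The function name `war_tier` and the exact handling of the boundaries (for example, treating exactly 4.0 as the higher tier) are my own assumptions; the post only gives rough ranges.

```python
def war_tier(war: float, reliever: bool = False) -> str:
    """Map a single-season WAR to the rough labels used in this post.

    Boundary handling (>= at each cutoff) is an assumption; the post
    only gives approximate ranges.
    """
    if reliever:
        # Relievers pitch far fewer innings, so the scale is compressed.
        if war >= 3.0:
            return "excellent"
        if war >= 2.0:
            return "great"
        if war >= 1.0:
            return "good"
        return "below good"
    # Position players (and, roughly, starting pitchers).
    if war >= 6.0:
        return "MVP type"
    if war >= 4.0:
        return "All-Star to superstar"
    if war >= 2.0:
        return "solid starter"
    return "backup or minor leaguer"


print(war_tier(7.1))                  # Cabrera's 2012 WAR from this post
print(war_tier(2.0, reliever=True))
```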

Let’s talk theory first. The common retort to WAR is that “Miguel Cabrera has to be worth more than seven wins to the Tigers! If you took him out, they’d suck!”

This isn’t really accurate. Think about it. The Tigers won 88 games, and Cabrera posted a 7.1 WAR. Let’s round up to 8.0 WAR to be generous, since that difference is attributed to what WAR considers poor defense (more on this later). If the Tigers did not have Cabrera and replaced him with a minor league player à la Ryan Strieby, the Tigers would go 80-82 according to WAR theory. That’s actually pretty realistic if you just look at it. That’s 9% of their wins concentrated in 4% of their roster.

80-82 isn’t very good, but it’s not horrible. After all, the Tigers have a good team around him. Let’s take away Verlander’s 6.8 WAR (rounding up to 7) and we’re at 73-89, good for fourth worst in the AL. Essentially, if we take Verlander and Cabrera off the Tigers according to WAR, they would only be better than Cleveland, Minnesota, and Boston in 2012.

You have to buy that. They still have Fister, Scherzer, Sanchez, Fielder, Jackson, etc. They would be much worse, but still not a minor league team. Take away Fielder’s 4.9 WAR and we’re down to 68-94. Only the Twins were worse. That sounds about right when you really think about it.
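The subtraction above is simple enough to check in a few lines of Python. This is only a sketch of the arithmetic in this post: the helper `records_without` is a hypothetical name, Fielder's 4.9 is rounded to 5 (my assumption, to reproduce the 68-94 record), and the 25-man roster behind the 4% figure is also assumed.

```python
def records_without(team_wins, removed_wars, games=162):
    """Subtract each removed player's WAR in turn.

    Returns the (wins, losses) record after each successive removal.
    """
    records = []
    wins = team_wins
    for war in removed_wars:
        wins -= war
        records.append((wins, games - wins))
    return records


# Rounded WAR values as used in the post: Cabrera 8, Verlander 7.
# Fielder's 4.9 is rounded to 5 here (an assumption).
names = ["Cabrera", "Verlander", "Fielder"]
steps = records_without(88, [8, 7, 5])
for name, (wins, losses) in zip(names, steps):
    print(f"After also removing {name}: {wins}-{losses}")

cabrera_share = 8 / 88   # Cabrera's share of the team's 88 wins
roster_share = 1 / 25    # one spot on a 25-man roster (assumed)
print(f"{cabrera_share:.0%} of the wins from {roster_share:.0%} of the roster")
```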

So that’s the theory, but what about the practice? How do we calculate WAR? What WAR seeks to do is combine hitting, baserunning, and defense into a single number calibrated to the only thing we actually care about: wins. Each action earns a “run value” based on how often that action contributes to run scoring, and the accumulation of about 10 runs is equal to 1 win.

WAR takes into account how much better than average a player is offensively using wOBA and converts it into an overall run value, wRAA, based on the number of plate appearances a player has had. You take that wRAA and divide it by the runs-per-win value for that year (usually about 9 to 10). That gives you offensive WAR. Baserunning has a similar type of formula based on how many bases you take and how many you steal. Defense is based on UZR for FanGraphs and DRS for Baseball-Reference, both of which come out as run values that are converted into wins in the same way. Overall WAR is also adjusted for the position you play.
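Here is a rough Python sketch of the batting piece only, using the commonly published form of wRAA: the gap between a player's wOBA and the league's, divided by the wOBA scale, times plate appearances. Every input below (a .400 wOBA, a .320 league wOBA, a 1.25 scale, 650 plate appearances, 10 runs per win) is a hypothetical round number and not any real player's line, and the function names are mine. The result covers batting runs above average only, before baserunning, defense, position, and the replacement-level baseline are folded in.

```python
def wraa(woba, league_woba, woba_scale, plate_appearances):
    """Weighted runs above average from wOBA and plate appearances."""
    return (woba - league_woba) / woba_scale * plate_appearances


def runs_to_wins(runs, runs_per_win=10.0):
    """Convert a run value into wins; runs_per_win is usually about 9 to 10."""
    return runs / runs_per_win


# Hypothetical inputs, chosen only to keep the arithmetic easy to follow.
batting_runs = wraa(woba=0.400, league_woba=0.320,
                    woba_scale=1.25, plate_appearances=650)
batting_wins = runs_to_wins(batting_runs)
print(f"{batting_runs:.1f} batting runs above average, "
      f"about {batting_wins:.1f} wins")
```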

For pitchers, FanGraphs uses Fielding Independent Pitching (FIP) along with innings pitched, park effects, and similar adjustments, while Baseball-Reference uses runs allowed and controls for the quality of the defense behind the pitcher.

Simply put, WAR is trying to measure the total contribution a player makes with his play on the field. It obviously doesn’t measure things like leadership that reflect on other players (or moving to a new position!), but everything they do on the field is captured. Surely no one can challenge this concept.

WAR takes various statistics, combines them, and scales them to churn out a number. The math is based on baseball history and what has been shown to contribute to winning. For example, WAR values OBP over AVG because walks are important, but missing from average. It doesn’t care so much about RBIs because you can’t drive in runs if no one gets on base ahead of you. The math behind this, which I won’t subject you to any more unless you really want me to (here’s FanGraphs’ page on WAR), is rooted in the game’s history. They adjust it every year to pick up new information, but it’s always scaled to that year’s replacement level so you can compare across eras.

WAR is not a perfect, exact measure of a player’s value, but it is a good one: if you sum a team’s WAR and add 50 (again, this number has been slightly adjusted), you land close to their actual win total. It’s not a be-all and end-all. If a player is 4.6 WAR and another is 4.5 WAR, they are essentially equal. There is a margin of error. But WAR does give you a good sense of how much a player helps his team win with his own performance.

The argument against WAR is twofold. First, it’s complicated. It turns out a pretty good number, but it’s hard to grasp. You can’t watch a game and immediately see how Player X’s WAR is impacted like you can with HRs or average or walks. It’s not a stat, it’s a metric. It weighs the value of each action based on how those actions normally lead to wins for your team. So it’s hard to follow. You have to look the numbers up; you can’t work them out and follow along as you watch. I’m not arguing we throw the others out in favor of WAR, but when you want to compare players who play different positions and on different teams, WAR equalizes that through a positional adjustment and other devices.

The other problem with WAR is defense. Defense is really hard to measure. Fielding percentage is not a good measure because that only tells me how often you make errors, it doesn’t tell me what kind of errors. It doesn’t tell me about your range.

WAR uses UZR (Ultimate Zone Rating) or DRS (Defensive Runs Saved) to measure defense. They are both metrics based on range and execution, with human viewers judging every play based on if it should be made routinely and how much harder or easier a play is from average difficulty. So there’s some subjectivity, but it’s much better than any of the traditional numbers. Plenty of people criticize these numbers because they fluctuate a lot and give some weird results on occasion.

Essentially, defense is WAR’s weak leg, but it’s getting better and is much better than anything traditional. But this means we can’t use WAR as a final word. We have to look at other things and use our eyes.

Don’t run from WAR because it is complex math. You can check it yourself by seeing what it turns out in a given year compared to a team’s actual results. The Tigers’ position player WAR and pitcher WAR sum to 43.9 this season. 50 + 43.9 = 93.9, which misses the Tigers’ win total by a whopping 6%. Not bad. It works even better with bigger samples.
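If you want to run that check yourself, here is a small Python sketch. It assumes the roughly 50-win replacement baseline from earlier in this post and the 88 wins and 43.9 team WAR quoted above; the function name `projected_wins` is hypothetical. Note that the percentage shifts a little depending on whether you divide the miss by the projected total or the actual one.

```python
REPLACEMENT_TEAM_WINS = 50  # approximate baseline used in this post


def projected_wins(team_war, baseline=REPLACEMENT_TEAM_WINS):
    """Wins implied by WAR: replacement-level baseline plus team WAR."""
    return baseline + team_war


actual_wins = 88   # 2012 Tigers, from this post
team_war = 43.9    # position player WAR plus pitcher WAR, from this post

projection = projected_wins(team_war)
miss = projection - actual_wins
pct_of_projection = miss / projection * 100
pct_of_actual = miss / actual_wins * 100

print(f"Projected {projection:.1f}, actual {actual_wins}, miss {miss:.1f} wins")
print(f"Miss as a share of projection: {pct_of_projection:.1f}%")
print(f"Miss as a share of actual wins: {pct_of_actual:.1f}%")
```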

Question WAR because it may be imperfect. It’s not trying to boil baseball down to a spreadsheet, it’s trying to correctly value on-field actions. Having more RBI doesn’t make you a better player than someone else. Hitting more home runs doesn’t either. If no one gets on base for you, you can’t drive them in. If you play in San Diego instead of Cincinnati, you’ll hit fewer bombs.

WAR is an equalizer. It allows us to compare individuals playing a team game. It’s a good thing. Don’t take it as doctrine, take it as information. The concept is great, the execution is pretty good and getting better. What WAR does is try to measure value accurately, rather than relying on old statistics that were invented before we had a good idea about what mattered in baseball. Check out our Stat Primer page to learn all about which stats are good and which aren’t so good.

And here’s a WAR calculator. Learn how to use it here.

