Wins Above Replacement (WAR) Calculator
One of the missions of New English D is to make sabermetrics more easily digestible. To that end, we have an extensive Stat Primer series that explains a lot of important concepts and stats. One of the most controversial but important stats out there is Wins Above Replacement (WAR). I wrote about WAR over a year ago here, so feel free to read that as a primer for what WAR is trying to measure. I also apologize if it’s not perfectly written, as it was one of the first pieces I put out there and I was still working to develop some expert internet-writing skills.
Calculate FIP, xFIP, and wOBA too!
That said, people often complain that WAR is too complicated and doesn’t make sense to them. Well, I’m going to give you the tools to calculate WAR right here and right now. I will make three quick points.
- This is an approximation of the FanGraphs version of WAR for position players, not pitchers.
- This is not perfect, mostly in the sense that it does not account for park factors. If your fake player plays in a hitter friendly park, this number will be too high. If they play in a pitcher friendly park, it will be too low. Not a huge amount, but some. This calculator isn’t perfect because in order for it to be perfect, I would have to ask you to input way more information and I would have to learn how to be a much better coder.
- This doesn’t account for league, which makes a small difference, and it doesn’t break down games played by position if your player plays more than one.
Here is how it works. Fill in the data from Cell B2 to Cell B11 with the basic statistics of your player. In Cell B17, type the number that corresponds with their position in the “H” column. In Cell B18, type the number of runs above or below average your player is on defense. A perfectly average defender at the position in question will be zero. Remember these are run values and generally range from -10 to +10. Do the same thing with baserunning runs in Cell B19. If you want more information on any of these numbers, visit our Stat Primer page for details.
If you’ve done everything correctly, you should have a WAR value in B20. Remember, this doesn’t adjust for park or league, so it won’t be perfect, but it should give you a pretty nice idea if you’re just looking to play around with some numbers. Essentially, this is a “what if WAR machine.”
Right now, these numbers reflect the 2013 regular season. Feel free to play around with the numbers in the blue box if you wish to calculate based on different seasons. All of the numbers can be found on this FG page with the exception of Lg R and Lg PA, which are simply the total number of runs and plate appearances in the league that season.
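For anyone who would rather see the logic than the spreadsheet cells, here is a rough sketch of the same idea in Python. It is not the calculator's actual formula sheet: it takes wOBA as an input instead of building it from the basic stats, and every constant in it (the league wOBA, the wOBA scale, the runs-per-win value, the positional adjustments, and the 20 runs per 600 PA for replacement level) is an assumption in the neighborhood of commonly cited FanGraphs figures. Swap in the numbers for whatever season you care about.

```python
# Rough "what if" WAR sketch for a position player.
# No park or league adjustment, just like the spreadsheet.
# Every constant here is an illustrative assumption, not the spreadsheet's value.

# Assumed positional adjustments, in runs per 162 games.
POSITION_RUNS = {
    "C": 12.5, "1B": -12.5, "2B": 2.5, "3B": 2.5, "SS": 7.5,
    "LF": -7.5, "CF": 2.5, "RF": -7.5, "DH": -17.5,
}


def what_if_war(woba, pa, games, position, fielding_runs, baserunning_runs,
                lg_woba=0.314, woba_scale=1.277, runs_per_win=9.264,
                replacement_runs_per_600_pa=20.0):
    """Approximate WAR from run components, then convert runs to wins."""
    # Batting runs above average (wRAA).
    batting = (woba - lg_woba) / woba_scale * pa
    # Credit or debit for the difficulty of the position, prorated by games.
    positional = POSITION_RUNS[position] * games / 162
    # Bonus for being average rather than replacement level, prorated by PA.
    replacement = replacement_runs_per_600_pa * pa / 600
    total_runs = (batting + positional + fielding_runs
                  + baserunning_runs + replacement)
    return total_runs / runs_per_win


# A league-average hitter at shortstop with average defense and baserunning.
print(round(what_if_war(0.314, 600, 150, "SS", 0, 0), 1))
```

Notice that a perfectly average player still comes out above zero, because average is better than replacement level.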
Enjoy and feel free to post if you catch any mistakes. This one was much harder to write than FIP, xFIP, and wOBA.
Fangraphs and Baseball Reference Unify Replacement Level
Today, Fangraphs and Baseball Reference consummated a relationship we knew to be coming for the last few weeks. While the two sites have always calculated Wins Above Replacement (WAR) differently, they decided to discuss reworking a component of the metric. That component was replacement level, defined as the production of a player who is readily available as a minor league free agent or on the waiver wire.
Today it happened. Dave Cameron can give you all the specifics over at Fangraphs, and I can’t say I disagree with any of the changes. I like that the two leading sites are working to improve WAR and our overall statistical evaluation of baseball. This is a step in the right direction and it’s good for everyone involved.
But there is a weird result from today’s unveiling of the new replacement level that is freaking me the hell out.
Everyone’s WAR is slightly different than it was yesterday.
Now many who hate sabermetrics might use this as a point of assault, but those people who know better know that it’s just a shifting baseline calculation that marginally changes the precise point value of WAR. The substantive results are the same, just refined.
But for someone who reviews baseball statistics quite religiously, it’s trippy. For example, Justin Verlander gained 0.2 WAR for 2012. Buster Posey lost 0.4 WAR. Most of the exact changes are pretty small and don’t change the interpretation much, but when we’re dealing with something like WAR that is imprecise and on a relatively small scale, things get funky. A bunch of players shifted places in rankings. Not dramatically, mind you, just from 2nd to 3rd or 8th to 7th. It’s minor and doesn’t mean much, it’s just weird.
I woke up today and the past had changed. I mean, I know that isn’t true, but it seems like it. Justin Verlander was the best pitcher by WAR last season, but now he is the best by more. Perhaps this means nothing to anyone else, but it was interesting for me.
WAR got better today and given the people in charge of its design, it will continue to get better into the future. Let’s just hope I’m better prepared to cope next time and don’t spend an hour of my day staring at my computer repeating “this is weird” to myself.
But seriously, it was.
Stat of the Week: Wins Above Replacement (WAR)
Let’s get this out of the way early because it’s going to come up in a couple weeks when New English D hands out a very controversial MVP award to someone not named Miguel Cabrera.
This week’s Stat of the Week is Wins Above Replacement (WAR). It’s been in the news a bit lately, so let’s all get on the same page about it.
First, there is something you should know. There are two different versions of WAR. One belongs to Baseball Reference, one belongs to Fangraphs. They are different primarily because they use different measures of defense (more on this later). I will always cite Fangraphs on this site, but only for the reason that I like how they present their data.
The concept behind WAR is the same for both sites. How many wins does a given player add to his team above what a replacement level player would? A replacement level player is defined as a widely available AAA type player. Think Mike Hessman, Jeff Larish types if you followed Tigers minor leaguers in the last decade.
This is a pretty simple idea. What is the difference in wins between Prince Fielder at 1B versus Jeff Larish at 1B if everything else were equal? That is WAR in the abstract.
More concretely, a team that only played replacement level players would win about 50 games per season. As we’ve mentioned a lot here, even terrible teams win sometimes.
So what does WAR look like? For position players, you want to post at least a 2.0 WAR in a season to be considered a “starter.” Below that and you’re a backup or a minor leaguer. 2.0-4.0 WAR is considered solid, 4.0-6.0 is pushing All-Star to superstar levels, and 6.0+ is MVP type guys. You can roughly use the same scale for starting pitchers. Relievers are much different because they play so much less. Better than 1.0 WAR for a reliever is good, 2.0 is great, and 3.0 is excellent.
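If it helps to see that scale as a lookup, here is a small sketch. The labels and cutoffs come straight from the paragraph above; how to treat a value sitting exactly on a boundary (2.0, 4.0, 6.0) is my own assumption, and the "below good" label for relievers under 1.0 is made up for the example.

```python
def war_tier(war, reliever=False):
    """Map a single-season WAR to the rough tiers described above."""
    if reliever:
        # Relievers pitch far fewer innings, so the scale is compressed.
        if war >= 3.0:
            return "excellent"
        if war >= 2.0:
            return "great"
        if war >= 1.0:
            return "good"
        return "below good"  # hypothetical label, not from the post
    # Position players, and roughly starting pitchers too.
    if war >= 6.0:
        return "MVP type"
    if war >= 4.0:
        return "All-Star to superstar"
    if war >= 2.0:
        return "solid starter"
    return "backup or minor leaguer"


print(war_tier(7.1))                 # Cabrera's 2012
print(war_tier(2.0, reliever=True))
```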
Let’s talk theory first. The common retort to this is that “Miguel Cabrera has to be worth more than seven wins to the Tigers! If you took him out, they’d suck!”
This isn’t really accurate. Think about it. The Tigers won 88 games, Cabrera posted a 7.1 WAR. Let’s round up to 8.0 WAR to be generous as that difference is attributed to what WAR considers poor defense (more on this later). If the Tigers did not have Cabrera and replaced him with a minor league player à la Ryan Strieby, the Tigers would go 80-82 according to WAR theory. That’s actually pretty realistic if you just look at it. That’s 9% of their wins concentrated in 4% of their roster.
80-82 isn’t very good, but it’s not horrible. After all, the Tigers have a good team around him. Let’s take away Verlander’s 6.8 WAR (rounding up to 7) and we’re at 73-89, good for fourth worst in the AL. Essentially, if we take Verlander and Cabrera off the Tigers according to WAR, they would only be better than Cleveland, Minnesota, and Boston in 2012.
You have to buy that. They still have Fister, Scherzer, Sanchez, Fielder, Jackson, etc. They would be much worse, but still not a minor league team. Take away Fielder’s 4.9 WAR and we’re down to 68-94. Only the Twins were worse. That sounds about right when you really think about it.
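The arithmetic in that thought experiment is just subtraction, and it can be sketched in a few lines. The function name is made up for the example, and the WAR values are the rounded-up ones used above (8 for Cabrera, 7 for Verlander, 5 for Fielder).

```python
def wins_without(team_wins, removed):
    """Subtract each player's WAR in turn, as if he were swapped for a
    replacement-level player, and record the projected win total."""
    remaining = team_wins
    results = []
    for name, war in removed:
        remaining -= war
        results.append((name, remaining))
    return results


# 2012 Tigers: 88 wins, with the rounded-up WAR figures from the text.
steps = wins_without(88, [("Cabrera", 8), ("Verlander", 7), ("Fielder", 5)])
for name, wins in steps:
    print(f"without {name}: {wins}-{162 - wins}")
# without Cabrera: 80-82
# without Verlander: 73-89
# without Fielder: 68-94
```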
So that’s the theory, but what about the practice? How do we calculate WAR? What WAR seeks to do is combine hitting, baserunning, and defense into a single number calibrated to the only thing we actually care about, wins. Each action earns a “run value” based on how often that action contributes to run scoring, and the accumulation of 10 runs is about equal to 1 win.
WAR takes into account how much better than average a player is offensively using wOBA and converts it into an overall run value, wRAA, based on the number of plate appearances a player has had. You take that wRAA and divide it by the Run to Win value of that year (usually about 9 to 10). That gives you offensive WAR. Baserunning has a similar type of formula based on how many bases you take and how many you steal. Defense is based on UZR for FanGraphs and DRS for Baseball-Reference, both of which come out in run values converted into wins in the same way. Overall WAR is also adjusted for the position you play.
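Here is a minimal sketch of that offensive step. The wRAA formula is the standard one (wOBA minus league wOBA, divided by the wOBA scale, times plate appearances), but the numbers plugged in below are round, made-up values chosen so the math is easy to follow. They are not any real player's line or any real season's constants.

```python
def wraa(woba, lg_woba, woba_scale, pa):
    """Weighted runs above average: batting runs relative to a league-average hitter."""
    return (woba - lg_woba) / woba_scale * pa


def batting_wins(woba, lg_woba, woba_scale, pa, runs_per_win):
    """Convert wRAA into wins using the season's runs-per-win value."""
    return wraa(woba, lg_woba, woba_scale, pa) / runs_per_win


# Hypothetical hitter: .400 wOBA in 600 PA, in a league with a .320 wOBA,
# a wOBA scale of 1.25, and 9.6 runs per win (all assumed for illustration).
runs = wraa(0.400, 0.320, 1.25, 600)
wins = batting_wins(0.400, 0.320, 1.25, 600, 9.6)
print(round(runs, 1), round(wins, 1))
```

That is about 38 runs above average, or roughly 4 wins from the bat alone, before baserunning, defense, position, and the replacement-level adjustment get added in.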
For pitchers, FanGraphs uses Fielding Independent Pitching (FIP) and includes the number of innings pitched, park effect, and similar adjustments and Baseball Reference uses runs allowed and controls for the quality of your defense.
Simply put, WAR is trying to measure the total contribution a player makes with his play on the field. It obviously doesn’t measure things like leadership that reflect on other players (or moving to a new position!), but everything they do on the field is captured. Surely no one can challenge this concept.
WAR takes various statistics, combines them, and scales them to churn out a number. The math is based on baseball history and what has been shown to contribute to winning. For example, WAR values OBP over AVG because walks are important, but missing from average. It doesn’t care so much about RBIs because you can’t drive in runs if no one gets on base ahead of you. The math behind this, which I won’t subject you to any more unless you really want me to (here’s FanGraphs’ page on WAR), is rooted in the game’s history and they adjust it every year to pick up new information, but it’s always scaled to that year’s replacement level so you can compare across eras.
WAR is not a perfect, exact measure of a player’s value, but it is a good one if you sum a team’s WAR, add 50 (again this number has been slightly adjusted), and compare the result to their actual win total. It’s not a be-all and end-all. If a player is 4.6 WAR and another is 4.5 WAR, they are essentially equal. There is a margin of error. But WAR does give you a good sense of how much this player helps his team win with his own performance.
The argument against WAR is twofold. First, it’s complicated. It turns out a pretty good number, but it’s hard to grasp. You can’t watch a game and immediately see how Player X’s WAR is impacted like you can with HRs or average or walks. It’s not a stat, it’s a metric. It weighs the value of each action based on how those actions normally lead to wins for your team. So it’s hard to follow. You have to look the numbers up; you can’t work them out and follow along as you watch. I’m not arguing we throw the others out in favor of WAR, but when you want to compare players who play different positions and on different teams, WAR equalizes that through a positional adjustment and other devices.
The other problem with WAR is defense. Defense is really hard to measure. Fielding percentage is not a good measure because that only tells me how often you make errors, it doesn’t tell me what kind of errors. It doesn’t tell me about your range.
WAR uses UZR (Ultimate Zone Rating) or DRS (Defensive Runs Saved) to measure defense. They are both metrics based on range and execution, with human viewers judging every play based on if it should be made routinely and how much harder or easier a play is from average difficulty. So there’s some subjectivity, but it’s much better than any of the traditional numbers. Plenty of people criticize these numbers because they fluctuate a lot and give some weird results on occasion.
Essentially, defense is WAR’s weak leg, but it’s getting better and is much better than anything traditional. But this means we can’t use WAR as a final word. We have to look at other things and use our eyes.
Don’t run from WAR because it is complex math. You can check it yourself by seeing what it turns out in a given year compared to a team’s actual results. The Tigers’ position player WAR and pitcher WAR sum to 43.9 this season. 50 + 43.9 = 93.9, which misses the Tigers’ actual 88 wins by a whopping 6% or so. Not bad. It works even better with bigger samples.
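You can run that sanity check for any team. Here is a quick sketch using the round 50-win replacement baseline from this post and the 2012 Tigers numbers above; the function name is just for illustration.

```python
REPLACEMENT_TEAM_WINS = 50  # the round figure used in this post


def projected_wins(team_war):
    """Wins implied by WAR: replacement-level baseline plus the team's total WAR."""
    return REPLACEMENT_TEAM_WINS + team_war


# 2012 Tigers: 43.9 combined position player and pitcher WAR, 88 actual wins.
projection = projected_wins(43.9)
gap = projection - 88
print(round(projection, 1), round(gap, 1))
```

The projection overshoots by about six wins. Whether you call that 6% or 7% depends on whether you divide by the projected total or the actual one.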
Question WAR because it may be imperfect. It’s not trying to boil baseball down to a spreadsheet, it’s trying to correctly value on-field actions. Having more RBI doesn’t make you a better player than someone else. Hitting more home runs doesn’t either. If no one gets on base for you, you can’t drive them in. If you play in San Diego instead of Cincinnati, you’ll hit fewer bombs.
WAR is an equalizer. It allows us to compare individuals playing a team game. It’s a good thing. Don’t take it as doctrine, take it as information. The concept is great, the execution is pretty good and getting better. What WAR does is try to measure value accurately, rather than relying on old statistics that were invented before we had a good idea about what mattered in baseball. Check out our Stat Primer page to learn all about which stats are good and which aren’t so good.
And here’s a WAR calculator. Learn how to use it here.