Beards etc.: Sabermetrics

If you are a baseball fan or analyst who puts much stock in advanced sabermetrics (and/or the moneyball principle), this post is probably going to piss you off a little bit. It's partly inspired by a series of recent conversations on the topic, and it's really inspired by the 1-game playoff exit of the Oakland A's.

Now, for those of you who don't follow baseball too closely and who have never seen the (quite enjoyable) movie Moneyball, the A's GM Billy Beane has become the poster boy of sabermetrics in Major League Baseball. His tactic of crafting teams that focus on OBP and play the odds has taken the A's from a basket case to a perennially competitive team, all without leaning too heavily on the franchise budget, so on the surface it seems hard to criticize his approach. I'm going to, though, for the simple reason that while the film (and therefore a lot of public perception) tries to treat him like a revolutionary hero getting crapped on by the establishment, the reality is that this season we saw yet again exactly why his method doesn't really work.

When I said the A's were competitive, I wasn't lying . . . but I was excluding a pretty big asterisk. The A's are competitive in the regular season. Their recent regular season records bear out the fact that 162 games is a big enough sample size to reward playing the averages. However, their stunning lack of playoff success in that same time points to a problem with how these teams are structured.

The problem is twofold. Firstly, while 162 is a lot of games, a playoff series is obviously a much smaller sample. In such a limited number of games, the important thing is to have a team that can handle individual, high-pressure situations. This is about managing as much as anything else, so I'll come back to it. The second problem, closely related to the first, is that sabermetricians tend to treat players as random number generators. They aren't computer programs, though: they are human beings with human strengths and weaknesses that come heavily into play, especially in the tense, high-stakes world of playoff baseball.

So, for that first problem, what's the difference between playing for 100 wins and playing for 1? Well, in the former case, individual errors of judgement may cause your total results to be less efficient, so playing by the book can be a good approach. Players with high OBP will provide a lot of opportunities over time to score runs. Left-handed pitchers will do better on average against left-handed hitters than will their right-handed counterparts. Making the statistically correct play will, over the course of a large sample size, work out in your favor more often than it fails. Given that the best record in baseball this year (which belonged to the Angels) came with 64 losses, obviously it's okay to have these things backfire and cost you the game occasionally. In the playoffs, though, you don't get the luxury of cold streaks and frequent losses. Every game matters, and you have to be aware of the value each at-bat and management decision brings. In the playoffs, a high OBP may not matter if the player never gets on base at a critical time. Sac bunts, the ability to go opposite-field on an outside pitch, the ability to make contact in a hit-and-run, and countless other factors can win or lose the game. Pulling out a pitcher who is in a groove so that you can match lefty vs lefty can have disastrous consequences. The team needs to be composed of players and a manager who all understand the nuances of these very specific situations and are able to react accordingly. The 60/40 success rate of simply playing the numbers isn't good enough.
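To put a rough number on the sample-size point, here is a minimal sketch in Python. It deliberately takes the spreadsheet view I'm complaining about: every game is an independent coin flip, and the better team wins each one with probability 0.60 (an assumption borrowed from the "60/40" figure above, not a measured value). The function names are mine, made up for illustration.

```python
from math import comb


def prob_at_least(wins_needed: int, games: int, p: float) -> float:
    """Chance of winning at least `wins_needed` of `games` independent games,
    each won with probability `p` (a plain binomial tail sum)."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins_needed, games + 1)
    )


def series_win_prob(best_of: int, p: float) -> float:
    """Chance of taking a best-of-N series. Winning the series is equivalent
    to winning a majority of N games if all N were played out."""
    return prob_at_least(best_of // 2 + 1, best_of, p)


P_GAME = 0.60  # assumed per-game edge for the "better" team (hypothetical)

for n in (1, 5, 7):
    print(f"best-of-{n}: {series_win_prob(n, P_GAME):.3f}")

# Chance the same team finishes a 162-game season above .500 (82+ wins).
print(f"winning season: {prob_at_least(82, 162, P_GAME):.3f}")
```

Under those assumptions, the team that wins 60% of its games takes a single elimination game 60% of the time, a best-of-5 about 68% of the time, and a best-of-7 about 71% of the time, while a winning record over 162 games is close to a sure thing. Even on the numbers crowd's own terms, then, an edge that is nearly bulletproof over a season leaves the better team going home early roughly three times in ten, and that's before any of the human stuff comes into it.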

How about that second problem? Well, I think it explains itself, but allow me to elaborate anyway. A random number generator simply provides numbers without any outside factors impacting it. You can follow trends in those numbers, and some people like to pretend that those trends are all that matters: essentially they think that if you put a lot of guys on base then you increase your odds of scoring. It works over a season, but come playoff time, a GM who puts together a team like that is probably going to fail. Players aren't interchangeable parts of a machine. They are part of a group of people who have actual human relationships with one another. They have good days and bad days, friends, illnesses, roadblocks, and epiphanies. Team chemistry will affect how players feed off each other in rally situations. Mental toughness will affect how a player handles high-stress at-bats. Situational awareness can be the difference between a runner in scoring position and an out. Playoff experience, leadership ability, dedication, determination, positivity . . . the list of beneficial character traits a player might bring to the table can far outweigh their statistical contributions when the team is under the gun. Trying to suck the humanity out of baseball and turn it into a collection of graphs and numbers would be fine for video-game baseball, but in real life it misses the big picture.

Now, having a go at Billy Beane is fun and all, but since I'm already kind of on the subject, I'd like to redirect slightly and talk about two particular advanced sabermetrics which I absolutely loathe. Don't get me wrong: I have no issue with collecting this data. It's the application in baseball analysis that I hate, but I'll explain that in a minute.

WAR. Wins Above Replacement is the current best attempt at a single, universal stat that can express a player's value in one easy-to-see figure. I can see the appeal of such a notion, but I have some pretty big problems with it, too. Obviously, a lot of my issue here falls right back into the paragraph I wrote a minute ago about reducing human players to bar graphs, so I'm not going to repeat all that. Instead, I'm going to pick on three other issues I have with WAR. One is that its methods of calculation are highly suspect. If you simply take WAR at face value, you end up with some really strange results where painfully obviously inferior players end up with higher numbers on the basis of falling into a good spot in the lineup or playing a certain position. Those positional bonuses are particularly ludicrous, as they often reward or punish players to a disproportionate degree depending upon the position they play. Second, I strongly dislike the notion that this poorly calibrated catch-all stat has become the new standard for discussing a player's real-world value. I wouldn't mind it simply coming up in the discussion like any other stat, but it receives far too much weight in my opinion, especially considering how sketchy its actual results can be. Third, and this is mostly semantic but it still bugs the shit out of me, the name for this stat is horrible. Other stats have names that realistically reflect what they are counting: strikeouts, home runs, hits, etc. This one pretends to count the number of wins a player contributes over a mid-level replacement in the same position, but it does nothing of the sort. It is a rough calculation of overall value, not an actual reflection of how a player's presence affects their team's final record. For example, Andrew McCutchen's WAR this year was 6.4. It's a respectable figure as WAR goes, but it's in no way related to his impact on the team's record. He is the sole star the Pirates have, responsible for practically carrying his team to the playoffs for two consecutive seasons and providing the spark and figurehead around which his team has rallied. To pretend that his absence from the lineup, field, dugout, and clubhouse would cost the otherwise hapless Pirates a mere 6 games is beyond ludicrous. So yeah, WAR as a statistic is poorly formulated, overvalued, and misrepresentative. I'm not opposed to the concept as a point of statistical interest, but right now in its current state, WAR is a disaster.

BABIP. Batting Average on Balls In Play is probably the dumbest statistic in all of baseball. It's not dumb because of the numbers themselves: they tell an interesting story. It's dumb because of the way baseball analysts with a better understanding of spreadsheets than gameplay treat it. I have lost count of the number of articles I've read where some player (let's say Miguel Cabrera, since advanced sabermetricians seem to hate him for some reason) is accused of being more lucky than good because his BABIP is higher than the league average. This approach comes from the wildly misguided notion that all a batter can control is whether or not they make contact with the ball and that everything after putting it in play is a crapshoot. Well, if you agree with that, I'd like you to do a little homework assignment for me. Go watch some gameplay footage and look at 40 or 50 Cabrera at-bats. Then do the same thing for, say, Elvis Andrus. Then come back here and tell me with a straight face that you honestly think Cabrera has a higher BABIP just because he's lucky. If you can do that, I'm sorry, but you are probably an idiot. When one player routinely crushes line drives into the outfield and another player's hitting mostly consists of pop-ups and dribbling ground balls, of course the former will have a higher BABIP. It's not luck, it's the fact that he hits the ball a lot fucking harder. Of course, there will always be an element of luck, where a well-hit ball is practically gift-wrapped for the center fielder or a bloop single drops into shallow left just beyond the reach of the diving shortstop. By and large, though, players who hit the ball harder and cleaner tend to have a higher BABIP, yet it's often treated like it should come as some horrible asterisk on their season figures. The same basic thing affects pitchers who are good at inducing ground ball outs, as their low opposing BABIP figures are often stupidly held against them. So again, the stat itself is just fine and even kind of interesting, but its application is frequently beyond moronic.
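For anyone who hasn't seen how the number is built, here is a quick sketch in Python using the commonly cited formula, (H - HR) / (AB - K - HR + SF). The two stat lines are completely made up for illustration and are not any real player's season; the function name is mine.

```python
def babip(hits: int, home_runs: int, at_bats: int,
          strikeouts: int, sac_flies: int) -> float:
    """Batting Average on Balls In Play, using the commonly cited formula:
    (H - HR) / (AB - K - HR + SF).
    Home runs and strikeouts are excluded because the ball never reaches
    a fielder; sacrifice flies are added back because it does."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical stat lines, invented purely for illustration.
line_drive_hitter = babip(hits=190, home_runs=30, at_bats=600,
                          strikeouts=100, sac_flies=5)
weak_contact_hitter = babip(hits=160, home_runs=5, at_bats=600,
                            strikeouts=90, sac_flies=5)

print(f"line-drive hitter:   {line_drive_hitter:.3f}")
print(f"weak-contact hitter: {weak_contact_hitter:.3f}")
```

Notice that nothing in the formula knows how hard the ball was hit. It only counts how many balls in play fell for hits, which is exactly why reading a high figure as "luck" by default skips over the most obvious explanation.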

There you have it: my long, semi-directionless rant against Billy Beane, moneyball, and advanced sabermetricians. I'm not saying there's no place in baseball for these things. What I am saying is that, like the New York Yankees, for me their biggest purpose is to be the villain in a game I love.

Wednesday, October 1, 2014

