Tuesday, October 22, 2013

Defensive statistics are still confusing

I wish I were an analytic savant. I really do. For one thing, in this day and age of baseball writing, that kind of ability is where the action is. I try. But like most, I depend on the major stat sites to tell me what I need to know even if I do not understand how they got there. And in most cases, things make sense. Miguel Cabrera was a better offensive player than Prince Fielder. Clayton Kershaw was a much more valuable pitcher than Bronson Arroyo. But fielding stats still baffle me because the major sites do not always agree with each other.

Oh, the sites agree on Manny Machado and Andrelton Simmons, landing within a few percentage points of each other. Both had historically good fielding seasons. The sites also agree that Michael Young and Miguel Cabrera were terrible in the field. They differ more widely on the degree of awfulness, but all of them confirm the two were bad.

But there are too many other players whose ratings are so wildly different from site to site that it leaves us neophytes grasping at straws. I want to give you a few examples. But first, the common disclaimer must be made. Just about everyone who is smarter than me says that you have to look at a player's fielding over a three-year period to get a fairer view of his fielding ability. Don't ask me why. But that is what I hear all the time. And of course, there is the natural disclaimer that I don't always know what I am talking about.

Let's start with Brett Gardner. As a left fielder in 2011, he was rated highly, so highly in fact that it turned his rather mundane offensive numbers into a pretty good valuation. Gardner lost most of 2012 to injury and then this year took over center field--a position most thought suited him best. So how did Gardner do in center in 2013? Darned if I know.

Fangraphs.com rated him at 1.7 runs above average on defense for eighth best among qualifying center fielders. Baseball Prospectus rated him at seven runs below average for his defense. But then you get to Baseball-reference.com.

According to that site, Gardner cost the Yankees more runs on defense than any other player in baseball. According to them, he was 20 total zone runs below average, the most negative number they gave anybody! How can three sites differ so much?

Well, part of the difference is in how the information is displayed, and the displayed number is not necessarily the bottom line. When Fangraphs.com puts fielding runs next to a player on that player's dashboard, the number includes positional value and replacement value. For example, if you go to the Value section of Gardner's page there, you will find that they rated his fielding at -0.5 runs, or half a run below average, which is closer to what Baseball Prospectus gave him. But that -0.5 was offset by a 1.8 positional value, which brings him up to a 1.3.
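For what it's worth, the arithmetic in that last step can be sketched in a few lines of Python. This is only my reading of the Gardner numbers quoted above, not anything Fangraphs publishes as code, and the function name `defensive_runs` is made up for illustration.

```python
def defensive_runs(fielding_runs, positional_adjustment):
    """Add a positional adjustment to raw fielding runs (both in runs)."""
    # Round to one decimal, the precision the numbers are quoted at.
    return round(fielding_runs + positional_adjustment, 1)


# Gardner's 2013 figures as quoted in this post (typed in by hand,
# not pulled from the site).
gardner_fielding = -0.5
gardner_positional = 1.8

print(defensive_runs(gardner_fielding, gardner_positional))  # prints 1.3
```

The point is only that a slightly negative fielding number can turn into a positive defensive number once the positional value is added.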

Baseball-reference.com gives him a -20 total zone runs below average, but if you go to where they do their valuations, they give him six runs above average for his fielding and when they add in the positional scarcity equation, he is given 1.1 dWAR based on 18 runs above average. Are you as confused as me yet?

So despite B-R giving him the highest negative on total zone runs in baseball, his defensive valuation is actually higher on their bottom line and as such, they give him 4.2 rWAR compared to Fangraphs' 3.2 fWAR. He received 2.5 WARP from Baseball Prospectus. That is because B-R uses two different fielding systems. One is total zone runs and the other is BIS. More on this in the next paragraph.

Let's take another example. I wanted to see how Shin-Soo Choo made out this year since it was quite a story before the season started that the Reds would make him their everyday center fielder. This gets to be quite confusing. Baseball-reference.com actually lists two sources for its fielding stats. One is total zone runs (it has a longer name, but that will do), which is provided by BaseballProjection.com, and the other is BIS defensive runs saved above average, which is provided by Baseball Info Solutions.

Choo is given +13 runs above average in total zone runs, the 21st highest in baseball in 2013! But BIS has him at -18 runs saved below average. Choo is given -13.3 fielding runs by Fangraphs and -3.3 by Baseball Prospectus. But again, like Gardner, we need to dig down to the bottom line.

Let's start with Fangraphs. In that site's valuation section, he is given -15.5 for his fielding runs above average. He gets a 1.8 positional adjustment which gives him a total of -13.3 fielding score. Over at Baseball-reference, he gets a +13 for his total zone runs but a -18 for his BIS score. In that site's valuation section, they use the -18 number and then add in three for his positional scarcity but still give him a -1.8 dWAR. Baseball Prospectus gives him a defensive score of -3.2. Yes, that hissing sound is my head about to explode.

The bottom line or WAR score for Choo goes like this: 5.2 fWAR, 4.2 rWAR and 6.1 WARP. Ugh.

Let's do one more: Norichika Aoki of the Brewers. Baseball-reference.com gives him a total zone runs of 26, the third highest of anyone! And his BIS score is 13, or half of the other score. Fangraphs gives him a -3.5. Uh. Baseball Prospectus gave him a -3.8. So at least two out of three agree.

Drilling down again to the value sections, Fangraphs actually lists his fielding score at a positive 3.2, but since he played a corner outfield spot, he takes a sizable negative positional adjustment that brings him down to the -3.5 score and an fWAR of 1.7.

Baseball-reference.com gives him a positive 13 number for his fielding (it seems obvious they use the BIS score) and after knocking him some for his positional scarcity, he ends up with a +0.6 dWAR and a 3.0 rWAR. Baseball Prospectus knocks him all the way back to 1 WARP. With a win valued around $5 million (1.7 fWAR = $8.7 mil in value), that is a $10 million swing in the bottom lines.
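To spell out that swing, here is a rough Python sketch. The $5 million per win is the ballpark figure used above, the WAR values are the three quoted for Aoki, and the name `war_swing` is my own invention for illustration.

```python
DOLLARS_PER_WIN = 5_000_000  # rough value of a win, as assumed in this post


def war_swing(war_by_site, dollars_per_win=DOLLARS_PER_WIN):
    """Return the gap between the highest and lowest WAR, in wins and dollars."""
    values = war_by_site.values()
    spread = max(values) - min(values)
    return spread, spread * dollars_per_win


# Aoki's 2013 bottom lines as quoted above (typed in by hand).
aoki = {"fWAR": 1.7, "rWAR": 3.0, "WARP": 1.0}

spread, dollars = war_swing(aoki)
print(f"{spread:.1f} wins, about ${dollars / 1e6:.0f} million")
# prints: 2.0 wins, about $10 million
```

The gap here is between the highest figure (rWAR) and the lowest (WARP); the Fangraphs number sits in between.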

I am not doing all this to criticize these sites. How can you criticize when you have no idea how the analysis is done? All I am saying is that for an uneducated writer like me, who tries his darned best to use these metrics, the slope is slippery and very, very confusing.
