A couple of times now, I've messed around with numbers to perform some strike-zone analysis for pitchers. At the end of May and again on Thursday, I used numbers available at FanGraphs to try to identify which pitchers and pitching teams have been given the most favorable strike zones, and which have been given the least favorable. It stands to reason that there would be differences, and that these differences would probably be explained in large part (but not completely) by catcher pitch-framing. Good research has gone into pitch-framing before, and it seems like a pretty big deal. It's a subtle way to make a significant difference.
After looking at the pitching numbers, it only made sense to move on to the hitting numbers. Which hitters have gotten the most and least favorable strike zones? Here, you'd expect to see less variation. Hitters will go up against a roughly average sample of pitchers and catchers, including good pitch-framers and bad pitch-framers alike. It wouldn't all cancel out completely, but a lot of it would.
Anyway, there's interest in this data. It's more difficult data to interpret, but it's data worth having regardless. So let's get to it, using the same methodology that I used with pitchers. Pitch and strike information for hitters wasn't available at FanGraphs, so I pulled it from Baseball-Reference, where I didn't know it existed. The numbers below are for the 2012 season, with a plate-appearance minimum of 200.
Let's start with the hitters who've been given the least favorable strike zones so far, in terms of (strikes - expected strikes) per 1000 pitches. A positive number means more strikes than expected. Top ten:
1. Daniel Murphy, +29.0 extra strikes per 1000 pitches
2. Mike Aviles, +20.0
3. Drew Stubbs, +19.5
4. Will Middlebrooks, +18.8
5. Lucas Duda, +18.4
6. Omar Infante, +18.0
7. Travis Hafner, +16.8
8. Ruben Tejada, +16.4
9. Michael Brantley, +16.3
10. Cameron Maybin, +15.3
On the list, we find three different New York Mets, plus two Boston Red Sox and two Cleveland Indians. What does it all mean? Who knows, really! Like I said, this data is harder to explain than the pitcher data. With pitchers, you can identify some probable causes. With hitters, it's trickier.
Let's now look at the hitters who've been given the most favorable strike zones so far. Top ten:
1. Marco Scutaro, -37.6 extra strikes per 1000 pitches
2. John Buck, -31.6
3. Austin Jackson, -30.6
4. Robert Andino, -29.9
5. Bobby Abreu, -29.3
6. Carlos Santana, -28.9
7. J.P. Arencibia, -26.3
8. Alberto Callaspo, -25.6
9. Michael Cuddyer, -25.3
10. Yoenis Cespedes, -24.4
As noted in a previous article, the overall league average is not zero, but rather about -6 strikes per 1000 pitches. This is because there are differences between the PITCHf/x strike zone and the strike zones that actually get called by umpires. PITCHf/x strike zones are not perfectly set for each individual hitter. Human strike zones are also not perfectly set, and in fact vary from pitch to pitch. It's all very complicated and weird.
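For anyone who wants to play along at home, here is a minimal Python sketch of the arithmetic behind these lists. The function names are made up for illustration, and since this post doesn't restate how expected strikes are derived, the sketch simply takes that number as an input. The raw counts in the first example are hypothetical; the rates in the last two come from the lists above.

```python
def extra_strikes_per_1000(strikes, expected_strikes, pitches):
    """Strikes above expectation, scaled to a per-1000-pitch rate."""
    if pitches <= 0:
        raise ValueError("pitches must be positive")
    return 1000 * (strikes - expected_strikes) / pitches


def relative_to_league(rate, league_rate=-6.0):
    """Re-center a rate on the league average (about -6, per the text)."""
    return round(rate - league_rate, 1)


# Hypothetical counts, not real data: 620 strikes against 600 expected
# over 1000 pitches.
print(extra_strikes_per_1000(620, 600, 1000))  # 20.0

# Rates quoted in the lists above, re-centered on the league average.
print(relative_to_league(29.0))   # Daniel Murphy: 35.0
print(relative_to_league(-37.6))  # Marco Scutaro: -31.6
```

Because the league sits at about -6 rather than zero, Murphy's zone looks a little worse relative to his peers than the raw +29.0 suggests, and Scutaro's looks a little less extreme than the raw -37.6.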
Scutaro stands out here, just as Murphy did on the opposite list. Let's compare their called strike zones, with images from Texas Leaguers. Murphy's zone:
And Scutaro's zone:
Complicating matters a little bit is that Scutaro bats righty while Murphy bats lefty. Also, Scutaro is fairly little, while Murphy is not so much. Those are the guys at the extremes.
And now we get to the team-level analysis. Here are all 30 teams, sorted in ascending order of favorability. In other words, the team at the top has been given the least favorable strike zone, and the team at the bottom has been given the most favorable strike zone.
Three individual Mets hitters found themselves in the top ten for least favorable strike zones, so it's not a surprise to find the Mets atop the list here. At the other end are those Rockies, and as with everything Rockies, you wonder whether Coors Field plays some part in this. I don't know how it would, but parks can have all kinds of effects, only some of which are well understood. For example, Progressive Field increases groundballs. Why? Who knows!
This could be meaningful information. Alternatively, it could be practically meaningless information. With pitchers, the spread between the most and least favorable strike zones on the team level was about 36 strikes per 1000 pitches. With hitters, the spread is about 23 strikes per 1000 pitches. We expected the spread to be lower, because hitters aren't facing skewed pools of pitchers and catchers. This is much more about who's pitching and who's catching than it is about who's hitting.
But it's probably not not about who's hitting, to some degree. The hitter presumably matters a little. Are we capturing that here, or are we just capturing a meaningless spread because any pool of data will have a maximum and minimum? I don't know. I'll leave that up to you and your Friday afternoon brain, because no brain is sharper than a Friday afternoon brain in the middle of summer.
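If your Friday afternoon brain wants a starting point, one rough way to attack the "any pool of data has a maximum and minimum" question is to simulate a league where every team has the exact same true rate and see how big the max-minus-min spread gets from noise alone. The sketch below is only that, a sketch. The function name is made up, the per-pitch probabilities and the pitch count per team are hypothetical placeholders rather than measured values, and it uses a normal approximation instead of simulating every pitch.

```python
import math
import random
import statistics


def null_spread(n_teams, pitches_per_team, p_extra, p_missing,
                n_trials=1000, seed=0):
    """Average max-minus-min spread (per 1000 pitches) when every team
    shares the same true rate and only sampling noise differs."""
    rng = random.Random(seed)
    # Each pitch counts +1 (extra strike), -1 (missing strike) or 0.
    mean = 1000 * (p_extra - p_missing)
    # Variance of one pitch: E[X^2] - E[X]^2.
    var_per_pitch = (p_extra + p_missing) - (p_extra - p_missing) ** 2
    # Normal approximation to a team's rate over many pitches.
    sd = 1000 * math.sqrt(var_per_pitch / pitches_per_team)
    spreads = []
    for _ in range(n_trials):
        rates = [rng.gauss(mean, sd) for _ in range(n_teams)]
        spreads.append(max(rates) - min(rates))
    return statistics.mean(spreads)


# Hypothetical inputs: 30 teams, 12000 pitches each, and per-pitch
# probabilities chosen only so the league mean lands near -6 per 1000.
noise_only = null_spread(30, 12000, p_extra=0.030, p_missing=0.036)
print(f"Spread from noise alone: {noise_only:.1f} per 1000 pitches")
```

If the noise-only spread under realistic inputs comes out well below the observed 23, that would hint that something real is going on. If it comes out close to 23, the team list may be mostly a ranking of luck. Either way, the answer depends heavily on inputs that this post doesn't supply.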