Batting Average on Balls in Play, or BABIP, is a simple stat. It measures the percentage of balls put into play that fall for hits. BABIP is often used as a proxy for luck. If very few balls in play happen to fall for hits and the batter has a low BABIP, then the batter has been unlucky, and vice versa. But it’s obviously not so simple. How the ball is hit matters, as does the speed of the runner, as do half a dozen other factors. A year and a half ago, Peter Bendix and Chris Dutton attempted to drill down on these factors, examining things like batted ball data, speed, contact rate and park effects. The result was the creation of an expected batting average on balls in play, or xBABIP. Their research suggested that xBABIP was a very strong predictor of future performance. As they explained in their introductory article:
“The idea is to separate skill from variance. We’ve isolated a batter’s skill at getting hits on balls in play; therefore, we can assume that most deviation in BABIP from our model’s predicted BABIP is likely due to random fluctuation, and therefore unlikely to be repeated.
While our model cannot explain all of the variation in BABIP, we believe that it is an improvement over current explanations of BABIP, as it takes into account many factors that influence a hitter’s BABIP. By finding players who over- and under-performed their expected BABIP, we can further isolate skill from luck, and infer that players such as Mike Aviles are likely to regress and players such as Nick Swisher are likely to improve.”
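To make the quantities in that passage concrete, here is a minimal Python sketch of the conventional BABIP formula, (H - HR) / (AB - K - HR + SF), along with the gap between an actual and an expected BABIP. The function names and the sample season line are my own illustrations, not anything from Bendix and Dutton's model, and the season line is hypothetical rather than a real player's stats.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Conventional BABIP: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


def babip_gap_points(actual, expected):
    """Actual minus expected BABIP, in points (thousandths).

    A negative gap means the hitter is running below expectation.
    """
    return round((actual - expected) * 1000)


# Hypothetical season line, for illustration only.
example_babip = babip(hits=150, home_runs=30, at_bats=550,
                      strikeouts=100, sac_flies=5)

# Gap against a hypothetical expected BABIP of .305.
example_gap = babip_gap_points(example_babip, 0.305)
```

The gap is the piece the authors attribute mostly to random fluctuation. How the expected figure is produced is the hard part, and it is not reproduced here.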
Following the lead of Dave Golebiewski, who used xBABIP to analyze Curtis Granderson, I’m going to be examining several of the current Yankees’ performances thus far. The goal is to explain deviation from expectation and provide a reasonable baseline for future performance. Today I’ll start with Mark Teixeira. At the All-Star Break, Teixeira’s BABIP was .262. His career average is .305, some 40 points higher. What does the basic batted ball data have to say about this?
As the first chart shows, Teixeira’s batted ball components are right in line with his career averages. His line drive percentage, remarkably static over the past 5 years, is firmly at 20%, and he’s registering his second-lowest ground-ball percentage of the past 5 years. Given that line drives and fly balls have fallen for hits 73% and 14% of the time in 2010, respectively, at first glance it appears that Teixeira should have a BABIP, and a batting line, roughly similar to what he has had in the past. What’s keeping his BABIP down?
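The "at first glance" reasoning above amounts to weighting each batted ball type by how often it falls for a hit. A rough Python sketch follows. The 73% and 14% rates and the 20% line drive share come from the figures quoted above; the ground-ball hit rate and the ground-ball/fly-ball split are placeholder assumptions, since they are not stated here. It also ignores home runs and infield flies, so treat it as a back-of-the-envelope check and not as the xBABIP model.

```python
LD_HIT_RATE = 0.73  # 2010 rate for line drives, as quoted in the text
FB_HIT_RATE = 0.14  # 2010 rate for fly balls, as quoted in the text


def component_babip(ld_pct, gb_pct, fb_pct, gb_hit_rate,
                    ld_hit_rate=LD_HIT_RATE, fb_hit_rate=FB_HIT_RATE):
    """Weight each batted ball share by its hit rate and sum the results.

    Shares are fractions of balls in play and must sum to 1.
    """
    if abs(ld_pct + gb_pct + fb_pct - 1.0) > 1e-6:
        raise ValueError("batted ball shares must sum to 1")
    return (ld_pct * ld_hit_rate
            + gb_pct * gb_hit_rate
            + fb_pct * fb_hit_rate)


# 20% line drives is from the text. The 40/40 GB/FB split and the .240
# ground-ball hit rate are ASSUMED placeholders, not figures from the article.
rough_estimate = component_babip(ld_pct=0.20, gb_pct=0.40, fb_pct=0.40,
                                 gb_hit_rate=0.24)
```

Swapping in a hitter's own hit rates by batted ball type, in place of the league rates, shows which component is dragging the total down.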
The culprit is obvious. His BABIP on fly balls is only barely below league average, but his BABIP on ground balls and on line drives is in both cases a solid 50 points below league average. Ground balls aren’t the best for building gaudy triple-slash lines, but line drives usually result in good things, particularly extra-base hits. So Teixeira’s BABIP appears to be held down by some fluctuation on his line drives and his ground balls. Should his expected BABIP be higher? xBABIP data is not readily available for 2009 and 2010, but we can calculate his expected BABIP for both seasons with the Simple xBABIP Calculator. Here are the results:
The Simple xBABIP Calculator isn’t as complex as the original xBABIP work done by Bendix and Dutton (it uses stolen bases to measure speed, rather than Bill James’ Speed Score), but it still does the trick. All indicators point to good news: Teixeira’s low BABIP is due for a correction. As I said in my Midseason Report Card, go ahead and trade for Teixeira in your fantasy league while you still can.