
This is a guest post from friend of the blog Jamal Granger. It is a meticulous piece of research and we are proud to be running it here at TYU.
Endless thanks to Eric Seidman of Baseball Prospectus, who devoted his valuable time to supplying me with the essential data for this post, and introduced me to the wonders of SQL (though, as I begin to immerse myself, I question whether “thanks” is the appropriate term …).
The 1975 Cincinnati Reds were the topic of a recently published book by celebrated sports journalist Joe Posnanski. In the book, titled The Machine: The Story of the 1975 Cincinnati Reds, Posnanski “… captures all of the passion and tension, drama and glory of this extraordinary team considered to be one of the greatest ever to take the field,” says Amazon.com. Prompted by a recent discussion that Mike Francesa had with his listeners on his radio show – Mike’d Up – about the greatest infield-plus-catcher units in baseball history, I decided to take a statistical look at things and discovered that the ’75 Reds arguably boasted the greatest quintet of players ever to take the baseball diamond.
Using weighted Equivalent Average (EqA), total Equivalent Runs (EqR) and Rally’s Wins Above Replacement (WAR), with data for the former two dating back to 1969, a strong argument can be made that Hall of Famers Johnny Bench, Joe Morgan and Tony Perez, and All-Stars Dave Concepcion and Pete Rose, combined not only to lead their 20 teammates to a 108-win season and a World Series victory over the Boston Red Sox but also, statistically, to become the greatest infield-plus-catcher unit, or Diamond Unit, of the past four decades.
While the aforementioned Reds squad may very well be the greatest Diamond Unit in the past forty years, arguments can be made for a handful of other teams. If you go by EqA, the 2009 Yankees are the best; EqR says that the 1974 Reds – with third baseman Dan Driessen in place of the ’75 team’s Pete Rose – beat the bunch; Rally’s WAR has the ’75 version of The Big Red Machine as the alpha dog since 1969. While your opinions may vastly differ from mine, I say that the 1975 Reds are the top unit because WAR factors in all aspects of a player’s production – which is something that EqA and EqR do not.
By WAR, here is the leader board for the best Diamond Units since 1969:
The Year of the Green Wood Rabbit: The 1975 Cincinnati Reds – Morgan’s Magnificence
The 1975 Cincinnati Reds – led by a 12-win season by second baseman Joe Morgan – hit to the tune of a .305 EqA and 504.9 EqR, and produced a grand total of 29.4 WAR, a full three wins above the next closest quintet, the 1976 Reds. Morgan, posting career highs in batting average (.327), stolen bases (67, tied with his ’73 mark), on-base percentage (.466), weighted Runs Created (138.2) and wOBA (.463), was the near-unanimous winner for the first of consecutive NL MVP awards (Charlie Hustle stole two votes), and actually stole more bases (67) than he struck out (52). Also, not only did Morgan’s .360 EqA and 136.9 EqR pace the majors, the next closest qualifier (at least 300 plate appearances) for EqA was the Royals’ John Mayberry (.329).
Following Morgan’s stupefying campaign, Hall of Fame backstop Johnny Bench produced an astounding 6.5-win season, which, amazingly enough, is just the fourth-highest mark of his career. Bench, an MVP candidate in any other year (well, more on that later), did not produce any career-high marks but was part of a tremendous offensive trio of catchers that included Oakland’s Gene Tenace (.316 EqA and 107.4 EqR; why is he not in the Hall?) and St. Louis’s Ted Simmons (.311 and 106.4; another questionable HOF exclusion). Although Bench’s .308 EqA trailed both Tenace and Simmons for the lead amongst MLB catchers, he trailed only Joe Morgan for the team lead in what made a devastating two-three combo in Cincinnati’s lineup.
Pete Rose put up a .317/.406/.432 line in 1975, and his 4.4-win season was just a stepping stone in a 12-year period from 1965-1976 that saw him produce at least four wins above replacement in every season but his 3.6-win campaign in 1970. Rose, known for his trademark hustle on the base paths, produced just two runs above replacement in that regard, and it makes you wonder: how much of that storied hustle actually helped his teams instead of just showing a lot of heart? Earning All-Star and Gold Glove (Total Zone had him as ten runs below replacement, but whatever) honors in 1975, Mr. Hustle was the lone National League player besides Joe Morgan to earn any first-place votes in the MVP race, as his teammate deservedly ran away with the title.
In terms of his non-offensive production, Dave Concepcion was a stalwart – his base running and defense made him produce to a level approaching that of a league-average player (17 RAR). However, Concepcion came to the plate 762 times in 1975, and as his .257 EqA and 64.5 EqR will tell you, he was a below-average hitter in every sense of the term. The beauty of analysis is that everything is relative, and in Concepcion’s case, he was among a group of shortstops (Larry Bowa of the Phillies; Bert Campaneris of the Athletics; Chris Speier of the Giants) that could lay claim to being the best offensive performers of that position in the non-Toby Harrah (.398 wOBA) division.
After enjoying a six-year stretch from 1968-1973 in which his WAR ranged from 4.2 to 6.7, Tony Perez’s 1975 campaign saw him deliver a 3.1-win campaign as the weakest link of the Machine’s Diamond Unit. Although this was in the midst of quite a prolonged decline phase, Perez’s 83.7 EqR and .288 EqA placed him in the top 33 percentile in an environment that saw the Royals’ Mayberry pace the field with a .329 EqA, 124.9 EqR and a robust .427 wOBA.
John Walsh of THT did an interesting study recently in which he looked at trends in OBP at the leadoff spot over time. What he found was a bit strange:

As the data shows, teams have been placing players with below average on-base skills in the leadoff spot for much of the last decade. This exhibits a failure to properly optimize the lineup, as it tends to result in fewer runners being on base for a club’s big hitters. Joe Pawlikowski at RAB touched on the issue of optimizing the top of the lineup yesterday in explaining why Nick Johnson should hit second:
To illustrate this point, let’s take an ideal scenario. Jeter and Johnson both hit in front of Teixeira for all of Teixeira’s plate appearances, and they OBP somewhere around their 2009 totals, .400 and .420. Running a quick percentage check, this means that Teixeira would come to bat with both runners on 16.8 percent of the time, and at least one runner on about 65 percent of the time. Given Teixeira’s 707 plate appearances from 2009, that means he’d come to bat with at least one runner on 460 times, and two runners on 119 times…..
Last year, with Jeter’s .400 OBP and Damon’s .365, Teixeira had a 14.6 percent chance of coming to the plate with both runners on, or 62 percent with at least one runner on…..If Granderson recovers to his 2008 form, he’s essentially a clone of Damon. While that’s good, and while he’ll be able to take extra bases that Johnson will not, I think that the added plate appearances give the Yankees a bigger advantage. It means more opportunities for Tex and A-Rod.
To sum up, Johnson batting second means more opportunities with runners on for Teixeira and Rodriguez. The Yankees need to keep this in mind and avoid the problem Walsh discusses in his study, whereby teams are placing fast players who do not reach base frequently in lineup slots ahead of their big boppers. Rather, they should stack as many high-OBP players in front of Tex and A-Rod as possible. In fact, Dave Pinto suggested that the Yankees should consider batting Johnson 9th as a second leadoff man. This would allow Johnson and Jeter to reach base for power hitters such as Granderson (who would hit second), Tex, and A-Rod. A similar option would be to put Nick Swisher or Granderson 9th and keeping Johnson at #2, which might be a good way to further optimize the lineup and provide as many opportunities as possible for the middle of the order hitters to bat with men on base.
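The percentages in Pawlikowski's example are easy to check. The sketch below is only a rough model: it assumes the two hitters reach base independently, that both always bat directly ahead of Teixeira, and that nobody is erased by a double play, a caught stealing, or a home run. The function name `runner_odds` is my own label, not something from the RAB post.

```python
def runner_odds(obp_first, obp_second):
    """Chance the next hitter bats with both, or at least one, of the two
    hitters ahead of him on base.

    Simplifying assumptions (not from the quoted post): the two on-base
    events are independent, and runners are never removed once they reach.
    """
    both_on = obp_first * obp_second
    at_least_one_on = 1 - (1 - obp_first) * (1 - obp_second)
    return both_on, at_least_one_on


TEIXEIRA_PA_2009 = 707  # plate appearances cited in the quoted passage

lineups = {
    "Jeter + Johnson": (0.400, 0.420),
    "Jeter + Damon": (0.400, 0.365),
}

for label, (first, second) in lineups.items():
    both, one = runner_odds(first, second)
    print(
        f"{label}: both on {both:.1%} (~{TEIXEIRA_PA_2009 * both:.0f} PA), "
        f"at least one on {one:.1%} (~{TEIXEIRA_PA_2009 * one:.0f} PA)"
    )
```

With the unrounded 65.2 percent, the Johnson lineup works out to about 461 plate appearances with a man on rather than 460; the difference is just rounding.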
How would you optimize the lineup?
The headline is an obvious statement, but I had yet to see an actual number put on the gap between starting and relieving until now. Tom Tango said the following:
The replacement level pitcher as a starter has a .380 win%. Move that starter to relief, and his win% goes up by about .09, or .470 win%. That’s it.
The average starter has a win% of .490 and the average reliever has a win% of .520 (more or less, and by win% I mean based on his pythag component ERA). As you can see, the average reliever is not that much better than the replacement-level pitcher as reliever. That’s why we say relievers are a dime a dozen. So, the average starter is +.11 wins per 9 IP and he uses up two-third of the innings. The average reliever is +.05 wins per 9 IP and he uses up one-third of the innings. If you follow along, the average starter gives you twice the value, per inning, as the reliever, and he gives you twice the innings. That sets the value of the average reliever of 25% of the average starter (1/2 times 1/2). This number goes up a little when you add in the leverage impact of relievers.
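Tango's arithmetic can be laid out step by step. This is a minimal sketch using only the win percentages and innings shares he quotes; the variable names are mine, and it ignores the leverage adjustment he mentions at the end.

```python
# Win percentages quoted from Tom Tango above.
REPLACEMENT_STARTER = 0.380
RELIEF_BUMP = 0.09          # gain from moving a starter to the bullpen
AVERAGE_STARTER = 0.490
AVERAGE_RELIEVER = 0.520

# Shares of a team's innings, per the quote.
STARTER_INNINGS_SHARE = 2 / 3
RELIEVER_INNINGS_SHARE = 1 / 3

replacement_reliever = REPLACEMENT_STARTER + RELIEF_BUMP  # about .470

# Wins above replacement per 9 IP for each role.
starter_edge = AVERAGE_STARTER - REPLACEMENT_STARTER
reliever_edge = AVERAGE_RELIEVER - replacement_reliever

# Total value is per-inning edge times share of innings.
starter_value = starter_edge * STARTER_INNINGS_SHARE
reliever_value = reliever_edge * RELIEVER_INNINGS_SHARE
reliever_vs_starter = reliever_value / starter_value

print(f"starter edge per 9 IP:  {starter_edge:.2f}")
print(f"reliever edge per 9 IP: {reliever_edge:.2f}")
print(f"reliever value as a share of starter value: {reliever_vs_starter:.1%}")
```

The exact figures come out a little under the 25 percent Tango cites, because .05 is slightly less than half of .11; his "twice the value, twice the innings" is a convenient rounding.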
When people bring up Joba Chamberlain and suggest he belongs in the bullpen, I frequently explain that starters are significantly more valuable than relievers, such that it makes sense to give him every chance to succeed out of the rotation. Even if Joba is a top reliever and simply an average starter, his value is almost certainly going to be greater taking the ball every five days. Unless he tanks entirely in that role, the “bull in a china shop mentality” and all of that psycho-babble garbage that gets spewed to support moving him to the pen should be viewed as largely irrelevant. The job of the team is to extract as much value as possible from Joba, and having him in the rotation is the best way to do so.
Matt wrote an excellent post this morning about bringing statistics into the mainstream, and I think Chris began to follow through on that with his fascinating post on breaking down UZR. Both posts illustrated that fans now have more information at their hands than ever before, and that we can educate ourselves about the very essentials of the game. However, an interview that I heard this morning on WEEI, with Theo Epstein, reminded me that as fans, we still do not have all of the information:
I think that he (Ellsbury) is an above-average center fielder now, who is going to be a great center fielder. I know there is a certain number we don’t use that is accessible to people online that had him as one of the worst defensive center fielders in baseball last year. I don’t think it’s worth anything. I don’t think that number is legitimate. We do our own stuff and it showed that he is above average.
I think Theo is posturing here a bit, as every single method available to fans for statistically evaluating defense had Jacoby as a poor centerfielder in 2009. He is likely protecting his player and avoiding the talk radio firestorm that would ensue if he called Ellsbury a poor defender. That said, this did bring into focus the fact that clubs do have proprietary systems to determine player value, such that fans do not have equal information to that of the clubs. While proprietary does not necessarily mean better, these clubs have been attempting to hire those at the cutting edge of the industry, such that you would expect them to be at least slightly ahead of the field.
What does this mean? Put simply, it means that the numbers that we use as fans are imperfect, and should be utilized with that in mind. That does not mean that we should not use those numbers to craft our arguments, or that conclusions based on those numbers are faulty. Rather, when the numbers provide shades of grey, it is important to note that they are likely inexact and far from absolute. Furthermore, because the data is imperfect, subjective judgments and evaluations of players should have a place in the discussion. We can argue about how large that place should be, and I would say that it should be minor, but visual observation can occasionally pick up on nuance that is lost in the statistical breakdown.
I recently had the opportunity to talk to the GM of a team that uses sabermetrics extensively, and he told me that the gap between the information that the clubs have and that which the fans have is rapidly closing. That said, the data that the clubs use is far from perfect in and of itself, and the information available to us is certainly no better. We need to be prudent in how we use these numbers, and be careful not to depend on them past their level of reliability. If we do, we become just as ignorant as those who choose to deride sabermetrics.
This morning, John Sickels posted an article in which he suggested that sabermetric analysis has become too granular to be interesting and fresh:
The newest stuff is becoming so granular that I’m having problems making sense of it. I’m a humanities guy, and the most advanced math is beyond my ability to completely comprehend. My personal opinion is that the many of the newest metrics (at least in regards to hitting and pitching) are just more complicated ways to say the same basic truths…..
But I’m finding that as I read the most advanced sabermetric stuff regarding major league players, my eyes glaze over and I start to get the grad school feeling again: why am I reading this? I’m not enjoying it. I want to watch a baseball game.
So am I just entering my dotage prematurely? Or is advanced sabermetric analysis becoming so specialized that no one but physics and math majors can understand it, leaving us humanities majors behind, let alone the average fan? If that is true, what can be done about it? I don’t mean stopping research; obviously it needs to go forward. But I mean, how do we find ways to disseminate the new knowledge and make it comprehensible for the non-math folks among us? How do we integrate and explain the new knowledge?
This article has garnered plenty of interest in the sabermetric community, with two writers at THT responding. First, Pat Andriola:
So when you say that they are “more complicated ways to say the same basic truths,” you are, to an extent, 100% correct. However, the questions that remain are: 1) how much an improvement are we gaining over the basic truths and 2) how valuable are those marginal improvements? Maybe you find these advances boring and trite, but many others (such as myself) don’t. I’m sure there are front offices and analysts that clamor over the newest posts at Fangraphs and The Hardball Times, just like I’m sure you find the latest breakdown of a hot prospect’s swing riveting. These are, ultimately, questions of what gives us the most utility (or satisfaction), and are completely subjective.
Pat is right on the money here, as I have spoken to a number of people within front offices, including one GM, who said that they follow Fangraphs and THT religiously, attempting to get an edge in data analysis and evaluation. These teams find these marginal improvements important, hoping that they provide even the slightest edge over their competition. If the clubs find this sort of analysis important, then it makes sense for an interested fan to be interested as well.
The second article, from Dan Novick, does a fantastic job addressing the idea that sabermetric analysis is boring and too technical:
Baseball writing on the internet is a meritocracy. Sabermetrics isn’t spreading because we say it is. It is spreading because there is an increasing number of fans that find it useful. There is no such thing as “required reading.” If you don’t find a particular aspect of sabermetrics useful anymore, there won’t be any negative repercussions should you choose not to read it.
I could not have said it better myself. If you are a Yankee fan and do not like sabermetrics, you can skip over that sort of article here or at RAB, or ignore those sites entirely. There are so many options and forums for discussion that a fan could likely stick to the most basic of sabermetric precepts and still find a place where he or she can have a reasonable discussion about the sport, and have a fairly decent understanding of value and related concepts. If you are a creator of content, you can ignore sabermetrics as well, and cater to a less stat-obsessed crowd. No one is being forced to use sabermetrics. If you do not like them, just ignore them. It really is that simple.
Sickels is not “anti-stat,” and I doubt that he would suggest people ignore sabermetrics entirely. He was simply raising a reasonable point. Do you agree with him?
Hey all; I’ve put together a fantasy baseball league for the writers and readers of TYU and I’d love for you to join. Right now, E.J. and I are in and the maximum for the league is 12. The draft is currently scheduled for March 15th (Monday). It’s a league from Yahoo! and the ID# is 171636 and the password is simply “tyu” without the quotes.
The hitters’ categories are: R, RBI, HR, AVG/OBP/SLG/OPS, SB%.
The pitchers’ categories are: IP, W, L, S, K, ERA, WHIP, K/BB, BS.
If you’re interested, join on up, we’d love to have you!
While we’re on the subject, I’d like to talk about fantasy baseball. Some people may think that Fantasy Baseball is more of a detriment to the game, that it takes people away from the “reality” of the game and puts the focus on the numbers rather than on the players. In my experience, nothing could be further from the truth. Fantasy Baseball has done a ton to help me get even more into baseball. It makes me research players and try to learn something about players and teams that aren’t the Yankees. Because of Fantasy Baseball, I watch MLB.tv to see other players perform. The game requires that the owner of the team find something out about a variety of players and get a more “global” view of Major League Baseball, instead of keeping a focus on his or her own team.
Fantasy Baseball also helped get me more into the analytical side of the game. Through Fantasy Baseball, I became more and more interested in the world of sabermetrics. Curious, I walked down that path and I’m glad that I did. Some may argue that this gets me farther away from the game, but I obviously disagree. To me, the numbers tell the story of every game in incredible detail. I can see exactly what some player did at exactly a certain time. I can match that numerical story to the one I experienced when watching, listening to, or attending the game. The numbers make the story of the game complete; they fill in the blanks. Since getting deeper and deeper into Fantasy Baseball and advanced baseball metrics, my love for the game has only grown.
Fantasy Baseball has made me a more educated fan who’s been able to experience the wonderful game of baseball from a variety of angles. It has made me become even more engrossed in my favorite sport, favorite team, and favorite players. I’d recommend it to anyone who wants to get immersed in the sport. So, please, join up and enjoy.
With the announcement coming down just last night that Fangraphs has added splits to their stat pages, I thought it would be fun to look at interesting 2009 splits for each likely member of the 2010 Yankees. I looked at hitters this morning, and will now address starting pitchers, with relievers to follow at some point tomorrow. I will likely expand on some of these over the next few weeks. Remember, when you do splits, you are essentially splitting the sample, such that small sample size caveats apply.
CC Sabathia
FIP v. L: 2.43
FIP v. R: 3.69
CC against righties is a very good pitcher, but likely not a Cy Young candidate. His dominance against lefties is what makes him such a dangerous weapon. Much of the difference in performance comes from his significantly better K-rate against left handed batters (9.94 v. 7.02). I think it is interesting to note that CC would be a well above average pitcher even if he only faced righties.
AJ Burnett
Bases Empty: WHIP 1.63, BABIP .348, FIP 4.62
Men On Base: WHIP 1.16, BABIP .250, FIP 4.03
AJ was significantly better once runners were on than he was with the bases empty, apparently buckling down once he got into trouble. However, as the BABIP suggests, he was quite unlucky with the bases empty and was very lucky once men reached. If both issues correct themselves, he should be slightly worse with runners on but will face fewer such situations due to an improvement with the bases empty.
Andy Pettitte
Home FIP: 4.67
Road FIP: 3.59
Pettitte had some major problems pitching in the new stadium, a fact that is reflected in his results. This is despite the fact that as a left-hander, he should have the tools to partially neutralize the effects of the ballpark. He gave up more line drives and more flyballs on the road, but significantly more of the fly balls allowed at home left the park (13.2% v. 5.1%).
Javier Vazquez
Bases Empty: 1.61 K/BB, .67 HR/9, 1.06 WHIP, 2.40 FIP
Men on Base: 2.12 K/BB, 1.06 HR/9, .98 WHIP, 3.34 FIP
As we have discussed at numerous points this offseason, Vazquez has, for much of his career, had difficulty pitching from the stretch. It is fascinating to note that his 2009 WHIP was lower with men on. However, his control seemed to get worse in those situations, and he gives up a lot more homers in those spots. Basically, Vazquez gives up his biggest blows with runners on base, which is why his ERA is usually worse than his FIP.
Joba Chamberlain
BABIP by Month (LD% in parentheses)
Apr. .295 (23.9)
May .371 (25.5)
June .290 (15.5)
July .269 (23.6)
Aug. .374 (23.1)
Sep. .348 (19.3)
Rob at BBD did a study on Chamberlain’s velocity today, and found that his terrible August and September numbers could not be attributed to a loss in velocity. One possible explanation is what you see above. Joba’s BABIP in those two months was sky high, and could not be entirely explained by his LD%. It is possible that Joba was simply unlucky down the stretch.
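One rough way to compare BABIP against line-drive rate is the old rule of thumb that expected BABIP is about the line-drive rate plus .120. That rule is my assumption here, not something taken from the BBD study or from Fangraphs, and it is crude at monthly sample sizes. With that caveat, a minimal sketch using the figures above:

```python
# Monthly (BABIP, LD%) pairs from the table above.
joba_2009 = {
    "Apr": (0.295, 23.9),
    "May": (0.371, 25.5),
    "June": (0.290, 15.5),
    "July": (0.269, 23.6),
    "Aug": (0.374, 23.1),
    "Sep": (0.348, 19.3),
}

# ASSUMPTION: rule-of-thumb offset (expected BABIP ~ LD rate + .120).
LD_OFFSET = 0.120


def babip_gap(babip, ld_pct):
    """Actual BABIP minus the rule-of-thumb expectation from LD%."""
    expected = ld_pct / 100 + LD_OFFSET
    return babip - expected


for month, (babip, ld_pct) in joba_2009.items():
    gap = babip_gap(babip, ld_pct)
    print(f"{month:>4}: BABIP {babip:.3f}, LD% {ld_pct:.1f}, gap {gap:+.3f}")
```

A positive gap means more balls in play fell for hits than the line-drive rate alone would suggest. By this yardstick August and September come out positive, while April and July come out well below expectation, so the rule should be read as a sanity check and nothing more.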
With the announcement coming down just last night that Fangraphs has added splits to their stat pages, I thought it would be fun to look at interesting 2009 splits for each likely member of the 2010 Yankees. I will look at hitters now, and address pitchers later today. I will likely expand on some of these over the next few weeks. Remember, when you do splits, you are essentially splitting the sample, such that small sample size caveats apply.
Jorge Posada
Home: wRC+ : 167
Away: wRC+: 101
For those that are not aware, wRC+ is the Fangraphs version of OPS+, and is likely a better measure because it corrects the OBP/SLG weighting problem inherent to OPS. Regarding Posada, I was surprised to see how stark his home-road splits were, considering that he is a switch hitter and is not a dead pull hitter. He certainly made use of the short porch, notching a 271 wRC+ when batting as a lefty and hitting the ball to right field.
Mark Teixeira
Grounders: .187/.187/.214
Fly Balls: .327/.320/.991
Liners: .747/.747/.939
According to Fangraphs, league average in these categories:
Grounders: .231/.231/.253
Flies: .217/.212/.602
Liners: .727/.723/.974
Teixeira did significantly better than average on flies and worse than average on grounders. The ground ball data suggests he needs to keep the ball in the air, but I wonder about the flyball data. It may be possible that shots that would qualify as liners in other parks are being ruled flies when they clear the wall in Yankee Stadium, such that much of his power is being shifted from the liner category to the fly ball category.
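To see where the power shows up, it helps to convert those slash lines to isolated power (ISO, which is slugging minus batting average). The sketch below uses only the numbers listed above; the dictionary layout and function name are mine.

```python
# (AVG, OBP, SLG) by batted-ball type, from the figures above.
teixeira = {
    "grounders": (0.187, 0.187, 0.214),
    "flies": (0.327, 0.320, 0.991),
    "liners": (0.747, 0.747, 0.939),
}
league = {
    "grounders": (0.231, 0.231, 0.253),
    "flies": (0.217, 0.212, 0.602),
    "liners": (0.727, 0.723, 0.974),
}


def iso(slash_line):
    """Isolated power: slugging percentage minus batting average."""
    avg, _obp, slg = slash_line
    return slg - avg


for kind in teixeira:
    tex_iso = iso(teixeira[kind])
    lg_iso = iso(league[kind])
    print(f"{kind:>9}: Teixeira ISO {tex_iso:.3f}, league {lg_iso:.3f}, "
          f"difference {tex_iso - lg_iso:+.3f}")
```

Teixeira's ISO on flies is far above league average while his ISO on liners is below it, which is at least consistent with the idea that some of his hardest-hit balls are being scored as flies rather than liners.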
Robinson Cano
Low Lvg. FB% 30.2
Med Lvg. FB% 34.1
High Lvg. FB% 48.1
The more important the situation, the more likely Robbie was to hit a fly ball. This strengthens my belief that he is trying to do too much in those spots. It is important to note that players only have 60-80 high leverage at bats a year, such that the sample is small. As such, take this more as an observation of what happened last year than something that necessarily represents a trend.
Derek Jeter
ISO to Left: .105
ISO to Center: .082
ISO to Right: .278
Almost all of Jeter’s power was to the opposite field. That is a startlingly large split in power, and was a greater dichotomy than that in Jeter’s career ISO.
Alex Rodriguez
Low Lvg. HR/FB: 20.3
Med Lvg. HR/FB: 21.1
High Lvg. HR/FB: 45.5
A-Rod hit flyballs with about the same frequency in all situations. However, when the game was on the line, he took the ball out of the ballpark with much greater frequency. Unclutch, indeed.
Nick Johnson
Low Lvg. BB/K 1.02
Med Lvg. BB/K 1.29
High Lvg. BB/K 1.42
Johnson did well in high leverage spots overall, but I found his increased patience in those spots fascinating. When the situation was important, Johnson became more likely to strike out, but also more likely to take a walk.
Nick Swisher
Home ISO: .168
Away ISO: .316
Most of Swisher’s power came on the road, despite the New Yankee Stadium being a homer haven. If he can maintain something close to his road performance while bumping his home power a bit, he could find himself at 35 or more home runs.
Curtis Granderson
Home HR/FB: 8.9%
Away HR/FB: 15.7%
Granderson simply did not get much bang for his buck on fly balls in Comerica. His road numbers were significantly better than his home numbers, particularly against lefties, giving hope that he might return to the superstar that he was in 2007 once he gets out of the large ballpark in Detroit.
Brett Gardner
wRC+ v. L 115
wRC+ v. R 93
Gardner actually played fairly well against lefties. If he continues that and Granderson is not able to turn it around against lefties, might Randy Winn become the platoon caddy for Granderson rather than Brett?
Randy Winn
wRC+ v. R 102
wRC+ v. L -9
Of course, if Winn cannot turn this around, he will not be caddying for anybody. He has pretty solid career numbers against lefties, so this seems to be an anomaly, but he did hit significantly fewer line drives and more fly balls against lefties, both bad signs.

Steve Goldman recently wrote an interesting post over at Pinstripe Bible about the Yankee lineup and the best place in it for Robinson Cano:
A career .306/.339/.480 hitter, Cano freezes up with runners on base. This was clearly demonstrated last season, when he batted only .255/.288/.415 with men on and .207/.242/.332 with runners in scoring position. Conversely, leading off an inning he hit an incredible .441/.459/.797. Batting with the bases empty, he hit .376/.407/.609. While Cano hasn’t been this extreme every year, he has been fairly consistent in this regard. He’s a career .256/.291/.398 hitter with runners in scoring position, .280/.312/.425 with men on, and .331/.363/.528 with the bases empty.
This doesn’t mean that Cano isn’t a good hitter, but that he simply has limitations. To get the most out of Cano, a manager might keep him out of RBI spots. Now, when you have one of the best offenses in baseball, your whole batting order is an RBI spot. That’s why the second spot in the order is a place he might prosper. Even if the Yankees get another .400 OBP from their leadoff man, Cano would be batting with the bases empty 60 percent of the time, do his best hitting, and be on base for Mark Teixeira, A-Rod, et al. The downside is that you might get a few extra Cano double-play specials when the leadoff man does reach base.
Basically, Goldman suggests that the 2 slot in the order would be a good fit for Cano, being that it is not an “RBI spot” and would maximize what you can get from him. I think this idea has two flaws. Firstly, Cano will only be batting with the bases empty 60 percent of the time in his first at bat. After that at bat, all subsequent at bats will likely have the guys at the bottom of the order hitting before him as well as the leadoff man, meaning he will be in more RBI spots than Goldman suggests.
More importantly, the Yankees should not be ordering their lineup to do what is best for Cano while disregarding what is best for the club. I am quite certain that Cano’s career .339 OBP makes him a bad fit in the 2 spot, as you want someone in that slot to reach base for A-Rod and Teixeira. Rigging the lineup to help Cano in a way that will hurt the two sluggers does not seem like a great plan.
Furthermore, Goldman’s overall point presupposes the idea that we should expect Cano to continue to fail in “clutch” spots going forward simply because he has done so in the past. To steal a thought from Fack Youk, there’s a big difference between “hasn’t” and “can’t”. Just because Cano has not been able to perform as well with runners on in the past does not mean that he cannot. As Greg at Pending Pinstripes notes:
It is very evident that, to date, Cano has been very unclutch in his career. This doesn’t signal that he will be unclutch going forward. Another conclusion on clutch hitting from The Book is:
For all practical purposes, a player can be expected to hit equally well in the clutch as he would be expected to do in an ordinary situation.
This thought made me curious as to whether there was something changing in regard to Cano’s approach with runners on base that we could point to and say, “That is why he fails in the clutch.” Thankfully, SG over at RLYW looked at this issue recently. He examined luck factors, batted ball data, and pitch type, and found the following:
Honestly, I expected to see more of a split here in the underlying data, but it’s just not there. Cano’s results to this point with runners on base are markedly worse than his results with the bases empty, but it’s not because of any obvious change in his approach in the two scenarios, unless I’m missing something here or not considering something that I should be. I guess this is encouraging, because it means we really shouldn’t have any reason to think that Cano will continue to hit as poorly with men on base as he has so far.
Greg at PP had similar results in his study linked above, suggesting that nothing in the observable data reflects a change in approach by Cano with men on base. I would like to put forth an alternative theory, although I do not have much evidence to support it due to my inability to split certain data sets into bases empty v. men on base sections.
After Cano’s awful 2008, I made the following assertion:
Cano was flying open and jerking his head, leading to a multitude of soft popups. Rather than take those pitches up the middle or the other way, Robbie played into the pitcher’s hands by attempting to pull everything. Bad mechanics, rather than bad luck, were what killed Robinson Cano’s 2008.
This point was supported by Pitch F/x research done by Josh Kalk and the batted ball and swing data, and I am quite confident in its accuracy. Cano bounced back in 2009, and the data showed me the following:
Cano continued to expand his zone in 2009, but was more comfortable going with the pitch on the outer half. In fact, he made even more contact on those pitches than usual, leading to him striking out less. Increased and better contact on those pitches led to more of his fly balls leaving the ballpark than in the previous season, meaning he finally saw the benefits of trading ground balls for fly balls. New Yankee Stadium certainly helped, but his IsoP was almost as good on the road as it was at home. To sum up, I believe that Cano saw a BABIP increase because he was making better and more consistent contact on pitches on the outer half and out of the strike zone, leading to more homers and general power on fly balls than he got last season. Kevin Long worked particularly hard with Cano in the offseason regarding reaching that ball on the outer edges and going the other way with it, and I believe it paid off.
Cano’s spray charts, linked in the 2009 post, suggest that this interpretation of Cano’s performance has some merit to it. To state my conclusion succinctly, I believe Cano’s poor 2008 was the result of attempting to pull everything, and that his turnaround was the result of a focused attempt to take pitches on the middle and outer portions of the plate the other way.
What does this have to do with our discussion of clutch? Well, I would posit that Cano’s 2008 tendency to pull the ball may resurface whenever there are runners on base. As Kevin Long and most other coaches would tell you, a player who attempts to pull everything is simply trying to do too much, attempting to change the entire game with one swing. That mindset snowballed on Cano in 2008, as the more he struggled, the more he attempted to alter things by crushing the ball. It may be possible that he always has that “trying to do too much” mindset when there are runners on base, and therefore fails to focus on taking pitches the other way and gets pull-happy. If this is in fact the problem, some more work with Kevin Long might be able to solve it. (Anecdotally, because it does not really mean much in the way of proof, I would like to note that Cano’s 2008 numbers and his career numbers with men on base are very similar.)
This is simply a theory, and I myself am not entirely convinced of it. I would just as soon believe that there is absolutely nothing behind Cano’s struggles with men on base, and that we should expect him to perform to his overall career averages regardless of the situation going forward. However, if you do believe that something must be changing with runners on base, I think this is as good a theory as any, and does have some factual underpinnings in terms of the 2008 data.
What are your thoughts on the issue?

Those of you who comment over at RAB might know that I have been slowly backing away from UZR for the last few months as I have learned more about the metric. I still use it, but with significantly more caution than in the past. My primary objection to the way it is used matches the concerns of Kris Liakos at Walkoff Walk:
With a typical outfielder getting at least twice as many plate appearances as he does defensive chances, when reading advanced stats we’ve all learned to trust a single season’s offensive numbers to paint an accurate portrait of a player, but to factor 3 years of defensive numbers. It makes sense, but it doesn’t solve the perception problem of simply looking at a guy’s UZR numbers and trying to figure what kind of fielder he is right now. Common sense tells you that you can’t simply average the numbers since each year will have a different number of defensive chances, and while the reality of a player that has posted a -14.2, +10, -6.9, +12.1 is that he’s got average range and average arm, it doesn’t look that way on the page.
So to solve the perception problem and stop dummies like me from misunderstanding/misinterpreting the meaning of UZR, I have a humble proposal. Do away with the year-to-year UZR rating of a player, and replace it with a single career number. Beginning in a player’s second year the problem of small sample size will start to dissipate and on a single look you’ll be able to make a quick judgement on just how much, or how little, he’s able to do in the field. To account for diminishing skills with age or trouble playing in a new park, each player’s career UZR can have a little up or down arrow like the Beckett Price Guides of old representing whether his number has gone up or down in his last 400 defensive chances.
To sum up, the problem with the way many use UZR is that they cite single-season UZRs and base other statistics, such as WAR, on that number. However, the sample is too small to know whether a single-season UZR represents an actual fluctuation in performance or simple sampling error. As such, a single-season UZR is, at best, a rough estimate, yet it is treated as a much stronger indicator of defensive performance by fans and bloggers alike.
Now, Kris’s solution is that we simply replace UZR with career UZR in our analysis, with an up or down arrow representing recent trends to provide some more immediate context. However, I think that this system would lose so much specificity that the statistic could not be used to measure current value in any real way. Instead, I propose a system whereby a 450-game average is computed each year, with more weight given to more recent seasons. To illustrate, a 2009 UZR would include the player’s last 450 games, presumably spanning 2009, 2008, and 2007, with 2009 given more weight in the computation than 2008, 2008 more than 2007, and so on. Essentially, each “single-season UZR” is in actuality a composite of the player’s last 450 defensive games, which is a healthy enough sample from which to draw conclusions, yet provides specificity and context by using the most relevant recent sample and giving greater weight to the most recent season. Single-season UZR could still be utilized, but as a secondary measure used with careful reservations in the context of the three-year number.
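To make the proposal a bit more concrete, here is a rough Python sketch of how such a composite might be computed. Several things in it are my own assumptions rather than part of any established method: the 5/4/3 recency weights, the choice to prorate a season that only partly falls inside the 450-game window, and the decision to express the result per 150 games. The names `Season` and `composite_uzr` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Season:
    year: int
    games: int   # defensive games played that season
    uzr: float   # single-season UZR, in runs


def composite_uzr(seasons, window=450, weights=(5, 4, 3), per_games=150):
    """Recency-weighted UZR over the most recent `window` games.

    Assumptions (not from any published UZR method): the weights are
    5/4/3 from newest to oldest season, a season that only partly fits
    in the window is prorated at its per-game rate, and the result is
    scaled to `per_games` games.
    """
    recent_first = sorted(seasons, key=lambda s: s.year, reverse=True)
    remaining = window
    weighted_runs = 0.0
    weighted_games = 0.0
    for i, season in enumerate(recent_first):
        if remaining <= 0:
            break
        if season.games <= 0:
            continue  # nothing to learn from a season with no games
        used = min(season.games, remaining)
        # Seasons older than the weight list reuse the smallest weight.
        weight = weights[min(i, len(weights) - 1)]
        rate = season.uzr / season.games  # runs per game that season
        weighted_runs += weight * used * rate
        weighted_games += weight * used
        remaining -= used
    if weighted_games == 0:
        return 0.0
    return per_games * weighted_runs / weighted_games


# A made-up fielder, not real data.
example = [
    Season(2009, 150, 10.0),
    Season(2008, 150, -6.0),
    Season(2007, 150, 3.0),
]
print(round(composite_uzr(example), 2))
```

For this invented player, whose single-season numbers bounce from +3 to -6 to +10, the composite comes out to roughly +2.9 runs per 150 games: slightly above average, leaning toward the most recent year without being carried away by it. Different weights would shift the number, which is exactly the sort of thing I would want feedback on.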
I would love some feedback on this idea. Chime in below.
Graphic Credit: Beyond the Boxscore.com
