Jan 11 2010


Those of you who comment over at RAB might know that I have been slowly backing away from UZR for the last few months as I have learned more about the metric. I still use it, but with significantly more caution than in the past. My primary objection to the way it is used matches the concerns of Kris Liakos at Walkoff Walk:

With a typical outfielder getting at least twice as many plate appearances as he does defensive chances, when reading advanced stats we’ve all learned to trust a single season’s offensive numbers to paint an accurate portrait of a player, but to factor 3 years of defensive numbers. It makes sense, but it doesn’t solve the perception problem of simply looking at a guy’s UZR numbers and trying to figure what kind of fielder he is right now. Common sense tells you that you can’t simply average the numbers since each year will have a different number of defensive chances, and while the reality of a player that has posted a -14.2, +10, -6.9, +12.1 is that he’s got average range and average arm, it doesn’t look that way on the page.

So to solve the perception problem and stop dummies like me from misunderstanding/misinterpreting the meaning of UZR, I have a humble proposal. Do away with the year-to-year UZR rating of a player, and replace it with a single career number. Beginning in a player’s second year the problem of small sample size will start to dissipate and on a single look you’ll be able to make a quick judgement on just how much, or how little, he’s able to do in the field. To account for diminishing skills with age or trouble playing in a new park, each player’s career UZR can have a little up or down arrow like the Beckett Price Guides of old representing whether his number has gone up or down in his last 400 defensive chances.

To sum up, the problem with the way many use UZR is that they cite single-season UZRs and base other statistics, such as WAR, on that number. However, the sample is too small to know whether a single-season UZR represents an actual fluctuation in performance or simple sampling error. As such, a single-season UZR is, at best, a rough estimate, yet fans and bloggers alike treat it as a much stronger indicator of defensive performance.

Kris’s solution is to replace single-season UZR with career UZR in our analysis, with an up or down arrow marking recent trends for more immediate context. However, I think a career number is too unspecific to measure a player’s current value in any real way. Instead, I propose computing a 450-game average each year, with more weight given to more recent seasons. To illustrate, a 2009 UZR would cover the player’s last 450 games, presumably spanning 2009, 2008, and 2007, with 2009 weighted more heavily than 2008, 2008 more than 2007, and so on. Essentially, each “single-season UZR” would in actuality be a composite of the player’s last 450 defensive games. That is a healthy enough sample from which to draw conclusions, yet it stays specific and current by using the most relevant recent sample and giving the greatest weight to the most recent season. Single-season UZR could still be used, but as a secondary measure, read with careful reservations in the context of the 450-game (roughly three-year) number.
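
For anyone who wants to play with the idea, here is a rough Python sketch of the 450-game composite. It is only one way to do it, and several pieces are assumptions of mine rather than anything settled above: the `Season` record, the `decay` factor (each older season counts 0.8 times as much per game as the season after it), prorating a season that only partly fits in the window, and reporting the result as runs per 150 games.

```python
from dataclasses import dataclass


@dataclass
class Season:
    year: int
    games: int    # defensive games played that season
    uzr: float    # single-season UZR, in runs


def weighted_uzr(seasons, window=450, decay=0.8, per_games=150):
    """Recency-weighted UZR rate over the player's last `window` games.

    `decay` is a hypothetical knob: every step back in time multiplies
    a season's per-game weight by this factor.
    """
    remaining = window
    weighted_runs = 0.0
    weighted_games = 0.0
    weight = 1.0
    # Walk backwards from the most recent season.
    for s in sorted(seasons, key=lambda s: s.year, reverse=True):
        if remaining <= 0:
            break
        if s.games <= 0:
            continue
        used = min(s.games, remaining)
        # Prorate the season's runs if only part of it fits in the window.
        runs = s.uzr * used / s.games
        weighted_runs += weight * runs
        weighted_games += weight * used
        remaining -= used
        weight *= decay
    if weighted_games == 0:
        raise ValueError("no games in window")
    # Express the weighted per-game rate as runs per `per_games` games.
    return weighted_runs / weighted_games * per_games
```

Setting `decay=1.0` gives a plain unweighted 450-game average, which makes it easy to see how much the recency weighting actually moves a player’s number.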

I would love some feedback on this idea. Chime in below.

Graphic Credit: Beyond the Boxscore.com

12 Responses to “Being Smart On UZR”

  1. awesome idea. mathematical brilliance. UZR on its own is just too simple to really judge a player’s defensive skills.

    mendel Reply:

    you’re right. it’s not a perfect measure of a player’s defensive ability. there is no way to take into account all the other intangibles of defensive skill

  2. You might try an exponential filter on the year to year UZR. This is one of the simplest filters for time series data, used for example in real-time feedback control systems, such as paper mills, to control quality parameters. In this filter one takes the most recent value and averages it with the value of the filter from the time before the most recent data point. So the 2009 filtered UZR would be the 2009 UZR averaged with the 2008 filtered result. The 2008 filtered result is itself an average of 2008 data plus the 2007 filtered result. In this way the latest data has the largest influence but all the old data is there as a moderating influence.

    This filter is used in exactly the same scenarios as UZR – where the input data is “noisy” for one reason or another (in this case under-sampling). If you wish to be more sophisticated you can apply the weighting factor “alpha” to select the balance you prefer between the latest data and earlier data. Alpha can be a number like 0.8, in which case the latest year’s data is multiplied by 0.8 and the previous filtered result by (1-0.8 = 0.2), to weight the more recent data more heavily. Or vice versa, with alpha less than 0.5. Here is a link to a description of the exponential filter: http://www.statistics.com/resources/glossary/e/expfilt.php
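
    As a minimal sketch of the filter described above (not a tested model), in Python: `alpha` is the weight on the newest year, and `alpha=0.5` is the plain averaging case. Seeding the filter with the first season’s raw value is an assumption, and the sample numbers are borrowed from the hypothetical player quoted in the post, assumed here to be listed oldest first.

```python
def exponential_filter(values, alpha=0.5):
    """Exponentially smooth `values`, ordered oldest to newest.

    alpha is the weight on the newest observation; the previous
    filtered value gets weight (1 - alpha).
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    filtered = []
    for x in values:
        if not filtered:
            # Assumption: start the filter at the first observation.
            filtered.append(x)
        else:
            filtered.append(alpha * x + (1.0 - alpha) * filtered[-1])
    return filtered


# Hypothetical yearly UZR line, oldest season first.
uzr_by_year = [-14.2, 10.0, -6.9, 12.1]
smoothed = exponential_filter(uzr_by_year, alpha=0.5)
for raw, filt in zip(uzr_by_year, smoothed):
    print(f"raw {raw:+6.1f}   filtered {filt:+6.1f}")
```

    The last filtered value is the one you would quote as the player’s current number; raising `alpha` makes it react faster to the latest season, lowering it makes it steadier.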

    Moshe Mandel Reply:

    That is a very interesting addition. If I am reading it correctly, it would be similar to using weighting, but it would do a better job of keeping all the inputs properly in context by “filtering” each data point. If I have the time this semester, I might try and work more on this to come up with a decent model. Not sure about how much time I will need and have, so we will see.

  3. I’ve backed off UZR mainly because you, and I think Steve, made these concerns known, so I think this is great.

  4. Well done, Moshe. UZR is so over-used. From what I’ve heard, there are better defensive statistics emerging.

    Chris H. Reply:

    Like what? Just curious, since I’m pretty UZR-dependent.

    lenNY's Yankees Reply:

    Not sure exactly. I just heard ESPN is coming out with stats that are calculated through cameras. Supposedly this year too, but I don’t know when it will be ready.

    Moshe Mandel Reply:

    Field F/x and stuff like that is supposed to advance things way past where we are now. If we can accurately measure ball speed and spin, as well as distance travelled, we can really break some barriers in measuring defense.

  5. As a general rule a fielder gets as many opportunities in a game as a hitter gets plate appearances, correct? Why does a season not offer enough fielding opportunities to judge a player’s UZR, but it does offer enough PAs to judge a player’s OBP? Is the standard deviation that much higher for fielding?

    My issue is that I don’t like 3-year averages, as defense can deteriorate quickly, especially at the end of a player’s career.

    Moshe Mandel Reply:

    No, that’s not correct as far as I know. Fielders get at best about half as many defensive chances per game as hitters get PAs. As such, 300 games should be enough, but I expanded to 450 because defensive stats are put together by stringers and end up with more errors in the sample than offensive stats, which are fairly concrete, objective events.
