Tuesday, March 09, 2004

Statistics as probabilities, not laws

I'm generally a pretty stats-friendly guy. You won't see me quoting RBIs or W-L records around here, and I'm quite familiar with the predictive value of certain metrics. I nearly majored in engineering, so I've done my fair share of Excel spreadsheets. So this is by no means some anti-stathead rant.

However, it's important for anyone who works with those metrics to understand their predictive value in terms of probability, not certainty. We see this all the time in baseball. Mediocre players have break-out years. Dominating teams lose to abysmal teams. NL pitchers get hits.

Some may look at instances like these and conclude that for all the noise the statheads make, they really don't know anything ("I don't care what the numbers say..."). I think this is asinine. There's no reason not to look at information in order to make a good decision or form a good opinion. Most of the time, ignoring the numbers is just laziness.

But predictive metrics aren't laws of nature. They do their best to extrapolate from the record of past events, but they don't govern future events. There are reasons for a player's declining production--his reaction time slows with age, for example--but those aren't the same thing as a downward trend in OBP. It isn't the trend in OBP that makes the player get worse--that trend is simply an indicator of it.

Another problem with predictive metrics is that they are only legitimate when drawn from a large enough sample size. Because the data is drawn (or "are drawn" if you're from the UK) from a large pool of performances, statistically based predictions test their validity against that large pool. Good metrics hold true for most performances. But therein lies a limitation: because the metrics are based upon the data of all players, they tend to assume all players behave similarly. It's impossible to make a customized predictive metric for each individual player because the sample size is too small. We have to be careful when we claim that "Player A will do such-and-such because of metric X," because we don't always have a good grasp on how closely that player conforms to the model that has been drawn from the composite of all performances.
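To put rough numbers on the sample-size point, here's a back-of-the-envelope sketch. It assumes every plate appearance is an independent coin flip with the same chance of reaching base, which real baseball certainly isn't, and the .340 OBP and the plate-appearance counts are made up for illustration. The function name is mine, not anything standard.

```python
import math

def obp_interval(obp, plate_appearances, z=1.96):
    """Rough 95% range for an observed on-base rate (normal approximation).

    Assumes independent plate appearances with a constant on-base chance.
    """
    std_err = math.sqrt(obp * (1 - obp) / plate_appearances)
    return obp - z * std_err, obp + z * std_err

# Hypothetical samples: a hot month, a full season, a league-sized pool.
for pa in (50, 600, 180_000):
    low, high = obp_interval(0.340, pa)
    print(f"{pa:>7} PA: {low:.3f} to {high:.3f}")
```

The range shrinks with the square root of the sample, so one player's season leaves a lot more room for doubt than the league-wide pool the metric was built from.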

Exceptions to the rule are usually dismissed as "outliers," which is a perfectly valid argument. Averages are, well, averages. But "outlier" performances are always a possibility. Failure to acknowledge this is just as asinine as failing to look at the data.

A really good example of this is in the applications of Park Effects. Most baseball fans will acknowledge that some parks are generally easier to hit in than others. By looking at every performance in every ballpark, we can draw conclusions about how offense is likely to be helped or hindered due to the park. But it gets very difficult when we start applying those park effects to individual players. If Safeco Field generally reduces offense by 12%, we can draw some general conclusions about what new players are likely to face. But it's hard to say with any certainty, for example, whether Raul Ibanez will have splits like Mike Cameron (extreme) or Randy Winn (very little difference). The park effect isn't a law of nature; it's an identified pattern.
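Here's a toy simulation of why individual splits are so hard to pin down. Everything in it is an assumption for illustration: I apply the 12% reduction straight to on-base rate (real park factors are usually stated in runs), give every simulated hitter the same .340 road OBP and 300 plate appearances each at home and on the road, and treat each plate appearance as an independent coin flip. No real player data is involved.

```python
import random

def simulate_split(road_obp, park_factor, pa_per_side, rng):
    """Observed home/road OBP ratio for one hypothetical player-season."""
    home_rate = road_obp * park_factor
    home_on_base = sum(rng.random() < home_rate for _ in range(pa_per_side))
    road_on_base = sum(rng.random() < road_obp for _ in range(pa_per_side))
    # Equal plate appearances on each side, so the ratio of counts
    # equals the ratio of on-base rates.
    return home_on_base / road_on_base

rng = random.Random(2004)  # fixed seed so the run is repeatable
ratios = sorted(simulate_split(0.340, 0.88, 300, rng) for _ in range(1000))

mean_ratio = sum(ratios) / len(ratios)
print(f"average home/road ratio: {mean_ratio:.2f}")
print(f"middle 90% of hitters:   {ratios[50]:.2f} to {ratios[949]:.2f}")
```

Even though every simulated hitter faces exactly the same park effect, the observed splits scatter widely around the average. Real players probably differ in how the park affects them too, which this sketch doesn't try to model.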

There are other issues with Park Effects that people smarter and more devoted than I are working on, and I'll certainly bow to their more sophisticated analysis of their likely effect on Ibanez. Really, my criticism here is not of statistically based analysis or of the complicated metrics, but of the misuse of them. The best writers understand the limitations of

It's easy to be smug, either as an old-school "baseball ain't played by a calculator" type or a stat-savvy "show me the empirical evidence" analyst. But it's far more productive to be honest about uncertainties, and doing so makes baseball--well, all of life, really--more exciting.
