13 June 2006

Stats and Baseball

After reading this excellent post at Yanksfan vs Sox fan (via 2GD) on the statistical analysis of whether Big Papi is a "clutch" hitter, I can't help but chime in as both a statistician and a baseball fan. More than in any other sport, baseball fans and pundits alike love to troll through huge quantities of numerical data to try to find interesting trends and observations about their favourite players.

Unfortunately, nearly all such analyses are statistically bogus. In this case, the question is whether David Ortiz is a great "clutch" hitter, that is, whether he steps up his game in the situations that matter most. The author then proceeds to parade a lot of numbers to argue his point. He makes the common mistake of presenting a trend (e.g. someone performs better on Wednesdays than on Thursdays) without asking whether the data at hand are enough to show that the trend is significant. In statistics it's all about sample size: whether you have sufficient observations to draw confident conclusions from your data.

If, for instance, I told you that it rained today, a Tuesday, and was sunny yesterday, a Monday, nobody would believe me if I then turned around and said, "It rains way more often on Tuesday than Monday!" In small samples, of course, random chance creates perceived patterns (such as rain correlating with Tuesday) where none actually exist.

In baseball we fool ourselves into thinking that we have enough observations to make all kinds of statements that the data can't actually support. In this specific case of clutch hitting, the author makes a whole series of claims with fairly small sample sizes, but let's look at his best case: batting average with runners on base vs. batting average with the bases empty. Ortiz has had 1861 total Red Sox at bats, distributed pretty evenly between these two scenarios (947 bases empty, 914 with one or more runners). He has had 265 hits in the first case (for a 0.280 average) and 280 hits in the second case (for a 0.306 average). So he's got more hits in fewer tries, thus the higher average with runners on. Regardless of whether this is a good measure of clutch performance (which is an entirely separate argument), we can ask whether these numbers actually mean something or whether they could've arisen by chance. Does Ortiz hit better with runners on?

In short, these numbers can't answer the question. A simple test of statistical significance shows that these values could easily have arisen by chance (a two-proportion z-test, for one, gives a p-value of roughly 0.2, nowhere near the usual 0.05 cutoff). We could easily have seen a discrepancy of this size by dividing his at bats into those where an odd number of fans were in the stadium vs. those where an even number of fans were watching. And this really makes sense when you think about it carefully. Even with nearly 2000 observations we're trying to gain insight into a very tiny difference: 0.280 vs 0.306! In baseball the difference between a guy with a career .280 average and a guy with a .306 is pretty big, but in almost any other circumstance we'd round both of these to an even 30% and call it a day. You would need many thousands more observations to demonstrate that Ortiz hits better with men on base with even modest confidence.
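For anyone who wants to check the arithmetic, here is a rough sketch of one way to run the numbers. It assumes a pooled two-proportion z-test and a textbook power calculation (5% two-sided significance, 80% power); those are my choices for illustration, and the function names are made up for this post. It uses only Python's standard library.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_proportion_z_test(hits_a, at_bats_a, hits_b, at_bats_b):
    """Return (z, two-sided p-value) for the gap between two batting averages."""
    avg_a = hits_a / at_bats_a
    avg_b = hits_b / at_bats_b
    # Under the null hypothesis both situations share one true average.
    pooled = (hits_a + hits_b) / (at_bats_a + at_bats_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / at_bats_a + 1 / at_bats_b))
    z = (avg_b - avg_a) / std_err
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value


def at_bats_needed_per_group(avg_a, avg_b, z_alpha=1.96, z_power=0.8416):
    """Approximate at bats per situation needed to detect avg_a vs avg_b.

    Defaults are assumptions: 5% two-sided significance and 80% power.
    """
    variance = avg_a * (1 - avg_a) + avg_b * (1 - avg_b)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (avg_a - avg_b) ** 2)


# Ortiz: bases empty (265 for 947) vs. runners on (280 for 914)
z, p = two_proportion_z_test(265, 947, 280, 914)
print(f"z = {z:.2f}, p = {p:.2f}")
print("at bats needed per situation:", at_bats_needed_per_group(0.280, 0.306))
```

With these inputs the test comes out around z = 1.26 and p = 0.21, and the power calculation asks for something like 4,800 at bats in each situation, several times what Ortiz actually has. Stricter confidence or power targets push that figure higher still.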

Keep in mind that this is actually a pretty big sample size for baseball. Many times people quote some 1-for-10 and say that Pitcher X "owns" Hitter Y. This is an even bigger joke, since a guy hitting .300 will manage one hit or fewer in roughly one out of every seven ten-at-bat stretches by chance alone! I certainly hope that the guys actually working for ball clubs have a better handle on this than the average pundit.
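To put a number on how often a .300 hitter goes 1-for-10 purely by luck, here is a small sketch using the binomial distribution. It assumes every at bat is an independent coin flip with a 30% chance of a hit, which is a simplification, and the helper name is my own.

```python
from math import comb


def prob_hits(hits, at_bats, average):
    """Binomial probability of exactly `hits` hits in `at_bats` tries."""
    return comb(at_bats, hits) * average**hits * (1 - average) ** (at_bats - hits)


exactly_one = prob_hits(1, 10, 0.300)
one_or_fewer = prob_hits(0, 10, 0.300) + exactly_one

print(f"exactly 1-for-10:  {exactly_one:.3f}")   # 0.121
print(f"1-for-10 or worse: {one_or_fewer:.3f}")  # 0.149
```

So a perfectly ordinary .300 hitter posts a 1-for-10 or worse about 15% of the time against anybody, no "ownership" required.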

