Sunday, December 12, 2004
Damned Lies and Statistics
We live in a world of numbers: polls, population statistics, crime rates, etc. Because of innumeracy, we often gloss over them as we read. In other words, we rarely evaluate the accuracy or reliability of these statistics. We often just take them on trust: trust that the numbers and their importance are being reported accurately, trust that the basic techniques of gathering the raw data are reliable and representative of an actual situation, and trust that the raw data is being presented correctly in statistics and charts.
In general, this book teaches you how to evaluate statistics. It doesn't have much actual math. If you can understand the example below, you can probably understand the rest of the book. Best gives great examples; I had already heard many of the statistics he cites, but I didn't know about the flaws in how they were created or disseminated. Here's one example from the chapter on "Mutant Statistics".
Consider one widely circulated statistic about the dangers of anorexia nervosa (the term for eating dangerously little in an effort to be thin). Anorexia usually occurs in young women, and some feminists argue that it is a response to societal pressures for women to be beautiful, and cultural standards that equate slenderness with beauty. Activists seeking to draw attention to the problem estimated that 150,000 American women were anorexic, and noted that anorexia could lead to death. At some point, feminists began reporting that each year 150,000 women died from anorexia. (This was a considerable exaggeration; only about 70 deaths per year are attributed to anorexia.) This simple transformation -- turning an estimate for the total number of anorexic women into the annual number of fatalities -- produced a dramatic, memorable statistic. Advocates repeated the erroneous figure in influential books, in newspaper columns, on talk shows, and so on. There were soon numerous sources for the mistaken number. A student searching for material for a term paper on anorexia, for instance, had a good chance of encountering -- and repeating -- this wildly inaccurate statistic, and each repetition helped ensure that the mutant statistic would live on.
[Update 11/11/05: In the interest of presenting a slightly counterbalancing factor in the numbers of women who die of anorexia each year, it should be noted that "anorexia" is rarely listed as the cause of death on death certificates. The figure above estimating that "about 70 deaths per year are attributed to anorexia" is apparently only citing those deaths which list anorexia as the "cause of death" on the death certificates. Like people with AIDS, women with anorexia rarely die of the condition itself; technically, they die of the complications and secondary effects of long-term anorexia (e.g., renal or heart failure), which are more likely to show up on death certificates than anorexia.
So the actual annual number of anorexia deaths is unknown, but it is undoubtedly larger than 70. Just as obviously, the number is nowhere near the 150,000 figure. At a guess (and that's all it is: a guess by someone who doesn't have any knowledge of the subject), I'd estimate the lower end of annual anorexia deaths at about 120 and perhaps 500 at the upper end. I repeat: I have no evidence for either of these figures, so don't bother to cite me as a source. I'm just an ignorant bastard spouting off.]
Yet it should have been obvious that something was wrong with this figure. Anorexia typically affects young women. In the United States each year, roughly 8,500 females aged 15-24 die from all causes; another 47,000 women aged 25-44 also die. What were the chances, then, that there could be 150,000 deaths from anorexia each year? But, of course, most of us have no idea how many young women die each year ("It must be a lot..."). When we hear that anorexia kills 150,000 young women per year, we assume that whoever cites the number must know that it is true. We accept the mutant statistic, and may even repeat it ourselves. [emphasis in original]
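The sanity check in that last paragraph is simple enough to write down. Here is a rough sketch in Python using only the figures quoted above. The variable and function names are my own, and it assumes the two age brackets (15-24 and 25-44) are a fair stand-in for "young women", which is a simplification rather than anything the book states.

```python
# Annual U.S. deaths from ALL causes, as quoted in the passage above.
deaths_all_causes = {"females 15-24": 8_500, "women 25-44": 47_000}

# The mutant statistic, and the figure the book says is actually attributed.
claimed_anorexia_deaths = 150_000
attributed_anorexia_deaths = 70


def is_plausible(claimed: int, ceiling: int) -> bool:
    """A single cause of death cannot exceed deaths from all causes combined."""
    return claimed <= ceiling


# Ceiling: every death, from any cause, among women aged 15-44.
total_deaths = sum(deaths_all_causes.values())
ratio = claimed_anorexia_deaths / total_deaths

print(f"All-cause deaths, women 15-44: {total_deaths:,}")
print(f"Claimed figure is {ratio:.1f}x that ceiling")
print("Claimed figure plausible?", is_plausible(claimed_anorexia_deaths, total_deaths))
print("Attributed figure plausible?", is_plausible(attributed_anorexia_deaths, total_deaths))
```

The claimed 150,000 comes out at about 2.7 times the total number of women aged 15-44 who die of anything at all in a year, so it fails the check before you even ask what share of those deaths could be anorexia. Passing the check would not make a number correct, of course; it only rules out the impossible ones.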