Misleading Statistics

Let me tell you about the perfect investment offer. Each week you will receive a share recommendation from a fund manager, telling you whether the stock’s price will rise or fall over the next week. After ten weeks, if all the recommendations are proved right, then you should be more than willing to hand over your money for investment. After all, there will be just a one-in-a-thousand chance that the result is down to luck.

Alas, this is a well-known scam. The promoter sends out 100,000 e-mails, picking a stock at random. Half the recipients are told that the stock will rise; half that it will fall. ...

This is a problem that has dogged scientists across many disciplines. There is a natural bias in favour of reporting statistically significant results—that a drug cures a disease, for example, or that a chemical causes cancer. Such results are more likely to be published in academic journals and to make the newspaper headlines. But when other scientists try to replicate the results, the link disappears because the initial result was a random outlier. The debunking studies, naturally, tend to be less well reported.

- False Hope, Economist, Feb. 21, 2015, http://www.economist.com/news/finance-and-economics/21644202-most-trading-strategies-are-not-tested-rigorously-enough-false-hope
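
The arithmetic behind the scam in the excerpt above can be checked with a few lines of Python. This is only a sketch: the excerpt elides what the promoter does after the first mailing, so the code assumes that each week the promoter writes only to the half of the list that received the correct call. The function names `survivors_after` and `chance_of_perfect_record` are made up for this illustration.

```python
def survivors_after(recipients: int, weeks: int) -> int:
    """Recipients who have seen nothing but correct calls after `weeks` rounds.

    Assumption (not stated in the excerpt): each week the half of the list
    that received the wrong call is dropped, and the rest is split again.
    """
    for _ in range(weeks):
        recipients //= 2  # only the half that got the right call stays on the list
    return recipients


def chance_of_perfect_record(weeks: int) -> float:
    """Probability of calling a coin-flip stock move correctly `weeks` times in a row."""
    return 0.5 ** weeks


print(survivors_after(100_000, 10))   # 97
print(chance_of_perfect_record(10))   # 0.0009765625, roughly one in a thousand
```

Under that assumption, roughly a hundred recipients see ten correct calls in a row, each of them looking at a record that has only about a one-in-a-thousand chance of arising by luck for any single forecaster.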

Do the drugs work? After all, regardless of the theory, that is the practical question. In his spare, remarkably engrossing book, The Emperor’s New Drugs, Kirsch describes his fifteen-year scientific quest to answer that question about antidepressants. When he began his work in 1995, his main interest was in the effects of placebos. To study them, he and a colleague reviewed thirty-eight published clinical trials that compared various treatments for depression with placebos, or compared psychotherapy with no treatment. Most such trials last for six to eight weeks, and during that time, patients tend to improve somewhat even without any treatment. But Kirsch found that placebos were three times as effective as no treatment. That didn’t particularly surprise him. What did surprise him was the fact that antidepressants were only marginally better than placebos. As judged by scales used to measure depression, placebos were 75 percent as effective as antidepressants. Kirsch then decided to repeat his study by examining a more complete and standardized data set.

The data he used were obtained from the US Food and Drug Administration (FDA) instead of the published literature. When drug companies seek approval from the FDA to market a new drug, they must submit to the agency all clinical trials they have sponsored. The trials are usually double-blind and placebo-controlled, that is, the participating patients are randomly assigned to either drug or placebo, and neither they nor their doctors know which they have been assigned. The patients are told only that they will receive an active drug or a placebo, and they are also told of any side effects they might experience. If two trials show that the drug is more effective than a placebo, the drug is generally approved. But companies may sponsor as many trials as they like, most of which could be negative—that is, fail to show effectiveness. All they need is two positive ones. (The results of trials of the same drug can differ for many reasons, including the way the trial is designed and conducted, its size, and the types of patients studied.)

For obvious reasons, drug companies make very sure that their positive studies are published in medical journals and doctors know about them, while the negative ones often languish unseen within the FDA, which regards them as proprietary and therefore confidential. This practice greatly biases the medical literature, medical education, and treatment decisions.

Kirsch and his colleagues used the Freedom of Information Act to obtain FDA reviews of all placebo-controlled clinical trials, whether positive or negative, submitted for the initial approval of the six most widely used antidepressant drugs approved between 1987 and 1999—Prozac, Paxil, Zoloft, Celexa, Serzone, and Effexor. This was a better data set than the one used in his previous study, not only because it included negative studies but because the FDA sets uniform quality standards for the trials it reviews and not all of the published research in Kirsch’s earlier study had been submitted to the FDA as part of a drug approval application.

Altogether, there were forty-two trials of the six drugs. Most of them were negative. Overall, placebos were 82 percent as effective as the drugs, as measured by the Hamilton Depression Scale (HAM-D), a widely used score of symptoms of depression. The average difference between drug and placebo was only 1.8 points on the HAM-D, a difference that, while statistically significant, was clinically meaningless. The results were much the same for all six drugs: they were all equally unimpressive. Yet because the positive studies were extensively publicized, while the negative ones were hidden, the public and the medical profession came to believe that these drugs were highly effective antidepressants.

Kirsch was also struck by another unexpected finding. In his earlier study and in work by others, he observed that even treatments that were not considered to be antidepressants—such as synthetic thyroid hormone, opiates, sedatives, stimulants, and some herbal remedies—were as effective as antidepressants in alleviating the symptoms of depression. Kirsch writes, “When administered as antidepressants, drugs that increase, decrease or have no effect on serotonin all relieve depression to about the same degree.” What all these “effective” drugs had in common was that they produced side effects, which participating patients had been told they might experience.

It is important that clinical trials, particularly those dealing with subjective conditions like depression, remain double-blind, with neither patients nor doctors knowing whether or not they are getting a placebo. That prevents both patients and doctors from imagining improvements that are not there, something that is more likely if they believe the agent being administered is an active drug instead of a placebo. Faced with his findings that nearly any pill with side effects was slightly more effective in treating depression than an inert placebo, Kirsch speculated that the presence of side effects in individuals receiving drugs enabled them to guess correctly that they were getting active treatment—and this was borne out by interviews with patients and doctors—which made them more likely to report improvement. He suggests that the reason antidepressants appear to work better in relieving severe depression than in less severe cases is that patients with severe symptoms are likely to be on higher doses and therefore experience more side effects.

To further investigate whether side effects bias responses, Kirsch looked at some trials that employed “active” placebos instead of inert ones. An active placebo is one that itself produces side effects, such as atropine—a drug that selectively blocks the action of certain types of nerve fibers. Although not an antidepressant, atropine causes, among other things, a noticeably dry mouth. In trials using atropine as the placebo, there was no difference between the antidepressant and the active placebo. Everyone had side effects of one type or another, and everyone reported the same level of improvement. Kirsch reported a number of other odd findings in clinical trials of antidepressants, including the fact that there is no dose-response curve—that is, high doses worked no better than low ones—which is extremely unlikely for truly effective drugs. “Putting all this together,” writes Kirsch,

leads to the conclusion that the relatively small difference between drugs and placebos might not be a real drug effect at all. Instead, it might be an enhanced placebo effect, produced by the fact that some patients have broken the blind and have come to realize whether they were given drug or placebo. If this is the case, then there is no real antidepressant drug effect at all. Rather than comparing placebo to drug, we have been comparing “regular” placebos to “extra-strength” placebos.
- Marcia Angell, The Epidemic of Mental Illness: Why?, New York Review of Books, June 23, 2011
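
The "two positive trials" rule described in the excerpt above interacts with the number of trials a company runs, and a small calculation can make that visible. This is a hedged sketch, not a description of how the FDA evaluates evidence: it assumes a drug with no real effect, independent trials, and a per-trial false-positive rate `alpha` of 0.05, none of which comes from the article. The function name `prob_at_least_two_positive` is hypothetical.

```python
def prob_at_least_two_positive(n_trials: int, alpha: float = 0.05) -> float:
    """Chance that an ineffective drug still yields two or more "positive" trials.

    Assumptions (illustrative, not from the article): trials are independent
    and each one comes out positive by luck with probability `alpha`.
    """
    p_zero = (1 - alpha) ** n_trials                          # no positive trials
    p_one = n_trials * alpha * (1 - alpha) ** (n_trials - 1)  # exactly one positive trial
    return 1 - p_zero - p_one


# The more trials are sponsored, the likelier two of them clear the bar by chance.
for n in (2, 5, 10, 20):
    print(n, round(prob_at_least_two_positive(n), 4))
```

The exact figures depend entirely on the assumed `alpha`, but the direction does not: if negative trials carry no penalty, sponsoring more of them raises the odds of collecting two positive ones.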
