Double Blind or Double Talk? Reading Medical Research with a Skeptical Eye

Once upon a time, not so long ago, a new drug was tested carefully by its manufacturer, demonstrated to be safe, and approved by the FDA. After it went on the market, people started reporting a lot of side effects. An independent researcher thought the drug manufacturer’s studies might have been biased (after all, they just wanted to make money), so he decided to do his own study. He put half of his subjects on the drug and half on placebo. As the study proceeded, a dismaying number of side effects started to appear. Particularly alarming: 15% of subjects reported eye problems, and 8% went blind. The Institutional Review Board halted the study before the drug could harm any more subjects. Statistical analysis of the data confirmed that side effects occurred significantly more often with the drug than with the placebo. The study was published in a respected peer-reviewed medical journal; it concluded that the drug was harmful and should be avoided.

Sounds pretty impressive? Sounds pretty scary! Based on this description, I’d be likely to avoid that drug like the plague. The study was randomized, placebo controlled, and double blind. But is it a good example of the gold standard, or merely fool’s gold? How can we tell? What should we look for?

Before you read my critique, you might like to try your own hand at critiquing the study. You can find it on-line at Health/Aspartame-Adverse-Reactions-1993.htm. It was published in 1993 in Biol. Psychiatry 34:1317, “Adverse Reactions to Aspartame: Double-Blind Challenge in Patients From a Vulnerable Population” by Walton, R.G., et al.

If the drug in question had been an antibiotic, I might have thought the study sounded more plausible. But in this case, the drug was aspartame, the artificial sweetener sold in every grocery store. I experienced a jolt of cognitive dissonance. My common sense, the FDA, the European Food Safety Authority, and a recent review of all the published literature on aspartame all agreed that aspartame was probably safe. Was this one article enough to make us all change our minds? At this point, I had three options: (1) I could disregard everything else I knew, panic, and throw out the box of NutraSweet in my kitchen, (2) I could file the study in the back of my mind and wait to hear whether any other researchers reported similar findings and whether any professional group issued warnings based on it, or (3) I could read the study’s fine print to see if it was flawed enough that I should disregard it entirely. I decided to read it carefully: I suspected it might be amusing, and I was right. Studies of small value can be good for big laughs.

Bias Red Flags

The study was based on the author’s prior observations that lots of his depressed patients seemed to think they got headaches and other symptoms from aspartame. It compared a group of depressed patients to a group of non-depressed controls, and compared aspartame to a placebo. The aspartame and placebo were prepared in identical capsules, and the dose of aspartame was equivalent to 10-12 cans of diet soda a day. The study design was a crossover: patients were randomized to take either aspartame or placebo for seven days; then, after a “washout” period of three days taking nothing, they took the other for seven days.

The primary researcher was “independent” only in the sense that he was not on the manufacturer’s payroll. He admitted a prior bias against aspartame. That isn’t necessarily a problem, but it should raise a red flag. A biased researcher may be able to set his bias aside and still do good science. But unconscious bias can have subtle effects, even in the most ethical researchers, and his bias was pretty overt. The nondepressed control subjects were volunteers who worked at the same institution, which raises the possibility that they might have known the researcher and shared his bias.

They state that since “NutraSweet Company denied the request from the authors to purchase aspartame,” they bought USP grade aspartame elsewhere. That seems very curious to me; I wondered why they would deny the request. I wondered why they didn’t just buy the USP aspartame in the first place, since the NutraSweet company presumably uses a similar source and then adds other ingredients. I wondered if NutraSweet was aware that the requester was prejudiced against their product. I have contacted the company and asked if they could verify the information and explain why the request was denied. They have not answered.

Percentages and Raw Numbers

I said 15% of subjects reported eye problems and 8% went blind: that was misleading. There were actually only two individuals with eye problems. There were only 13 subjects in the entire study, so the 8% amounts to one subject who had a retinal detachment and lost the vision in one eye while he was taking the placebo. He was half of the 15%, and the other half was one subject who had a subconjunctival hemorrhage. That sounds bad, but it really isn’t. It’s just a bruise in an unusual place. It means a tiny bit of blood leaked out just under the surface of the white part of the eye and made a red spot. This can be the result of minor trauma and can occur spontaneously for no apparent reason. It is common, benign, painless, resolves quickly, and is usually insignificant. It occurred in a patient taking aspartame, but before this study, it had never been reported as a side effect of aspartame, and there was no earthly reason to blame aspartame.
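To see just how misleading percentages from a tiny sample can be, here’s a quick back-of-the-envelope calculation (a Python sketch using the study’s own numbers):

```python
# With only 13 subjects, each individual swings the percentages enormously.
n_subjects = 13

blind = 1          # one retinal detachment (on placebo, as it turns out)
eye_problems = 2   # that subject plus one subconjunctival hemorrhage

pct_blind = 100 * blind / n_subjects
pct_eye = 100 * eye_problems / n_subjects

print(f"{pct_blind:.0f}% 'went blind'")      # prints 8% -- but it's one person
print(f"{pct_eye:.0f}% had 'eye problems'")  # prints 15% -- but it's two people
```

With 13 subjects, a single unlucky individual moves the needle by nearly eight percentage points, so “8% went blind” and “one man had a retinal detachment” are the same fact dressed very differently.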

I don’t see how anyone could justifiably blame these two isolated examples of different eye problems on aspartame. Why not assume they were chance occurrences until proven otherwise? If one patient happened to have an attack of hiccups while taking aspartame, I wouldn’t immediately assume aspartame was the culprit. What if one patient had had an auto accident during the study—would they have assumed that the aspartame had made the patient accident prone? Whenever you do a study, unrelated things are bound to happen to patients in both the treatment group and in the control group. You can weed these out by statistical comparisons of sufficient numbers of subjects.

Read this next bit carefully because it’s hard to believe. They tried to blame aspartame for an event that occurred in the placebo group! The person who went blind in one eye had been in the “aspartame-first” group; he had taken his last dose of aspartame six days earlier and had undergone three days of washout before starting the placebo. Instead of listing this as a random event for a subject taking a placebo, they decided it could have been a delayed reaction to aspartame.

Stop! Foul! You can’t do that!

They weren’t testing the hypothesis that aspartame could cause delayed reactions. They obviously didn’t even think that. If they hadn’t been pretty sure the effect of aspartame would “wash out” in three days, they wouldn’t have chosen a three-day washout period. If they could blame everything on aspartame whether it occurred in the aspartame group or the placebo group, there would be no way of falsifying their hypothesis, and the whole idea that they could have a meaningful control group collapses. They have essentially set up a “heads we win, tails you lose” situation: anything they find “proves” that aspartame is the bad guy.

The subjects were given a symptom checklist. “Listed symptoms were: headache, nervousness, dizziness, trouble remembering, binge eating, lower back pain, nausea or upset stomach, feeling blue or depressed, insomnia, uncontrollable temper outburst, and other (to be specified by the subject).” Eye problems were not on the list. This tells us that the researchers were not “looking” for any kind of eye problems. They didn’t suspect aspartame could cause eye problems; they appear only to have thought up that hypothesis during the study.

Mining the Data

The “other” category allowed for a bad practice known as data mining. Instead of specifying what the experiment was testing for, it allowed for a fishing expedition. If you look at enough factors, you can almost guarantee that you will find a spurious correlation somewhere just by chance. This is what went wrong with some of the studies of the effects of prayer: they started out looking for a difference in a specific endpoint like death, but when that didn’t pan out, they went back and looked at dozens of other factors until they found something else that did seem to show a significant difference. Okay, so the prayed-for patients didn’t live longer, but maybe they spent less time in the Intensive Care Unit, or reported fewer headaches, or took fewer pain pills, or slept better, or needed to use less deodorant… If you just keep looking long enough, you will find something. It just won’t mean anything.
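The multiple-comparisons problem behind a fishing expedition is easy to quantify. A short Python sketch (assuming independent tests at the conventional p < 0.05 significance threshold) shows how fast the odds of at least one fluke grow:

```python
# If each extra "other" symptom is tested at the p < 0.05 level, the chance
# of at least one spurious "significant" finding climbs with every test.
alpha = 0.05

for n_tests in (1, 5, 10, 20):
    p_false_positive = 1 - (1 - alpha) ** n_tests
    print(f"{n_tests:>2} comparisons -> "
          f"{p_false_positive:.0%} chance of at least one fluke")
```

By twenty comparisons, you are more likely than not to “find something” even when nothing is there.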

They originally planned to study 40 subjects, but only 11 patients completed the protocol before the study was prematurely terminated by a nervous Institutional Review Board (IRB). They had only partial data on two other patients who stopped before their week was up, but they threw that partial data right into the mix. That’s also a no-no.

Let’s look at the study population. The depressed group (only eight patients) was a mixed bag. It included two patients who required hospitalization and six patients treated as outpatients. It included three patients on no medication and five patients on three different antidepressant medications. If they found positive results, we would be left wondering whether the severity of the depression or their medication status had something to do with it. It would have been much better to use a homogeneous population of similarly depressed patients who were all off medication, or who at least were all on the same medication.

There were only five nondepressed volunteers, three of whom already believed they got headaches or other symptoms when they used aspartame. I doubt if 60% of the general population believes they get headaches from aspartame. This suggests a non-representative sample that may have volunteered for an aspartame study because they wanted to prove aspartame was harmful. It would have been better to use an aspartame-naïve group, or at least a group of patients who didn’t have any preconceptions about aspartame.

Controlled for What?

A control group is no good unless it really is a control group. Wallace Sampson, MD, told me about a study in his field of oncology that kept nagging at him until his partner finally spotted the flaw: the patients in the treatment group seemed to be living longer than the control patients, but they were actually dying at the expected rate for their kind of cancer, while the control patients were dying much sooner than expected. So instead of concluding that the treatment worked, they should have concluded that the treatment was useless, and that being assigned to a control group could kill you!

Incidentally, patients who believe aspartame gives them headaches have been tested in double-blind studies, and they don’t get headaches when they don’t know they’re getting aspartame. Neither did the five non-depressed volunteers in this study: three of them believed aspartame gave them headaches, but four of the five reported headaches while taking placebo, and only one headache occurred on aspartame. In fact, you might conclude from this that not taking aspartame is four times as likely to cause headaches as taking aspartame—or that aspartame prevents headaches! I doubt that, too.

Three patients complained that they felt like they were being “poisoned”—doesn’t that strike you as being a bit strange? Strange for them to say, and strange for the researchers to include in the report. “Feeling poisoned” is not a symptom like “headache”; it’s more akin to a paranoid delusional statement.

Subjects in these studies can be sneaky and devious, or simply curious. It’s only human nature to try to guess whether you’re getting the active drug or the placebo, especially since people are often motivated to volunteer for studies when they have preconceived ideas about what is being tested. They look for clues: taste, slight differences in the preparations that the researchers haven’t noticed; sometimes they take the capsules apart and check. To rule this out, it is a simple matter to do a sort of exit poll, asking them whether they thought they had been given the drug or the placebo. If their guesses were better than chance, the results would have to be thrown out. No exit poll was done in this study.
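How would you judge whether exit-poll guesses beat chance? One standard approach is a simple binomial calculation; the Python sketch below uses hypothetical guess counts (not data from this study, which did no exit poll):

```python
from math import comb

def p_at_least_k_correct(n, k, p=0.5):
    """Probability of k or more correct guesses out of n by pure chance."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical exit poll: 13 subjects each guess "drug" or "placebo".
# If 12 of 13 guessed correctly, chance is a poor explanation --
# the blinding was probably broken:
print(f"{p_at_least_k_correct(13, 12):.4f}")   # prints 0.0017

# Whereas 8 of 13 correct is entirely unremarkable:
print(f"{p_at_least_k_correct(13, 8):.4f}")    # prints 0.2905
```

If the probability of doing that well by luck is tiny, the subjects had figured out which capsules were which, and the “double blind” label no longer applies.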

They did a statistical analysis and found significant differences. But significant differences may not mean much when the numbers are so small. If even one or two more patients had completed the protocol, the findings might have been reversed. I’m no statistician, but I am bothered by the way they handled the data, especially the way they assigned “points” based on symptom severity, the fact that they included so many symptoms not on their original list (in the “other” category), and the fact that they lumped patients who had completed the protocol with those who hadn’t.

As an example of how small numbers can lead to false conclusions, imagine you were studying the average net worth of Americans and your small random sample just happened to include Bill Gates.
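That thought experiment is easy to make concrete. The Python sketch below uses a made-up sample of nine ordinary net worths plus one Gates-sized fortune; the figures are illustrative, not real data:

```python
from statistics import mean, median

# Nine people with ordinary net worths (in dollars), plus one huge outlier.
sample = [40_000] * 9 + [50_000_000_000]  # hypothetical Gates-sized fortune

print(f"mean:   ${mean(sample):,.0f}")    # ~$5 billion -- wildly misleading
print(f"median: ${median(sample):,.0f}")  # $40,000 -- the typical person
```

One extreme data point drags the average three orders of magnitude away from anything typical, which is exactly the risk a 13-subject study runs with every unusual event.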

Do Placebos Cause Retinal Detachment?

The researchers concluded that aspartame should be avoided in patients with mood disorder.

Mood disorder? They had only studied patients with one specific type of mood disorder: major depression. They should have said it should be avoided in patients with major depression. No, wait! They shouldn’t have said that either. That conclusion is not justified by the data. They might just as well have concluded that placebos cause retinal detachment and should be avoided in psychiatry patients.

Good scientists know that no one experiment is definitive. They know that no experiment is perfect; there are flaws in even the best of them. In their reports they recognize and discuss flaws they themselves are aware of in their own work. When they write their conclusions, they usually include a disclaimer, something like “If these results are confirmed by other studies….” In this case, a good scientist would have said, “Although no conclusions can be drawn from these findings, they suggest that patients with depression might be more susceptible to side effects from aspartame. We recommend further investigation.”

This was a “blind” study in more ways than one. Maybe we should have a new category: triple blind. It could refer to studies whose researchers are blind to fatal flaws in their own double blind studies.

It’s pretty discouraging to learn how untrustworthy a study may be, even when it seems to fulfill the “gold standard” criteria: double blind, placebo controlled, randomized, and published in a peer reviewed journal. Does that mean we shouldn’t believe anything we read? How can we ever hope to know the truth?

Fortunately, the scientific enterprise has its own self-correcting mechanisms that eventually sort things out. Studies that are wrong will be superseded by better studies with different results. Studies that are right will be corroborated by other good studies. Systematic reviews of the literature will be done by experts in the field. Critics who know what to look for will pick apart the studies with a fine-toothed comb. The weight of evidence on the correct side will accumulate until it tips the balance towards an eventual consensus.

The study in question is now 14 years old. Its findings have not been replicated. Aspartame is still on the market. No warnings have been issued by professional or government groups. Guidelines for treating depressed patients do not recommend avoidance of aspartame. No, we can’t prove that aspartame doesn’t cause more symptoms in depressed patients or make you go blind after you stop taking it… but with the evidence available today, it would be perverse to believe that it does.

The Impact of Bad Studies

You might wonder why I chose this particular study as an example. I picked it because I encountered a woman who truly believes this study proves that aspartame makes you go blind. It would be bad enough if this were her isolated misconception, but she is a rabid anti-aspartame activist, and she is publicizing the study as the gospel truth. She contacted me after she read something I wrote in Skeptic magazine. I had said I suspected an astronaut/ruptured disc story was an urban myth “right up there with the story that NutraSweet was originally developed as an ant poison.” She wrote me to ask, “How come when rats are fed aspartame they die?” She proceeded to bombard me with anecdotal horror stories about aspartame, concluding, “If you think aspartame is an urban legend, bon appetite [sic]!” I asked her if she would be open to looking at new evidence or if her mind was irretrievably made up. She answered, “I have found the truth.” She knows that aspartame interacts with all drugs and vaccines, embalms you, gives you cancer, seizures, and makes you go blind. I don’t think she’s accused it of causing global warming yet, but that may come.

I tried to explain why anecdotes are not good evidence, and how the “post hoc ergo propter hoc” fallacy sometimes leads to false conclusions. I said, “The idea that you can determine X causes Y simply by demonstrating that Y is temporally associated with X…is wrong: a temporal association is necessary but not sufficient to establish cause.” She replied, “your Y temporary associated with X and such does not convince me at all.”

Her kind of thinking is rampant in our society. Some of these people are outspoken, make good use of the media, and influence others. They have learned that the word “scientific” carries authority, and they are trying to co-opt science for their own purposes. They love it when they can point to a randomized, placebo controlled, double blind study to support their claims.

We skeptics have our job cut out for us. Most of us can tell the difference between science and pseudoscience, but it’s a bit harder to distinguish good science from not-so-good science. And it’s even harder to try to explain the difference to the public. A nasty job, but somebody has to do it.

Even if we don’t have the knowledge or skills to critique a research report, there is something we can all do. When you hear that a claim is supported by a double-blind study, you can ask who disagrees with it and why. A good scientist doesn’t ignore conflicting evidence—he discusses it. No study is so good that it is exempt from criticism. Real scientists love to pick each other’s work apart and to argue, and their arguments are an essential part of science’s methodical search for the truth. There are usually two sides to every story, and if you’re only hearing one side, it should raise suspicions.

Needless to say, I haven’t thrown out my box of NutraSweet.

This article was originally published in Skeptic magazine.

Dr. Hall is a contributing editor to both Skeptic magazine and the Skeptical Inquirer. She is a weekly contributor to the Science-Based Medicine Blog and is one of its editors. She has also contributed to Quackwatch and to a number of other respected journals and publications. She is the author of Women Aren’t Supposed to Fly: The Memoirs of a Female Flight Surgeon and co-author of the textbook, Consumer Health: A Guide to Intelligent Decisions.
