Science Watch

In psychology, as in other sciences, replication is the gold standard. In theory, new knowledge doesn't make it into the canon until the studies that produced it have been verified, independently, by more than one researcher. But in practice, critics say the field rarely lives up to that ideal — and the result is a psychological literature rife with findings that may or may not be true, yet are generally accepted as valid.

Over the past two years, a series of events, including the unmasking of prominent psychologist Diederik Stapel's data fraud and controversy over the reproducibility of other studies, has focused researchers' attention on replication and related issues. But as some psychologists see it, these flare-ups are finally bringing light to problems that have needed attention for years.

"The fact that our scientific methodology is not perfect, and operates less than ideally, is not a new insight," says Brian Nosek, PhD, a social psychologist at the University of Virginia. Now, he and others are leading efforts to increase replication studies and open up access to data.

Unfortunately, these psychologists say, the incentive system at work in academic psychology is weighted against replication: There are no carrots to induce researchers to reproduce others' studies, and several sticks to dissuade them.

Among the top problems are that funding agencies aren't interested in giving money for direct replication studies and most journals aren't interested in publishing them. So researchers whose careers depend on winning grants and publishing studies have no incentive to spend time and effort redoing others' work.

The solution is to "revalue replication" in psychology, says Gary VandenBos, PhD, the executive director of APA's Office of Publications and Databases. "We need to put a strategy in place to get departments, journals and funding agencies to value replication."

He and the APA Publications and Communications Board have begun to do just that, appointing a task force that will look at ways to encourage replication research. The task force plans to come up with recommendations by April.

A broken system?

Researchers like Nosek and journal editors like VandenBos have been mulling over psychology's instances of problematic methodology and the limits of the discipline's publishing model for years, but their concerns gained a wider audience in 2011, when investigators found that Dutch psychologist Diederik Stapel had made up data for dozens of studies that were published in some of psychology's most prominent journals.

Stapel got away with his fraud for more than a decade, partly because he kept a tight lid on his data — not even providing it to his graduate students — and because no one challenged him on that or attempted to publish replications of his work.

While such outright fraud is assumed to be rare, over the past couple of years, more subtle controversies have emerged as well. In March 2011, Cornell University psychologist Daryl Bem, PhD, published a paper in the Journal of Personality and Social Psychology that found evidence for "Psi," or extra-sensory perception. Bem stands by his work, but many psychologists questioned his analysis and were incredulous that the study was published. At least one team, led by Richard Wiseman, PhD, of the University of Hertfordshire, was unable to replicate Bem's findings, but JPSP didn't publish that research because the journal does not publish replications. Psychological Science also rejected Wiseman's replication study, for the same reason. It was eventually published in the open-access journal PLOS One, and Wiseman set up a website for others to document their attempts to replicate Bem's study. And, in December, JPSP published a meta-analysis of the original studies, and replication attempts, by Carnegie Mellon University's Jeff Galak, PhD. But critics say that the situation exemplifies the difficulty of getting any replication work published.

A third controversy emerged last January, when PLOS One published a replication attempt of a classic priming experiment by Yale University's John Bargh, PhD. In the original 1996 experiment, Bargh found that participants who were "primed" by reading words related to the elderly later walked across a room more slowly than people who read neutral words. The classic experiment has given rise to an entire field of priming research, but in the PLOS One paper, a team led by Belgian researcher Stephane Doyen could not replicate the results. Bargh has criticized the team's methodology, while others defend priming research. But the controversy has again focused attention on replication. Nobel Prize-winning psychologist Daniel Kahneman, PhD, who discussed priming research extensively in his recent book "Thinking, Fast and Slow," has weighed in, suggesting that teams of priming researchers replicate one another's work in a round-robin process, to put to rest any doubts about the field.

A complex problem

The replication problem is compounded by other quirks of academic science. First, there's the widely acknowledged issue that studies with positive results are much more likely to be published than studies with negative results, whether they're replications or not. It's the "file-drawer problem," says Hal Pashler, PhD, of the University of California, San Diego. In other words, studies with negative results get shoved into psychologists' file drawers, never to be shared or published.

Studies with positive results, meanwhile, get published, whether those results represent a true finding or a false positive. Many people mistakenly think that the common practice of setting a significance threshold of p < .05 means that only about 5 percent of published results are false positives, but, Pashler says, this isn't true.

The actual proportion of false positives in the literature depends on an unknowable quantity: the percentage of the effects that psychologists search for that are, in fact, real. In an analysis in Perspectives on Psychological Science, Pashler and coauthor Christine Harris, PhD, estimated that if, for example, only 10 percent of the effects that psychologists search for are real, but all positive results are published, then using a significance threshold of .05 would result in more than one-third of all positive findings reported in psychology journals being false positives.
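The arithmetic behind that kind of estimate can be sketched in a few lines. This is only an illustration, not a reproduction of Pashler and Harris's analysis: the statistical power of 0.8 (the chance that a study of a real effect reaches significance) is an assumption that does not appear in this article, and the function name is made up for the example.

```python
def false_positive_share(prior_real, alpha=0.05, power=0.8):
    """Share of statistically significant results that are false positives.

    prior_real: fraction of tested effects that are actually real
    alpha:      significance threshold (chance a null effect tests positive)
    power:      chance a real effect tests positive (assumed, not from the article)
    """
    true_positives = prior_real * power
    false_positives = (1.0 - prior_real) * alpha
    return false_positives / (true_positives + false_positives)


# Scenario from the text: 10 percent of effects are real, all positives published.
share = false_positive_share(prior_real=0.10)
print(f"{share:.0%}")  # prints "36%"
```

Under these assumptions, 4.5 of every 100 studies produce a false positive and 8 produce a true positive, so false positives make up 36 percent of the significant results. Lower power or a smaller share of real effects would push that figure higher.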

And if no one tries to replicate those studies, then the false positives remain in the literature unchallenged.

Psychology isn't the only field facing this issue. In fact, Pashler and Harris based their analysis on a 2005 paper by biomedical statistician John Ioannidis called "Why Most Published Research Findings Are False," which examined the likely rate of false positives in biomedical research.

Psychology, Nosek says, is actually better positioned than most fields to address replication problems. "Psychology is really taking the lead in many ways," he says. "All of the sciences are confronting this. But we understand many of the ways that human factors can affect results. And so I hope that our work will be extended in other fields."

Possible solutions

That work includes Nosek's Reproducibility Project. He and dozens of other psychologists are attempting to reproduce as many studies as possible that were published in the 2008 volumes of three prominent journals: the Journal of Personality and Social Psychology, Psychological Science and the Journal of Experimental Psychology: Learning, Memory, and Cognition. The group is working on about 50 studies, and hopes to get 100 or more researchers to join the project.

The goal, Nosek says, is both to investigate the reproducibility of a representative sample of recent psychology studies, and to look at the factors that influence reproducibility.

That's important, he says, because "being irreproducible doesn't necessarily mean a finding is false. Something could be difficult to reproduce because there are many subtle factors necessary to obtain the results. And that's important too, because we tend to overgeneralize results."

The Reproducibility Project is a one-off demonstration project. But replication proponents have broader suggestions as well, many of them outlined in a November special issue of Perspectives on Psychological Science devoted to the reproducibility problem. For example, Pashler and University of Virginia psychologist Barbara Spellman, PhD, have started a website where researchers can post their unpublished negative results. The site has gotten some positive attention, but so far only 19 studies are posted.

"We have a high ratio of Facebook likes to actual usage," Pashler says. "Everybody says that it's a good idea, but very few people use it."

The problem, he says, is probably that researchers — especially grad students and other young researchers likely to do replication work — have little incentive to post their findings. They won't get publication credit for it, and may only annoy the authors of the original studies.

"I'm happy we made [the site]," Pashler says, "but so far it has just ended up spotlighting the incentive problem."

One idea to solve that problem: pre-registered replication studies. In this scenario, authors would propose a replication of an important, highly cited study to a journal. The author of the original study would review the replicator's methods, and the journal would agree, in advance, to publish the results — be they positive or negative.

"This would change the incentive system," Pashler says. "Currently, if you do a replication and it succeeds, you'll never publish it [because it's not interesting]. If it fails, you might publish it, but you'll have to fight with original authors …. This way, you'd be guaranteed a publication. I think it would have a great effect."

Meanwhile, Nosek and his graduate student Jeff Spies recently launched a website called the Open Science Framework, where researchers can register their studies in advance and log all of their data, workflow, and results as they proceed, then choose to make that data public if they wish. "That way, researchers don't have to decide whether to spend the extra time writing up information about a negative result at the end," Nosek says. The site went into public beta testing in November.

A movement toward more open-methods and open-data journals could also help increase the number of replication studies, by making it easier for researchers to access one another's full methods and data sets. One such journal is APA's Archives of Scientific Psychology, its first open-access, open-methods, open-data journal, which will debut this year.

In 2011, the APA Publications and Communications Board approved the recommendations of a data-sharing task force, which called for journals to make available all of the data on which published research reports are based. The Archives will be the first APA journal to put this into practice, by requiring authors to submit their full data sets and make them available for any researcher who requests them, according to Duke University's Harris Cooper, PhD, APA's chief editorial advisor and, with VandenBos, an editor of the new journal.

"This, along with a more transparent and detailed description of methodology, should facilitate future replication of articles published in the Archives," Cooper says.