Recent argy-bargy about a failed replication has exposed a disturbing belief in some corners of psychological research: that one experiment can be said to “conceptually replicate” another, even if it uses a completely different methodology.
John Bargh, a professor of psychology at Yale, made waves recently with a stinging attack on virtually everyone associated with a failed attempt to replicate one of his previous findings. The specifics of this particular tango de la muerte can be found elsewhere, and I won’t repeat them here, except to say that I thought Bargh’s misrepresentation of the journal PLoS One was outrageous, offensive, and an extraordinary own goal.
That aside, Bargh-gate has drawn out a more important issue: the idea of “conceptual replication”. Ed Yong's article, and the comments beneath, exposed an unusual disagreement, with some (including Bargh himself) claiming that Bargh et al.'s original findings had been replicated at length, while others claimed that they had never been replicated.
How is this possible? Clearly something is awry.
All scientists, and many non-scientists, will be familiar with the basic idea of replication: that the best way to tell whether a scientific discovery is real is to repeat the experiment that originally found it. Replication is one of the bedrocks of science. It helps scientists achieve consensus and it acts like an immune system, eliminating findings that are irreproducible due to methodological error, statistical error or fraud.
It also goes without saying that the most important aspect of replication is to repeat the original experiment as closely as possible. This is why scientific journal articles contain a Method section, so that other scientists can precisely reproduce your experimental conditions.
Enter the notion of “conceptual replication”. If you are a scientist and you’ve never heard of this term, you are not alone. The other day I did a straw poll of my colleagues – who are mostly experimental psychologists and neuroscientists – and got blank looks in response.
The basic idea is this: that if an experiment shows evidence for a particular phenomenon, you can “conceptually” replicate it by doing a completely different experiment that someone – the experimenter, presumably – believes measures a broadly similar phenomenon. Add a pinch of assumption and a healthy dose of subjectivity, and voilà, you’ve just replicated the original ‘concept’.
I must admit that when I first heard the term “conceptual replication”, I felt like shining a spotlight on the clouds and calling for Karl Pilkington. Psychology is already well known for devaluing replication and we do ourselves no favours by attempting to twist the notion of replication into something it isn’t, and shouldn’t be.
Here are four reasons why.
1. Conceptual replication is assumption-bound and subjective
From a logical point of view, a conceptual replication can only hold if the different methods used in two different studies are measuring the same phenomenon. For this to be the case, definitive evidence must exist that they are. But how often does such evidence exist?
Even if we meet this standard (and the bar seems high), how similar must the methods be for a study to qualify as being conceptually replicated? Who decides and by what objective criteria?
2. Conceptual replications can be “unreplicated”
A reliance on conceptual replications can be easily shown to produce absurd conclusions.
Consider the following scenario. We have three researchers, Smith, Jones, and Brown, who publish three scientific papers in a sequence.
Smith gets the ball rolling by showing evidence for a particular phenomenon.
Jones then comes along and uses a different method to show evidence for a phenomenon that looks a bit like the one that Smith discovered. The wider research community decide that the similarity crosses some subjective threshold (oof!) and so conclude that Jones conceptually replicates Smith.
Enter Brown. Brown isn’t convinced that Smith and Jones are measuring the same phenomenon and hypothesises that they could actually be describing different phenomena. Brown does an experiment and obtains evidence suggesting that this is indeed the case.
We now enter the ridiculous, and frankly embarrassing, situation where a finding that was previously replicated can become unreplicated. Why? Because we assumed without evidence that Smith and Jones were measuring the same phenomenon when they were not. It’s odd to think that a community of scientists would actively engage in this kind of muddled thinking.
3. Conceptual replications exacerbate confirmation bias
Conceptual replications are vulnerable to a troubling confirmation bias and a logical double-standard.
Suppose two studies draw similar conclusions using very different methods. The second study could then be argued to "conceptually replicate" the first.
But suppose the second study drew a very different conclusion. Would it be seen to conceptually falsify the first study? Not in a million years. Researchers would immediately point to the multitude of differences in methodology as the reason for the different results. And while we are all busily congratulating ourselves for being so clever, Karl Popper is doing somersaults in his grave.
4. Conceptual replication substitutes for and devalues direct replication
I find it depressing and mystifying that direct replication of specific experiments in psychology and neuroscience is so vital yet so grossly undervalued. Like many cognitive neuroscientists, I have received numerous rejection decisions over the years from journals, explaining in reasonable-sounding boilerplate that their decision "on this occasion" was due to the lack of a sufficiently novel contribution.
Replication has no place because it is considered boring. Even incremental research is difficult to publish. Instead, reproducibility has been trumped by novelty and the quest for breakthroughs. Certainty has given way to the X factor. At dark moments, I wonder if we should just hand over the business of science to Simon Cowell and be done with it.
First, we must jettison the flawed notion of conceptual replication. It is vital to seek converging evidence for particular phenomena using different methodologies. But this isn’t replication, and it should never be regarded as a substitute for replication. Only experiments can be replicated, not concepts.
Second, journals should specifically ask reviewers to assess the likely reproducibility of findings, not just their significance, novelty and methodological rigor. As reviewers of papers, we should be vocal in praising, rather than criticising, a manuscript if it directly replicates a previous result. Journal editors should show some spine and actively value replication when reaching decisions about manuscripts. It is not acceptable for the psychological community to shrug its shoulders and complain that "that's the way it is" when the policies of journal editing and manuscript reviewing are entirely in our own hands.
Psychology can ill afford the kind of muddled thinking that gives rise to the notion of conceptual replication. The field has taken big hits lately, with prominent fraud cases such as that of Diederik Stapel producing very bad publicity. The irony of the Stapel case is that if we truly valued actual replication, rather than Krusty-brand replication, his fraud could have been exposed years sooner and before he had made such a damaging impact. This makes me wonder how many researchers have, with the very best of intentions, fallen prey to the above problems and ‘conceptually’ replicated Stapel's fraudulent discoveries.