All studies are wrong, unless they’re not
You know me. I’m always late to the internet parties.
Recently, a few more click-bait sites have gone on to talk about how most research findings are false. A few months ago, the article “Why you shouldn’t believe that exciting new medical study” was making the rounds or, as the young folks say, “trending”. Yes, it has taken me that long to get my thoughts together on it.
The crux of the article was research and case studies showing that studies published with exciting results were subsequently disproved. At the end of the article, Ms. Belluz comments on how the atmosphere of reporting has changed in the Information Age and on how the use of summary-type research (such as systematic reviews) tempers that which appears glittering, but might not be gold.
I think there are two main issues at play here: the reality of replication and its relationship with the scientific process; and the ever-evolving trend of science as entertainment. These two issues have high interplay when it comes to popular media science “discussions” (as contrasted with genuine scientific debate).
The other side of replication
The first question I think that crops up amongst lay-readers of science reports in the media is, “Why are studies not repeated?” The answer to this question is not that complex, even if it has multiple parts.
First and foremost, in any research field, it is of utmost importance to have context. Very few pieces of research are stand-alone episodes. If you watch a single episode of Game of Thrones, you probably don’t have a good idea of the series (except for the fact that there’s a lot of nudity and blood). Even research whose purpose is to contextualize (like systematic reviews or regular reviews) _still_ doesn’t stand alone.
The problem with science reporting is that its job is two-fold and at odds with the scientific process: 1) Just as with the news, all reporting has to produce changing content on a schedule. In the news, a “slow news” day brings frankly unimportant events to the forefront (“My cat dialed 911!”). 2) The importance of headline events doesn’t really change because there is a socialized value to that coveted spot. A story that makes the headline is deemed important because it is a headline. And while before, we only had a few headlines available to us, the Internet has made literally millions of headlines possible, virally spreading to your attention.
Of course, we’re not dumb. For the most part, we are able to recognize stories that are basically moronic and filter them out (pretty much anything on Buzzfeed, for instance). Some people are better at doing it than others. No one is perfect.
The inherent problem with media, as we all know, is that someone else is deciding what is important enough to tell us about. And while this is changing with vehicles like Twitter, to some extent, we accept this limitation. We accept that we will not know everything that is going on in the world and that there are likely issues that are important to us that will never be reported; and if this is the case, that we are going to have to seek out that information on our own.
These inherent problems don’t go away with science reporting. They are, however, magnified, because science is not simple. Science is constantly building upon itself.
To understand world events, you need to develop a sense and knowledge of history. Knowing that a current-day conflict arose from events that occurred centuries in the past allows you to interpret the current conflict’s context. However, to understand the personal immediate implications of a conflict (i.e. Is this important to me?), you just need to know that it is happening and where it is, geographically (roughly, like close or far away), in relation to you. There’s a tsunami in Indonesia? Are you close to, or far away from, Indonesia?
Science does not work this way. Health science, even less. And this is for two main reasons:
1) To understand a study that was done today, you actually do have to have an idea of where it came from—not only from the perspective of the studies that came before it upon which it was built, but also from the perspective of why it was done in the first place.
2) Health science is ALWAYS “close to” you. Even an Ebola outbreak in a distant country is “close to” you. A study on octogenarians, even though you are in your 20’s, is still “close to” you. There is no such thing as health research that is “far away” from you.
This means that of these two things that cause stories to be important in reporting, one (the proximity of the event to you) only has one option: important. That is one less filter that the reporting has to pass to make it through your bullshit detector.
That leaves us with the headline effect. That which a media reporter deems to be worth reporting is assigned higher importance in the reader’s eyes than that which is deemed not worth reporting, regardless of its inherent worth.
The headline effect is magnified when you lack the prerequisite knowledge to place the story into context. The next panicky headline that gummy bears will kill you gets your attention because it is, by default, ‘close to’ you (particularly if you eat gummy bears) and because you have no idea about what the overall context of gummy bear research looks like. For all you know, it’s a slow health news day, and this study is an underpowered, non-generalizable piece of garbage that disagrees with the vast majority of previous research. The problem is that to gain the knowledge to put it into context requires an inordinate amount of work, and possibly, an inordinate amount of previous education that only the craziest (and yet also the most boring) of people would ever pursue.
So, from the critical perspective of media, those are two OTHER inherent reasons why you shouldn’t believe the next big medical story that you read: 1) Someone else is deciding its importance for you by giving it attention, and it can be very difficult for YOU to obtain the relevant information that will enable you to put it into context; and 2) it’s almost always going to read like it applies directly to you (i.e. it is always “close to you”, not “far away”) even if it doesn’t.
I promised you a discussion about repeatability and I still haven’t talked about it. And that is because I think it is difficult to discuss the issues behind verification studies without understanding the inherent issues with the way most of the world consumes health research. Layered on top of the issues I’ve presented here, there is the inherent bias in media reporting. Scientific progress occurs through argument. A study is the presentation of an argument with data to back up its statement. Counterarguments occur with new studies. Rare is the case where a study is so definitive as to be the final say; and in most media reports, you will never see the ‘other side’ of the argument, even if there’s an attempt to provide a “balancing opinion” (which almost never happens).
There’s also the issue of the knowledge gap in research reporters, who are supposed to help the public translate the science into something digestible and perhaps practical. However, most reporters are specialized. They don’t generally span several areas, each of which has its own jargon, controversies, and history. The nature of the deadline also means that they can often be caught in a bind between reconciling the knowledge gap and selling the thesis of their story, which can be why, as a scientist, it can be frustrating to be in an interview with media who would like to push an angle that is often simply not there.
So remember, if you’re reading about a study in the media, you are reading the actual study through the lens of another person’s eyes who may or may not have the requisite knowledge to interpret it, even if they talk with the expert who performed it in the first place. This isn’t to say that it’s inherently unreliable; but that the lens can be blurry depending on who’s doing the looking.
OK, with all that being said, let’s finally talk about verification studies.
Promising, even revolutionary results need verification. So why aren’t there more studies that try to replicate them? Why is most medical research wrong? And why aren’t scientists protecting us from wrong results?
So many questions; so many answers.
Here are a few reasons why studies aren’t replicated:
1) They’re not actually important, or they’re not actually good.
A big splash in your eyes can actually be a complete non-event in a field’s eyes. Remember that a study being important to you is largely dictated by the fact that it has been presented to you as important (the headline effect). In the larger context of the direction of the field of the study, it might actually be quite insignificant and simply not worth repeating. In health care, studies that don’t directly impact clinical practice are less likely to be replicated, despite the fact that future studies that DO impact patient care can be based on “falsely promising” results, thus potentially perpetuating an accidental result.
Studies can also just be poorly done, yet have promising results that grab reporters’ attention while failing to pass muster in peer researchers’ eyes. As shown in this little ‘prank’, “I fooled millions into thinking chocolate helps weight loss. Here’s how,” it’s not hard for reporters to skip over to the sexy results without doing much deeper research or having any understanding of the topic, the field, or the study itself. And the reality is that most researchers are just too occupied with doing stuff that IS going to move their field forward rather than having to babysit the media.
2) They are being replicated, but are not yet published
To go from study inception and design to publication can take years. My PhD protocol was a full year in the making, followed by 2 years of data collection and then about a year of analysis and writing before it was ready to submit for publication. And it was a replication study of sorts based on a series of promising studies, the first of which was published 4 years prior to when I started designing it. That means that it took almost 10 years (ok..9) to publish a study to verify or falsify promising results. In the meantime, studies that have ‘spurious’ findings sit in the literature, guiding decisions; but THAT’S THE SCIENTIFIC PROCESS. The verification delay can be extensive, particularly in large-scale or long-term studies; and you simply cannot rush the timeline, except possibly at the end if the data really comes together and you can find a willing and reputable journal to print.
3) They are impossible to replicate
Several years ago, a multi-centre randomized controlled trial was published on the use of magnesium in pregnant women with pregnancy-induced high blood pressure (dubbed the Magpie trial). It involved 10 141 subjects in 33 countries. It was considered the definitive answer to whether pregnant women with blood pressure problems should be treated with magnesium. It has never been replicated because the resources to do so would be astronomical with the possibility of no additional benefit. Other studies involving extreme long-term follow-up, such as the Nurses’ Health Study (which started in 1976), are impossible to verify unless someone had the foresight to start a second observation cohort to trail behind the first. The Nurses’ Health Study DID do this, in 1989, 13 years after the first cohort; even so, findings in the first cohort can’t be verified until at least 13 years later. They are now in recruitment for the third cohort.
4) They’ve been replicated but can’t get published; or no one will fund a replication study.
This point is usually an issue of perseverance and re-submission. As an investigator, there is higher academic risk when running a replication study. Professionally, you don’t want to have the reputation of having no original ideas of your own; and if your study just confirms the original study, you can run into the dreaded, “This study does not appear to add to the literature” feedback when you submit to journals. Often in replication studies, there’s only significant gain to be made if your study is contrary to the original one; and to run that study takes a really good knowledge of your field, and a willingness to stick your neck out if you’re running counter to the current of your field. It can be challenging to secure funding for a study question that the rest of the field feels has been answered (as those people are the ones judging whether or not to give you money); and it can be equally challenging to get a study published if your peers feel that the first study was definitive enough. However, in the end, it’s a matter of persistence, particularly if you’ve already performed the study. You might not get it into the journal of your choice, but publication is often an issue of re-writes and re-submission. That being said, sometimes, it just doesn’t happen. If you ask any researcher, you’ll find out there’s lots of data that they’ve tried to publish but that has, for one reason or another, been deemed unpublishable. Replication studies are only hot topics if original studies are hotly controversial. A good example of this was liberation therapy for multiple sclerosis, which was initially touted as miraculous and now…has a much more tempered reputation.
Whether you, as a lay-reader of media reports about science, believe a study is more about science reporting than about studies being ‘wrong’. “Wrong”, like “innovation”, is a retrospective concept. It has no context in the present or the future. The progress of science is all about making mistakes, figuring out that mistakes have happened and learning from them to improve our understanding of the world around and within us. There is never going to be a shortage of clickbait looking to pin “wrong” on something or someone, and for good reason: clickbait about how large mysterious processes are fallible works at getting clicks. Every step in science is a temporary one. Dan Savage has a saying, “Every relationship ends, until you find one that doesn’t”; and science is not a whole lot different, because every study will be wrong, until it isn’t.