Abstracts! Huh! What are they good for?
I think it’s a pretty safe statement to say that most fitness end-users, and even trainers, do not have the time or the interest (or, in some cases, the access) to read full papers. Most people tend to have easy access to PubMed abstracts, and are quite happy to read the “chunk-style” format of an abstract (thanks, Lou) because they’re generally short, and fairly easy to understand (because brevity forces simplicity most of the time).
However, using abstracts as a basis for decision-making can be very problematic because there are often disparities or omissions in the abstract when compared to the full study itself.
In the research world, abstracts are used as “enticement”, similar to a marketing tool. The abstract gives the reader a taste of what lies within. Most journals have a word limit of 100-200 words for an abstract of a paper that is up to 3000 words long (not including graphs and tables). Reputable researchers will never use _only_ the abstract to help them design a study or to support their research. And similar to the “enticement” of biting into a chocolate, you never know if it’s going to be the one with the gross maraschino cherry in the middle.
So, I guess my first piece of advice if you’re one of those people who cites studies based on abstracts, is to be very careful, because it’s very likely you’re missing something very important. And my second piece of advice is that if you stumble on an abstract that makes you very excited, then maybe it’s not such a bad idea to track it down and read the whole thing.
That being said, this article is for those of you who don’t have, or don’t want access to full articles and who go the already-extra step of finding the study that supports or doesn’t support your particular ideas about a topic.
Human studies that involve an “intervention” (whether that’s a training program, a supplement, a drug, a therapy or surgery) generally fall under a fairly small number of broad categories:
1) The case series study: This is your classic, “We gave 10 people a supplement for 4 weeks and found that it made them stronger” study. It’s just a report of several single cases, amassed together.
2) The parallel case series study: This is the study that says something like, “We divided a group of people into 2 groups, gave one group a weight training program and the other a walking program, and found that the people on the weight training program were stronger than the people on the walking program.” There’s a control group of some kind, but it’s not randomly assigned. So, we’re still looking at the report of several or many single cases with some sort of biased comparison group.
3) The case-control study: These don’t happen very often in fitness, but they crop up once in a while. “We looked at two groups of people: One group of people who were fat and another group of people who weren’t fat, and looked at whether they came from rich families or poor families. We found that there were more fat people who came from poor families and there were more thin people who came from rich families,” is a rough example of a case-control study. This type of study is different because we look backwards in time. We pick people who have already had an outcome of interest (the cases, in this case, fat people) and we also pick people who don’t have the outcome of interest (the controls, in this case, not fat people.) We look backwards in time at something we know happened before they got that way (in this example, parental income when they were a child) and look at whether that differs between the cases and the controls. This design can be very powerful, but is tricky to use if you don’t know what you’re doing.
4) The randomized controlled trial: This is the gold standard study: “We randomly put people into two groups and gave one group surgery, while the other group got physiotherapy. We found that the group that got surgery did better than the group that got physiotherapy.”
In the research world we have a scale of evidence quality that goes from Level 1a to Level 5, with Level 1a being the highest level of evidence. The well-designed randomized controlled trial falls in the Level 1 category. Everything else falls in the Level 2 or below category. I won’t bore you with the rest of the rating scale because it also depends on the quality of the study as to which Level you end up placing it.
If you can’t do a randomized controlled trial (and this situation is actually quite rare), then we would prefer to see prospective data with a control group (the parallel case-series design). After that, it’s a toss-up between the case-control and the case-series design, with the case-control coming out slightly ahead of the case-series.
So, as a reader of abstracts, this is usually information you can identify readily in an abstract. And that’s probably the most useful piece of information you can extract, because right away, you can mentally rank how good this evidence is going to be. A case-series design is automatically going to be less powerful than a randomized controlled trial, regardless of how well it’s done (yes, there are exceptions, but those are finer details that come with experience and more education). Realizing you have weaker evidence right off the bat can be a very useful way of deciding how much stock you’re going to put into it.
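If it helps to see that mental ranking written down, here is a minimal sketch in Python. It only encodes the rough ordering described above (randomized controlled trial, then parallel case series, then case-control, then case series). The names `StudyDesign` and `classify`, and the three yes/no questions, are my own illustrative assumptions and not a formal appraisal tool; real evidence grading also depends on how well the study was done.

```python
from enum import IntEnum


class StudyDesign(IntEnum):
    # Higher value = stronger design, following the rough ordering in the text.
    CASE_SERIES = 1
    CASE_CONTROL = 2
    PARALLEL_CASE_SERIES = 3
    RANDOMIZED_CONTROLLED_TRIAL = 4


def classify(has_comparison_group, randomized, looks_backward):
    """Guess the design from three yes/no questions (a hypothetical simplification)."""
    if not has_comparison_group:
        return StudyDesign.CASE_SERIES
    if randomized:
        return StudyDesign.RANDOMIZED_CONTROLLED_TRIAL
    if looks_backward:
        return StudyDesign.CASE_CONTROL
    return StudyDesign.PARALLEL_CASE_SERIES


def strongest_first(designs):
    """Sort designs from strongest to weakest."""
    return sorted(designs, reverse=True)


if __name__ == "__main__":
    # "We gave 10 people a supplement for 4 weeks": no comparison group.
    print(classify(False, False, False).name)
    # "We randomly put people into two groups".
    print(classify(True, True, False).name)
```

The point is not the code itself but the habit: answer those three questions from the abstract first, and you already know roughly how much weight the result can bear.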
The second piece of information that you can sometimes glean from an abstract is what population was studied, and whether you, or the clients you’re reading this for, fit in that population. If they don’t, the study automatically has less relevance for you as the reader. It doesn’t mean it’s useless, but it is definitely less useful. A good example of this is the classic women vs. men scenario where the vast majority of fitness studies have been performed with males only. Technically, a males-only study does not extend to females because none were studied! But you have to be able to ask (and possibly answer) the question of whether gender is actually important. Yes, there will probably be some kind of difference between genders, but there may also be benefit even if you are of the opposite gender to the one studied, unless there’s something critically different between women and men that would make the study not applicable to the opposite gender. Unfortunately, this judgement call is something you learn to make over time and requires an above-“normal” working knowledge of what differences might be important and why.
The sad part about this blog entry is that this is pretty much where it ends. The reality is that you can’t get a whole lot out of an abstract (as a general statement). I’ve read great abstracts in submitted manuscripts to a journal where I’ve stopped reviewing the manuscript and rejected the paper (i.e. the study does not get published) after the methods section. I barely even glance at the results section, because if the methods are critically flawed, then the results are invalid. Abstracts tend to be results-oriented because that’s how they entice the reader to dig out the article. So we have this paradigm where the really important part of the study is not the part that is emphasized in an abstract, which makes it even less possible to decide whether the results you’re so excited about are actually even worth reading.
Critical appraisal is as much a skill as program design, or physical skill acquisition. It requires a broad base of knowledge and consistent practice to do well. But as they say, “Practice doesn’t make perfect. Perfect practice makes perfect.” If it’s something you really want to do, then you should learn to do it from someone who already does it well, because I can tell you that even the vast majority of Bachelor’s-level “scientists” are inadequately prepared to do it.
My advice when it comes to abstracts is to interpret them with a healthy dose of caution. They were never meant to be used as “chunk-style” information upon which to base an action and really shouldn’t be used that way.