In my last post, I gave an overview of the basics of botanical research. To compile meaningful information, you frame your question, structure your inquiry, search the appropriate databases, store and organize your information for easy access, and understand that you have to interpret the information. This last step is the hardest part.
After you’ve completed your literature search, you’ll want to look at the quality and nature of the information in front of you. The best way to assess the strength of the evidence is to know the research methods that were used–and their limitations.
In vitro studies
In vitro means ‘in glass’ and describes laboratory research conducted on cells or molecules outside of their biological environment. It is the least expensive and easiest type of research to conduct, which is why it’s used in bioprospecting and pharmaceutical screening. Often, a group of researchers hears of a traditional use of a particular plant, and that’s their cue to investigate its biological activity, starting in vitro or in animals. These are also known as mechanism of action studies, as studying cells up close can shed light on how particular herbs or compounds affect them.
In vitro studies usually test isolates or purified extracts. Sometimes, as in the case of oncology bioprospecting at the National Cancer Institute, important compounds like tannins are removed (Mills & Bone 2013). Tannins bind nonspecifically to many proteins and enzymes, and removing them drastically changes an extract’s biological activity. These purified extracts or compounds are added to a culture medium and incubated with cells, and any changes are noted and documented. This brings us to a serious drawback of in vitro data: the difficulty of extrapolation.
For example, a 1999 in vitro study tested the effects of several botanical extracts (Echinacea, Ginkgo biloba, Saw palmetto, and St. John’s wort) on fertility. To do this, the researchers cultured hamster oocytes (eggs) with pretty high concentrations of these extracts (upwards of 0.6 mg/mL) and then tested sperm penetration (Ondrizek et al 1999). When you’re reading a study or abstract, try to envision what’s actually going on. Is it relevant? For reference, concentrations above 0.1 mg/mL are unlikely to be achieved in people taking herbs orally.
More troubling still is that you often can’t get product preparation or extraction details from an abstract, or sometimes even from the full text. I’m consistently surprised by how often authors neglect to describe the extract type (aqueous or ethanolic? crude herb?). This is especially common in the mushroom literature, where extraction methods are crucial yet strangely absent from the abstracts.
In vitro data should be carefully examined before conclusions are made. Again, this is usually Step 1 in the process of seeing if a particular herb merits further study.
Animal studies

Animal and in vitro studies are known as preliminary or pre-clinical data. Substances are tested on animals for safety, or the animals are given a test to assess the biological effects of the substance. Because rodents share so many genetic and biological similarities with humans, mice and rats are used in 90-95% of animal experiments. The mouse model is not optimal for everything, though. For example, recent studies have demonstrated its limitations in inflammatory conditions (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3587220/).
There are significant issues in animal research that every researcher or analyst needs to be aware of. There are animal welfare guidelines in place, but enforcement is far from perfect. Cruelty in animal research is still prevalent.
Though I understand the importance and utility of data generated from these models, these can be difficult papers for me to read and review. For that matter, there are ethical issues in every in vivo and clinical trial model. Even in vitro work is not exempt from humanitarian considerations, as was so eloquently highlighted for HeLa cells by Rebecca Skloot in The Immortal Life of Henrietta Lacks. Developments in tissue engineering (like lab-grown organs or organs-on-a-chip) will hopefully provide alternative experimental models and reduce the need for animal involvement in research.
Pilot studies

These are small-scale tests, usually with 10 people or fewer, that test procedures to be used in larger groups. Safety is usually first established through animal tests. These are also known as feasibility studies: if the intervention being tested is (most importantly) safe and may be effective, it progresses to larger trials with more people. Pilot studies are not really testing hypotheses quite yet. So if you see a small sample size (the “n” number) and there isn’t a control or placebo group, it’s a pilot study. And while it’s more relevant than animal or in vitro data, the small sample size makes it hard to extrapolate.
Randomized controlled trials
RCTs are the ‘gold standard’ of evidence-based medicine (EBM). These trials are usually double-blinded: both the participants and the researchers are unaware of who’s receiving the test substance and who’s on placebo.
Participants are assigned to experimental or control groups on a random basis in an effort to avoid bias. RCTs report statistically significant associations by reporting a P value. A P value is the probability of observing an effect at least as large as the one seen if chance alone were at work (that is, if the treatment did nothing). A P value below 0.05 is conventionally considered statistically significant, meaning there is less than a 5% probability of seeing such a result by chance alone. This is a point of confusion: many people think that P values indicate the probability of a treatment working, which they do not.
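To make the “probability of seeing this result by chance alone” idea concrete, here is a minimal sketch of a permutation test in plain Python. The outcome scores are made-up numbers, not data from any real trial; the point is only to show what a P value is counting.

```python
import random

random.seed(0)

# Hypothetical outcome scores for a treatment group and a placebo group
treatment = [7.1, 6.8, 7.4, 6.9, 7.6, 7.2, 6.7, 7.3]
placebo = [6.5, 6.9, 6.4, 6.8, 6.6, 7.0, 6.3, 6.7]

observed_diff = sum(treatment) / len(treatment) - sum(placebo) / len(placebo)

# Permutation test: if the group labels were meaningless (the "chance alone"
# scenario), how often would randomly shuffled labels produce a difference
# at least as large as the one we observed?
pooled = treatment + placebo
n = len(treatment)
trials = 10_000
count = 0
for _ in range(trials):
    random.shuffle(pooled)
    diff = sum(pooled[:n]) / n - sum(pooled[n:]) / n
    if diff >= observed_diff:
        count += 1

p_value = count / trials
print(f"observed difference: {observed_diff:.3f}, p = {p_value:.4f}")
```

With these (deliberately well-separated) numbers the shuffled labels almost never reproduce the observed difference, so the P value comes out well under 0.05. Note what it does not tell you: nothing here is the probability that the treatment “works.”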
RCTs can provide more practical data than the previously mentioned methods. They involve more people, and researchers attempt to eliminate confounding variables (things that might skew results). But there are serious difficulties with RCTs. The first is ethics. Is it ethical to allocate a patient to a placebo group when they could be offered a potentially life-saving (though experimental) drug? This issue became especially pertinent in the Promoting Maternal–Infant Survival Everywhere (PROMISE) study earlier this year, where the original study design would have allowed pregnant HIV-positive women to receive substandard treatment.
Secondly, there are many ways in which trial data can be manipulated. Statistical tests determine the results and conclusions, and there are many ways the raw data can be massaged (excluding dropouts, for example). (Anyone interested in clinical trial misconduct should read Ben Goldacre’s work, particularly Bad Pharma.) As Fred Menger put it, “If you torture data sufficiently, it will confess to almost anything.”
Finally, while RCTs give useful information because they involve more people, they are not representative of many patients. People are usually not eligible for participation if they have a comorbid condition. How many clients or patients do you know with only one neat health condition? In this way, pooling data from eHealth records can offer more real-world information. Another serious limitation of RCTs is that they do not provide explanations as to why something might work, only associations between an intervention and an outcome.
Observational studies

These are studies that observe a group of people to draw inferences about the effects of a treatment. This contrasts with an RCT, where investigators randomly assign people to a control or experimental group and test a substance. There are a few types of observational studies.
- Longitudinal studies tend to be utilized more in public health research. They involve repeated observations of the same variables (smoking and lung cancer incidence, for example) over time, usually for decades and sometimes for generations. They also involve large numbers of people, and it’s not uncommon for researchers to use these huge data pools to form new associations between different interventions (or supplements) and certain risk factors. These are usually prospective, following a group’s changes in health over time, but they can also be retrospective and look backwards. Variations include the cohort study (which follows people over a shorter time span) and the panel study (which looks at a cross section of a population).
- Case control studies are frequently used in epidemiology. For these, researchers look at two different groups exhibiting different health outcomes and try to trace back the cause through comparison of the cases.
Observational studies are infrequently used for botanical interventions, but they are common in research on supplement use and public health outcomes.
Case reports and series
Case reports are submitted to journals by one or more clinicians on a particular case. They are frequently the basis for adverse-effect and herb-drug interaction claims in botanical medicine, and they are notoriously prone to errors and serious bias. Fear-mongering in case reports is rampant, with titles like “Coma from the health food store: interaction between kava and alprazolam” and “Dying for a cup of tea”.
Careful examination of case reports is necessary for any serious review of botanical safety. Medical herbalist Jonathan Treasure (my teacher and mentor) came up with an evaluation scheme for these reports.
- Positive ID of the herb
- Adequate description of case
- Plausible pharmacological timing
- Other explanations ruled out
- Concomitant medications noted
- Confirmation by objective measures (e.g., serum levels)
- Plausible or established pharmacological mechanism
- ADR ceases on withdrawal of herb
- ADR event reproduced by rechallenge
- Previous exposure linked to same ADR
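The checklist above lends itself to a simple tally: the more criteria a case report satisfies, the more weight it deserves. Here is a hypothetical sketch of that idea in Python; the field names and the yes/no scoring are my own illustration, not Treasure’s published scheme.

```python
# Illustrative criteria names, paraphrased from the checklist above
CRITERIA = [
    "positive_id_of_herb",
    "adequate_case_description",
    "plausible_pharmacological_timing",
    "other_explanations_ruled_out",
    "concomitant_medications_noted",
    "objective_confirmation",
    "plausible_mechanism",
    "adr_ceases_on_withdrawal",
    "adr_reproduced_on_rechallenge",
    "previous_exposure_same_adr",
]

def score_case_report(report: dict) -> int:
    """Count how many checklist criteria a case report satisfies."""
    return sum(1 for c in CRITERIA if report.get(c, False))

# Example: a weak report that only identifies the herb and lists medications
weak = {"positive_id_of_herb": True, "concomitant_medications_noted": True}
print(score_case_report(weak))  # 2 of 10 criteria met
```

A report scoring 2 of 10 like this one tells you very little, whereas a report satisfying withdrawal, rechallenge, and objective confirmation would carry real causal weight.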
All of the research methods listed so far describe primary research, i.e. research that directly collects original data through an experiment or observation. Secondary research is the summary, collation, and/or synthesis of existing primary research.
Narrative and systematic reviews

Narrative reviews are a type of secondary research. They involve one or more authors surveying the current state of the evidence and writing a review. It’s customary for them to list their review methods (databases searched, keywords, etc.). But these often read as descriptive research reports and can come with varying degrees of bias (especially selection bias: picking preferred papers and research to include while ignoring others). They are not strong in terms of supporting an evidence base, but they can be helpful in educational settings and give you more references to check out.
Narrative reviews are especially prone to biases from the reviewer. Systematic reviews are more structured in their discovery and analysis of the literature. The Cochrane Collaboration is a research group that produces the best-quality systematic reviews available. This group is extremely thorough and has been known to directly confront researchers on trial or research misconduct. These reviews, however, usually conclude with “more studies are needed,” especially where botanicals are concerned.
Meta-analysis

A meta-analysis is similar to a systematic review in that it’s a highly structured analytical method using primary clinical trial data. However, there’s a key difference. Systematic reviews work from the conclusions of trials. Meta-analyses pool the trials’ data (summary effect estimates, or in individual-patient-data meta-analyses, the raw data) into one larger statistical analysis. Aggregating previous information this way gives a greater sense of the effect, with more statistical power.
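The pooling step can be sketched in a few lines. Below is a minimal fixed-effect, inverse-variance-weighted meta-analysis in plain Python; the per-trial effect sizes and standard errors are made-up numbers, not results from real trials.

```python
import math

# Hypothetical per-trial effect estimates (e.g., mean differences) and
# their standard errors; illustrative numbers only
trials = [
    {"effect": 0.30, "se": 0.15},
    {"effect": 0.10, "se": 0.20},
    {"effect": 0.25, "se": 0.10},
]

# Fixed-effect inverse-variance pooling: each trial is weighted by 1/SE^2,
# so larger, more precise trials contribute more to the pooled estimate
weights = [1 / t["se"] ** 2 for t in trials]
pooled = sum(w * t["effect"] for w, t in zip(weights, trials)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

print(f"pooled effect: {pooled:.3f} +/- {pooled_se:.3f} (SE)")  # ~0.241
```

Notice that the pooled standard error (about 0.077 here) is smaller than any single trial’s, which is exactly the “more statistical power” that pooling buys. A real meta-analysis would also test for heterogeneity between trials before trusting a fixed-effect pool.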
These are some of the ways research is conducted; together, they comprise an evidence base. In subsequent posts I’ll address some of the philosophical issues in applying evidence to a situation. For now, if you’re interested in these deeper considerations, I highly recommend the works of David Sackett and Nancy Cartwright.
Do you have questions about methods? Leave them in the comments!
Ondrizek RR, Chan PJ, Patton WC, and King A. An alternative medicine study of herbal effects on the penetration of zona-free hamster oocytes and the integrity of sperm deoxyribonucleic acid. Fertil Steril. 1999;71(3):517-22.
Mills S, Bone K. Principles and Practice of Phytotherapy. 2nd ed. London: Churchill Livingstone; 2013.