Eight years after it began, a project attempting to reproduce the results of major cancer biology studies has finally reported its findings. Cancer research, like the social sciences, appears to have a replication problem.
The goal of the Reproducibility Project: Cancer Biology was for researchers with the group to replicate 193 experiments from 53 high-profile cancer papers published between 2010 and 2012. In two papers published December 7 in eLife, the team reports that only about a quarter of those experiments could be reproduced.
The researchers could not complete the majority of the experiments because they were unable to get enough information from the original papers or their authors about the methods used, or to obtain the materials needed for replication.
In addition, the effect sizes in the 50 experiments from 23 papers that could be reproduced were, on average, 85 percent smaller than those reported in the original experiments. Effect size measures the magnitude of an effect: two studies might both find that a given chemical kills cancer cells, but in one experiment the chemical kills 30 percent of the cells while in the other it kills 80 percent. The effect in the second experiment is more than twice the size of the effect in the first.
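To make that comparison concrete, here is a minimal, purely illustrative Python sketch (not drawn from the project's analysis code; the figures are the hypothetical ones from the example above) showing how much smaller one effect is than another.

```python
# Illustrative only: compare a hypothetical larger effect with a hypothetical
# smaller one, using the made-up figures from the example above.

def shrinkage(larger_effect: float, smaller_effect: float) -> float:
    """Fraction by which the smaller effect falls short of the larger one."""
    return 1 - smaller_effect / larger_effect

larger = 0.80   # e.g., 80 percent of cancer cells killed in one experiment
smaller = 0.30  # e.g., 30 percent killed in the other

print(f"The smaller effect is {shrinkage(larger, smaller):.1%} smaller than the larger one.")
# Output: The smaller effect is 62.5% smaller than the larger one.
# For comparison, the project found replicated effects were on average 85% smaller.
```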
The success of each replication was assessed against five criteria. One, for example, was whether both the original and the replicated experiment yielded statistically significant results. The experiments that could be carried out tested 112 effects in all; the researchers found that only 46 percent of those, or 51 effects, met more criteria than they failed.
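As a rough sketch of how a "met more criteria than it failed" tally could work, consider the snippet below. It is an assumption for illustration only: apart from the statistical-significance check mentioned above, the criterion names are placeholders, not the project's actual five measures.

```python
# Hypothetical tally of replication criteria; most names are placeholders,
# not the project's actual five measures.

def mostly_successful(criteria: dict[str, bool]) -> bool:
    """True if a replication met more criteria than it failed."""
    met = sum(criteria.values())
    return met > len(criteria) - met

example_effect = {
    "both_statistically_significant": True,   # the criterion described above
    "placeholder_criterion_2": True,
    "placeholder_criterion_3": False,
    "placeholder_criterion_4": True,
    "placeholder_criterion_5": False,
}

print(mostly_successful(example_effect))  # True: 3 of 5 criteria met, 2 failed
```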
“The report tells us a lot about the culture and realities of the way cancer biology works, and it is not a flattering picture at all,” says Jonathan Kimmelman, a bioethicist at McGill University in Montreal, who coauthored an accompanying commentary examining the ethical issues the project raises.
Experiments that cannot be replicated are a risky basis for launching drug development or clinical trials, Kimmelman adds. If the science on which a drug is based is unreliable, “it means that patients are needlessly exposed to drugs that are unsafe and that really do not even have a shot at making an impact on cancer,” he says.
Discouraging as the findings are, Kimmelman cautions against jumping to conclusions about the current cancer research system. In fact, he admits, “we have no idea how well the system is working.” Since no system can replicate every study flawlessly, one of the many questions the project leaves unanswered is what an optimal replication rate in cancer research would be. That, in his view, is a moral issue. “It is a question of policy,” he says; the answer is not really a scientific one.
Extensive inefficiencies in preclinical research may be hindering the drug development pipeline, says Tim Errington, director of research at the Center for Open Science, who led the project.
Roughly 14 out of every 15 drugs that enter clinical trials never win approval from the US Food and Drug Administration (FDA). Sometimes that is because a drug lacks commercial potential, but more often it is because the evidence of safety and efficacy required for approval is missing.
Some of that failure is to be expected. “We are humans trying to understand complex disease, we will never get it right,” Errington says. But in light of the cancer reproducibility project’s findings, perhaps “we should have known that we were failing earlier or maybe we do not understand actually what is causing [an] exciting finding,” he adds.
Neither a failure to replicate nor a successful replication settles whether an original study was valid, points out Shirley Wang, an epidemiologist at Brigham and Women’s Hospital in Boston and Harvard Medical School; that is something the reproducibility effort itself also emphasizes, she explains.
Scientists must still evaluate a study’s methods to judge whether they are objective and rigorous, adds Wang, who was not involved in the project but assessed its findings. And when the outcomes of original experiments and their replications differ, she says, there is an opportunity to learn why.
Errington and his colleagues have previously reported portions of the project’s findings in other publications, but this is the first time the full results of the Reproducibility Project: Cancer Biology have been released.
The effort encountered a number of challenges, among them the fact that the published papers did not contain enough information about how to reproduce the original experiments. So the researchers contacted the original authors for further details.
According to the project, 41 percent of those authors were extremely or very helpful, while 30 percent were not helpful at all. Some experiments, for example, required a mouse model bred specifically for the original study. When those animals were not shared with the reproducibility project, Errington says, replication was impossible.
Brian Nosek, executive director of the Center for Open Science and a coauthor on both papers, says some scientists were downright hostile to the idea of independent researchers trying to repeat their work. That mindset, according to Nosek, is the product of a research culture that places a higher value on novel findings than on replication and that rewards academics who publish over those who cooperate and share data.
Replication may feel threatening to some scientists partly because it is so rare; if it were commonplace, Nosek says, people would not see it as a danger. It can also be daunting because findings are tied to scientists’ livelihoods and even identities. Publication is the “currency of advancement,” says Nosek, and it can lead to funding, a new job and a lasting career. “That rewards system does not neatly fit replication.”
In several cases, even authors who wanted to help could not share their data, for reasons ranging from lost hard drives to intellectual property restrictions to data that only former graduate students could access.
For many years, specialists in a variety of fields have been warning of a “reproducibility crisis” in science, with psychology drawing the most attention. The drug companies Bayer and Amgen reported in 2011 and 2012 that they had trouble reproducing the findings of preclinical biomedical studies.
Replicating important experiments is one proposed remedy, but not everyone agrees on the best way to go about it, or even on what exactly is wrong with how science is done.
Yvette Seger, director of science policy at the Federation of American Societies for Experimental Biology, argues that at least one clear, actionable conclusion emerges from the new findings: scientists must be given the opportunity to explain the specifics of their study methods.
Scientists should strive to provide as much information as possible about their experiments so that others can understand their findings, adds Seger, who was not involved in the reproducibility effort.
For science to be self-correcting, the project’s researchers argue, there must be ample opportunity both to make mistakes and to catch them, including by replicating studies.
The public generally understands science well and, Nosek believes, also recognizes that science will make mistakes. The question that should and must concern people, he says, is whether science is good at catching its mistakes. The cancer reproducibility project’s results do not entirely answer that question, but they do highlight how difficult it is to find out.