Those who do research are more or less familiar with the four stages of research fraud. It all begins with the demand for a p-value that can pass the significance test. The researcher works day and night trying to produce a dataset consistent with her/his hypothesis, but the numbers are marginal at best. She/he is eager to publish a good paper for graduation or a job offer, or to meet a grant application deadline. We all know what happens next: pushed by the PI or driven by desperation, she/he repeats the experiment many more times, selects the runs that fall in the right range to give a nice p-value, and discards the rest. This is the first stage: cherry-picking. When exposed or challenged, the researcher may defend the practice with the non-statistical theory of “removal of outliers”.
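As a back-of-the-envelope illustration (the function names and numbers here are hypothetical, not from any real study), a short simulation shows why this maneuver is not “removal of outliers”: both groups are drawn from the same distribution, so every “significant” result is a false positive, yet repeating the experiment and keeping only the best p-value pushes the false-positive rate far above the nominal 5%.

```python
# Minimal sketch of cherry-picking, using a large-sample z-approximation
# to the two-sample t-test so that only the standard library is needed.
import random
import statistics


def two_sample_p(a, b):
    """Approximate two-sided p-value for equal means (z-approximation)."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    z = (statistics.mean(a) - statistics.mean(b)) / se
    return 2 * (1 - statistics.NormalDist().cdf(abs(z)))


def false_positive_rate(repeats_per_result, trials=2000, n=30, seed=0):
    """Fraction of 'significant' findings when both groups share one distribution."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Run the "experiment" several times and keep only the best p-value,
        # exactly as the cherry-picker does.
        best_p = min(
            two_sample_p(
                [rng.gauss(0, 1) for _ in range(n)],
                [rng.gauss(0, 1) for _ in range(n)],
            )
            for _ in range(repeats_per_result)
        )
        hits += best_p < 0.05
    return hits / trials


if __name__ == "__main__":
    print("honest, 1 run per result:       ", false_positive_rate(1))
    print("cherry-picked, best of 5 runs:  ", false_positive_rate(5))
```

With one run per result the rate stays near the nominal 5%; picking the best of five runs multiplies it several-fold, roughly 1 − 0.95⁵ ≈ 23%.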
If the data make it through to publication, the researcher feels comfortable continuing to cherry-pick. Before long the practice becomes legitimate in her/his mind. As the researcher takes on more projects, she/he wants to do it more efficiently. If cherry-picking is allowed (it’s not), she/he reasons, why not “borrow” controls from other sets of experiments? They have a similar design, and the control results should be the same, right? Better still, that cuts the number of samples in half and saves 50% of the time! From there, we start to see copied-and-pasted images, or even numbers, in the papers. This is the second stage: copy-and-paste.
As the researcher grows accustomed to the maneuvers of cherry-picking and copy-and-paste, she/he becomes more confident about generating papers from smaller sets of preliminary experiments. With the taste of successful publication and the stress of higher expectations, she/he proposes bolder hypotheses and projects that promise even greater impact on the field. Sooner or later she/he encounters a situation where the results are insufficient to prove the hypothesis, or the data fall outside the expected range, and even the two performance-enhancing maneuvers cannot rescue them. The researcher is confident the hypothesis is right, since all her/his previous studies prove it, and absence of evidence is not evidence of absence. She/he decides to fill in the blank with the ideal number in her/his mind. This is the third stage: blank-filling.
This strategy pays off. The researcher publishes papers in high-profile journals with amazing efficiency, secures more funding, and expands the laboratory. The success earns her/him promotion on the tenure track and brings in more administrative power. She/he notices that the studies built on cherry-picking, copy-and-paste, and blank-filling are still cited by other papers all the time; therefore, her/his instinct for scientific research must be extraordinary, and that must be why she/he has become so successful. By this point the researcher can no longer devote full time to research and mentoring. When the graduate students and postdocs show her/him the data, she/he gets a strong feeling about how things should develop and look, so she/he tells them that is what she/he wants to see in their next presentation. Under the stress, more blanks are filled, until a nobody from the internet finds that the whole thing is fabricated. The institute initiates an investigation; the PI admits the mistake but argues that it was technically an “unintentional” one, hoping that administrative power will provide some buffer. In the end, some unfortunate first authors are indicted academically to save the PI’s career, which has already become inseparable from the institute.
This process of cherry-picking, copy-and-paste, blank-filling, and total fabrication is uncannily similar to the evolution of cancer. Stress forces the output of stem cells, increasing replicative errors in them. As the errors accumulate, neutral drift safeguards the rise of a specific subclone. When the stem cell niche deteriorates, however, subclones must compete for resources. Mutant subclones that have become less regulated exhibit advantages over normal cells. Their “success” requires the support of more resources, and further clonal expansion allows selection of mutations that confer even more dysregulated advantages. From there it spirals all the way to total loss of control: invasion, dissemination, and metastasis, until the cancer kills the host.
Like conventional therapies for many types of malignancy, after-the-fact investigation and punishment have limited efficacy in eliminating, or even just controlling, the recurrence of research fraud. After all, the cost of committing fraud is far too low relative to its payoff. Multicellular organisms, however, have developed cancer-suppression mechanisms over their long evolutionary history. First, the integrity of the stem cell niche is well maintained, preventing mutant clones from becoming competitive through dysregulation. Second, “caretaker” genes enforce quality control of replication to minimize errors, and “gatekeeper” genes shut down cell growth when errors are made and identified. Third, even if mutant cells escape the tumor suppressor genes, they may still be removed by immunosurveillance.
We should learn from the wisdom of evolution. Research institutes should abandon the current rent-seeking strategy, which focuses solely on pushing researchers to bring in more grants. Just as stress forces stem cells to accumulate replicative errors, such a strategy creates incentives for research fraud. Instead, institutes need to take responsibility for supporting and nurturing researchers and for providing infrastructure, the way organisms maintain the integrity of stem cell niches. Research institutes and publishers should invest in the expertise and tools to detect data fraud before accepting study results; relying entirely on volunteer peer review is cheap and lazy. Finally, funding agencies should encourage the publication of reproducibility studies, even making it a requirement for grant renewal.