Hugh G.G. Townsend (He/Him)

Professor Emeritus, Department of Large Animal Clinical Sciences, University of Saskatchewan
  • Canada

Channels contributed to:

Behind the Paper

Recent Comments

Oct 11, 2025

Thank you for drawing attention to PREPARE: Guidelines for planning animal research and testing, an important document and associated website that offers much valuable guidance for researchers during the design and conduct of animal studies. Of particular relevance to our paper is the detailed supplemental information linked at norecopa.no, which expands on Section 4—Experimental design and statistical analysis.

Our paper, however, focuses specifically on the four fatal flaws that compromise the validity of most, if not all, laboratory animal experiments. If these flaws—related to failures in full randomization, full blinding, control of cage effect, and the use of the correct unit of analysis—are not rigorously addressed at the design and analysis stages, the resulting data are inherently unreliable and cannot support valid statistical inference.

Given the opportunity, we will strongly recommend that funding agencies verify that these core design elements are in place before a study proceeds to peer review. Without this foundation, even the best-intentioned initiatives such as PREPARE and ARRIVE cannot ensure that results will be credible, repeatable, reproducible and translatable to human and veterinary medicine.

Oct 10, 2025

Thank you very much for your comments. These are very important points.

Getting this message out and overcoming deeply ingrained approaches to study design that stretch back more than 100 years will be very challenging. Our initial action has been to publish our paper and, through this, give the scientific community a clear description of our position on this issue. In the coming days we will make a joint press release aimed at getting the attention of the scientific community and the general public. At the same time, we are opening discussions with major funding agencies, as we believe they can act as a major control point to ensure that poorly designed research does not receive their support. To keep our thinking practical, we are meeting with various laboratory groups and ethics boards to get their reaction to our ideas and recommendations. We are also meeting with university officials to discuss the establishment of training programs aimed at producing experts in study design and analysis. Where we go after that is still being considered, and we will certainly pay attention to your recommendations and any others that we receive.

We are fully supportive of the ARRIVE guidelines, though our emphasis differs in important ways. ARRIVE focuses on the publication process, while our concern is with the planning stage of research. For this reason, we are more prescriptive and less flexible about critical sources of bias, such as blinding, randomization, cage effects, and the correct unit of analysis. For reporting, this flexibility makes sense: no study design can anticipate every possible source of bias, so some level of imperfection will always need to be tolerated in the reporting of research.

Oct 09, 2025

Thank you very much for your thoughtful and constructive comments.

We agree that graduate students in the biomedical sciences need a reasonable knowledge of statistics. A basic grounding in statistics is essential for designing unbiased studies that produce data suitable for analysis. This does not mean every investigator must become a statistician. Rather, it means they should understand enough to know when expert support is needed, and to be able to critically evaluate the validity of studies on which their own work depends. We also agree that dedicated training in study design would be an invaluable, if not essential, addition to their graduate education and would greatly strengthen the long-term success of future researchers.

Your point about the importance of a clearly stated null hypothesis is well taken. In reviewing the literature, however, we found no examples where laboratory animal studies specified a null hypothesis. One reason is that these studies are seldom structured as true experiments, testing a single factor against a single outcome. Instead, they often include multiple factors and outcomes—many chosen only after data collection. In this sense, most laboratory animal studies resemble observational studies, which, if properly designed, are useful for generating ideas but not for formal hypothesis testing.

We also appreciate your suggestion about recognizing when data may not follow a normal distribution and when non-parametric methods might be more appropriate. We did not explore this in our paper because, in practice, we rarely found studies where the design allowed for a valid statistical analysis of any kind. Our main goal was to highlight design flaws that make proper analysis impossible from the outset. Once data from more rigorously designed studies are available, we believe a next step will be to examine how statistical methods—including parametric and non-parametric approaches—are applied or misapplied in laboratory animal research.

Oct 08, 2025

Thank you for taking the time to share these important perspectives from the cancer research field. We are pleased to have the opportunity to address your challenges, and we agree with several of the points you raise, particularly that:

  • Cancer is indeed a heterogeneous disease, and no single mouse model can represent the diversity of human tumors. Using a panel of models is a sensible way to address this.
  • Housing design matters: co-housing mice that are assigned to different treatments in the same cage may bias results if there is a risk of clinically significant cross-contamination with pharmaceutical products through urine and faeces. Under these circumstances, treatment must be randomly assigned to multiple cages, each housing one or more mice. One mouse per cage is costly and not considered humane; more than one mouse per cage is more humane but even more costly. Commingling is therefore best unless you know that cross-contamination will adversely affect the results. When treatment is assigned to the cage, the sample size is the number of cages assigned to each treatment, not the number of mice; treating the individual animal as the unit of analysis produces pseudoreplication of data.
  • Sample sizes and model selection must address heterogeneity and generalizability to the clinical setting.
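To make the unit-of-analysis point above concrete, here is a minimal sketch (not taken from the paper; all group sizes, labels and outcome values are hypothetical) showing how mouse-level data collapse to one value per cage when treatment is assigned to whole cages, so that n equals the number of cages:

```python
# Illustrative sketch: why the cage, not the mouse, is the unit of analysis
# when treatment is assigned per cage. All numbers are hypothetical.
from statistics import mean

# Hypothetical tumor-volume outcomes: 2 treatment arms, 3 cages per arm,
# 4 mice per cage (inner lists are mice within one cage).
cages = {
    "T1": [[0.9, 1.1, 1.0, 1.2], [1.3, 1.1, 1.2, 1.0], [0.8, 1.0, 0.9, 1.1]],
    "T2": [[1.6, 1.8, 1.7, 1.9], [1.5, 1.4, 1.6, 1.7], [1.8, 1.6, 1.7, 1.5]],
}

def cage_level_summary(groups):
    """Collapse mouse-level data to one mean per cage; n = number of cages."""
    summary = {}
    for arm, cage_list in groups.items():
        cage_means = [mean(c) for c in cage_list]
        summary[arm] = {"n": len(cage_means), "cage_means": cage_means}
    return summary

s = cage_level_summary(cages)
# Correct n per arm is 3 (cages), not 12 (mice); treating the 12 mice as
# independent observations would be pseudoreplication.
print(s["T1"]["n"])  # 3
```

Any between-arm comparison would then be run on the three cage means per arm, not the twelve individual mice.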

At the same time, we respectfully disagree with the suggestion that randomization, blinding, controlling for confounding, and statistical testing are unnecessary in this context. Panels of models are only as reliable as the individual experiments that comprise them. If each study within the panel is biased in design or execution, the conclusions across the panel will also be biased.

  • Randomization remains essential to avoid allocation bias, even in multi-model trials. Without it, treatment effects may be confounded by systematic differences in how animals are assigned. Even in single mouse-model studies, randomization to cage and treatment is essential to ensure that the treatment groups are balanced with respect to all confounders that cannot be controlled through analysis. Further, any other procedures (e.g. cage placement, sample processing, histology) that may bias outcomes must also be addressed through randomization.
  • Blinding is generally feasible at the stage of outcome assessment (e.g., tumor measurement, histology, imaging), even if treatment administration cannot be blinded. This greatly reduces the risk of detection bias. That said, failure to blind investigators during any procedure that may influence outcome assessment, for any reason (e.g. because it is challenging, inconvenient or simply not possible), increases the risk of bias. As with randomization, blinding must be complete: everyone with any potential to influence study outcomes, including the statistician, must be blind to group assignment.
  • Statistical analysis is still required to determine whether observed differences are larger than expected by chance. Descriptive evaluation alone is insufficient when making efficacy claims, especially in studies intended to inform human trials.
  • Cage effects will be present in every laboratory animal study and must be addressed appropriately during both the design and the analysis of every study.
  • Correct unit of analysis – This is addressed above. Pseudoreplication was present in all but a small number of the studies that we have reviewed.
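The full-randomization step described above can be sketched in a few lines. This is an illustrative example only (the function name, group sizes and animal IDs are hypothetical, not from the paper): mice are randomly allocated to cages and cages are randomly allocated to treatments, so that allocation cannot track any confounder:

```python
# Illustrative sketch: randomize mice to cages, and cages to treatments.
# All names and group sizes below are hypothetical.
import random

def randomize(mouse_ids, treatments, cages_per_treatment, mice_per_cage, seed=None):
    """Return {cage_id: {"treatment": ..., "mice": [...]}} via random allocation."""
    rng = random.Random(seed)
    n_cages = len(treatments) * cages_per_treatment
    assert len(mouse_ids) == n_cages * mice_per_cage, "animal count must match design"
    ids = list(mouse_ids)
    rng.shuffle(ids)                      # random ordering of animals
    cage_treatments = [t for t in treatments for _ in range(cages_per_treatment)]
    rng.shuffle(cage_treatments)          # random cage-to-treatment assignment
    plan = {}
    for cage in range(n_cages):
        plan[f"cage{cage + 1}"] = {
            "treatment": cage_treatments[cage],
            "mice": ids[cage * mice_per_cage:(cage + 1) * mice_per_cage],
        }
    return plan

# Hypothetical design: 24 mice, 2 arms, 3 cages per arm, 4 mice per cage.
plan = randomize([f"m{i}" for i in range(24)], ["control", "drug"], 3, 4, seed=1)
```

The same shuffling approach extends to other procedures (cage placement on the rack, sample processing order) whenever their order could bias outcomes.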

We therefore believe that rigorous experimental design principles apply to cancer models just as they do to other preclinical studies. Panels of tumor models can improve representativeness, but studies that do not address the above issues within each model will yield unreliable results and conclusions. As emphasized in our paper, such studies are unethical because they waste animal lives, resources and (in many cases) public funds, and they lead to the miseducation of young scientists.

We hope this is helpful and look forward to further discussions.
