0. Basic questions
In studies of diseases, there are two types of models: evolutionarily conserved systems and statistically effective systems.
The first type is based on systems whose driving factors were selected by evolution. Adaptive immunity, for example, is selected and conserved across mammals. T cell subclones work by the same principles in mice and humans, as both species are prone to similar types of viral infection.
The second type is based on effective sampling to represent the whole population. Yes, clinical trials are models of the general patient population. The variation within a trial cohort is assumed to occur in the same manner in the whole patient population.
When we talk about models, we need to understand what modeling is and how it is run. Arguments like "mouse is not human" and "genetic diversity is impossible to model" are not only misunderstandings; taken seriously, they would invalidate all models, including in vitro and in silico models.
1. Mismatch leads to misunderstanding
Some claim that "many drugs cured cancer in mice, but most of them failed in clinical trials, so mouse studies are useless."
This is simply not true. During my career, I've run more than 100 preclinical studies of therapeutic agents (including chemotherapies, targeted therapies, and immunotherapies) using mouse models of melanoma, breast cancer, and lung cancer. Very few of them showed a response, whatever the kind of therapy.
So where is the misunderstanding coming from?
- The settings and endpoints of mouse studies are very often different from those of clinical trials. Therefore, the outcomes of mouse and human studies cannot be compared directly. This is the translational gap.
- Many mouse models are engineered to test specific mechanisms or hypotheses. They represent very special cases of the disease, and their study outcomes cannot be generalized to general patient cohorts.
- The number of mice in a preclinical study is often much smaller than the number of patients enrolled in a clinical trial. The statistical power of preclinical studies is therefore quite limited, and translating their results to clinical studies needs help from biostatisticians. That help is lacking in many publications.
Many of these details deserve further discussion or even a commentary for publication. Most importantly, however, these issues appear in every type of model. Replacing one type of model with another will not solve them. It will just repeat them.
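To give a feel for how group size limits statistical power, here is a minimal sketch using the normal approximation for a two-sided comparison of two group means. The function name `approx_power`, the standardized effect size of 1.0, and the group sizes are illustrative assumptions, not values from any of the studies mentioned here. For very small groups the approximation is optimistic compared with a t-test.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group, effect_size, alpha=0.05):
    """Approximate power of a two-sided comparison of two group means.

    Normal approximation; `effect_size` is the standardized difference
    between the groups (difference in means divided by the common SD).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # The expected test statistic grows with the square root of group size.
    shift = effect_size * sqrt(n_per_group / 2)
    return z.cdf(shift - z_crit)


# Hypothetical group sizes: two typical mouse cohorts and a trial-sized arm.
for n in (5, 10, 100):
    print(n, round(approx_power(n, effect_size=1.0), 2))
```

With the same assumed effect size, the power rises steeply from 5 to 10 to 100 animals per group, which is the gap a biostatistician has to account for when translating a mouse result.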
2. Settings of the preclinical study and their translation
In most animal cancer models used in preclinical trials of anti-cancer therapies, the tumors are implanted at the body surface. This is done for convenient measurement of tumor size; it is a technical maneuver rather than a biological approach. I will discuss the biological relevance of animal model design in the next post.
Let's use the subcutaneous transplantation mouse model as the example for discussion. Treatment starts when the implanted tumors reach a pre-determined size (e.g. 75 mm^3). Usually, the endpoint is reached when the tumor grows to a second pre-determined size or the host shows signs of sickness. The time from tumor implantation to the endpoint is defined as survival time. The "prolonged survival" is therefore actually the tumor growth delay in the treated group vs. the control group.
We need to understand that even when growth is significantly delayed compared to the control, the tumors still keep growing in most cases. When we project this situation onto human patients, it is called progressive disease, which means there is no efficacy. If the growth delay involves a period of stable tumor size, it may be comparable to progression-free survival (PFS). However, that is not the case in most published studies. It is therefore obvious why drugs that showed wonderful "efficacy" in such mouse studies frequently failed in clinical trials: what was measured is not efficacy as defined for human patients. The mouse models predicted the right results; the investigators misread them.
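The difference between "growth delay" and a clinical response can be sketched with a few lines of code. The tumor volumes below are made-up numbers, the 1500 mm^3 endpoint is a hypothetical value, and the helper names are illustrative only; day 0 is the start of treatment.

```python
def day_endpoint_reached(measurements, endpoint_mm3):
    """Return the first measurement day at which the volume reaches the endpoint."""
    for day, volume_mm3 in measurements:
        if volume_mm3 >= endpoint_mm3:
            return day
    return None  # endpoint not reached during the observation period


def keeps_growing(measurements):
    """True if every measured volume is larger than the previous one."""
    volumes = [volume for _, volume in measurements]
    return all(later > earlier for earlier, later in zip(volumes, volumes[1:]))


ENDPOINT_MM3 = 1500  # hypothetical endpoint size

# Hypothetical (day, volume in mm^3) series, both starting at 75 mm^3.
control = [(0, 75), (5, 230), (10, 700), (15, 2100)]
treated = [(0, 75), (5, 120), (10, 200), (15, 330), (20, 550), (25, 900), (30, 1500)]

delay_days = (day_endpoint_reached(treated, ENDPOINT_MM3)
              - day_endpoint_reached(control, ENDPOINT_MM3))
print("growth delay (days):", delay_days)
print("treated tumor never stopped growing:", keeps_growing(treated))
```

In this invented example the treated group "survives" twice as long, yet the tumor grows at every single measurement, which in a patient would be read as progressive disease.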
3. Source of variation in mouse cancer models, and the good practice for reproducibility
Every model is limited in the range of what it can model. Injection of cancer cells into mice is not how cancer occurs in human patients. As soon as cancer cells are implanted into mice, the following happens:
1. More than 90% of the cancer cells will die from lack of nutrients and attachment, and/or from innate or adaptive immunity. This causes variation in the number of mice in which tumors can grow (the so-called "tumor take rate").
2. In immunocompetent mice, inflammation continues until the expanding tumor has established an immunosuppressive environment and enters the exponential growth phase. Subtle differences in immunity, even among inbred mice, can result in significant variation in the starting time of the exponential growth phase. This is why diverse growth patterns always appear in preclinical studies using immunocompetent mouse cancer models. A uniform tumor initiation pattern in published data is always suspicious.
3. The timing of the start of therapeutic treatment. Any tumor needs to build an immunosuppressive environment to keep growing. In transplanted mouse cancer models, this usually corresponds to a reliably measurable tumor size (e.g. at least ~20 mm^3). In many published studies, the therapeutic treatment started at day 3 after tumor implantation. The strong inflammation at that point can make not-yet-established cancer cells vulnerable to any stress, so the therapies appear to work especially well. This is artificial efficacy.
4. How to run a study with all of these variations? The purpose is to test therapies on tumors that are as similar as possible to human tumors. Therefore, cancer cells should be injected into more mice than the number required for the study. The actual number of mice to inject depends on the tumor take rate. For example, if the tumor take rate for a specific melanoma cell line in C57BL/6 mice is 90%, and the study needs 10 mice (the statistical power needs to be explained in another post), it's better to inject 15 mice. After a period of tumor growth, 10 mice whose tumors have reached, on average, the pre-determined size (e.g. 75 mm^3 in my protocol) are included in the study. As with patient inclusion in a clinical trial, the mouse selection process needs to be pre-defined in the study protocol, and repeat studies need to follow the same protocol.
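The over-injection rule in point 4 can be made a little more explicit with a simple binomial model. This is only a sketch under stated assumptions: each mouse is treated as an independent trial with the same take rate, the 95% confidence target is an arbitrary choice, and the function names are illustrative. It accounts only for tumor take, not for the additional selection on tumor size.

```python
from math import comb


def prob_enough_tumors(n_injected, n_needed, take_rate):
    """Probability that at least `n_needed` of `n_injected` mice grow a tumor."""
    return sum(
        comb(n_injected, k) * take_rate**k * (1 - take_rate) ** (n_injected - k)
        for k in range(n_needed, n_injected + 1)
    )


def mice_to_inject(n_needed, take_rate, confidence=0.95):
    """Smallest number of mice to inject so that enough tumors take.

    `confidence` is an assumed planning target, not a standard value.
    """
    if not 0 < take_rate <= 1:
        raise ValueError("take_rate must be in (0, 1]")
    n = n_needed
    while prob_enough_tumors(n, n_needed, take_rate) < confidence:
        n += 1
    return n


print(mice_to_inject(10, 0.90))                    # minimum under this model
print(round(prob_enough_tumors(15, 10, 0.90), 4))  # margin when injecting 15
```

Under this model the minimum comes out below 15, so injecting 15 mice leaves spare animals for the second filter, the pre-determined tumor size window, which the binomial model does not cover.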
A study protocol that includes these considerations can ensure reproducibility. Before discussing the "reproducibility crisis", we should ask: what's in your study protocol?
4. Scaling of preclinical models: the most important factor in the translation of modeling output
Many people like to say "mouse is not human" when arguing about models. However, what is the biggest difference between mouse and human? No, it's not species. That is a lazy answer, like saying "A is not B because B is not A". The answer is size. Size dictates the ranges of metabolic rate, lifespan, fertility, immune response, etc., which determine the "design" of genetic circuits. If this topic interests you, please go read "Scale" by Geoffrey West.
Scaling is the basis for interpreting the results of preclinical studies using mouse models. Let's start with therapeutic outcome. We often see a therapeutic target claimed to be "effective" because its inhibition caused a tumor growth delay of 10 days in mice. Mice can live up to two years; in a trivially over-simplified manner, one day for a mouse is equivalent to about one week for a human. A period of 10 days would then be equivalent to 10 weeks, or about 2.5 months, in humans, a difference that is often insignificant (within the error range) in clinical trials.
Next is the dose of the therapeutic agent. The maximum tolerated dose (MTD) is often used in mouse studies. When it is scaled to human body size, the equivalent dose is often toxic for humans. Without a PK study to identify a humanized dose, the mouse study is meaningless.
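As a rough illustration of dose scaling, the sketch below uses the widely cited body-surface-area conversion, with Km factors of 3 for mouse and 37 for an adult human. This is a first-pass convention, not the method used in my studies, and it does not replace a PK study. The 50 mg/kg mouse dose is a hypothetical example.

```python
# Km factors (body weight / body surface area) from the commonly used
# body-surface-area conversion; treat them as approximate conventions.
KM = {"mouse": 3, "human": 37}


def human_equivalent_dose(mouse_dose_mg_per_kg):
    """Convert a mouse dose (mg/kg) to a human-equivalent dose (mg/kg)."""
    return mouse_dose_mg_per_kg * KM["mouse"] / KM["human"]


mouse_mtd = 50.0  # hypothetical mouse MTD in mg/kg
print(round(human_equivalent_dose(mouse_mtd), 2), "mg/kg")
```

The point of the conversion is that a mg/kg dose cannot be carried over one-to-one: on this convention the human-equivalent dose per kg is roughly twelve times lower than the mouse dose.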
Let's also consider efficacy. If we model tumor growth in mice with an exponential growth equation, and assume that drug treatment reduces the cell number without altering growth capacity, then the growth delay is the time the surviving cells need to regrow to the pre-treatment number. We can therefore calculate the percentage of killed cells from the length of the growth delay. Very roughly, if the tumor grows by a factor of 1.25 per day, a growth delay of 10 days corresponds to killing about 90% of the tumor cells (1 - 1.25^-10 ≈ 0.89). The relation between tumor response and survival needs to be scaled to human size to translate the preclinical results to human patients.
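The arithmetic behind this estimate can be written out in a few lines. This is a minimal sketch of the exponential-growth reasoning only: it assumes a constant daily growth factor and that the surviving cells regrow at the same rate as before treatment. The function name is illustrative.

```python
def killed_fraction(daily_growth_factor, delay_days):
    """Fraction of tumor cells killed, inferred from the growth delay.

    Under pure exponential growth, the survivors need `delay_days` to
    regrow to the pre-treatment cell number, so the surviving fraction
    is daily_growth_factor ** (-delay_days).
    """
    surviving = daily_growth_factor ** (-delay_days)
    return 1 - surviving


print(round(killed_fraction(1.25, 10), 3))
```

Read the other way round, even a 90% cell kill buys only about 10 days of delay for a tumor growing at this rate, which is why growth delay alone says little about durable response.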
Currently, when discussing in vitro models to replace animal models, scaling is hardly mentioned. This will become a problem for the translation of "new" models.
5. Representation of models
When someone says "mouse models are not working", we have to ask: which specific disease does the model in your comment represent?
The first layer is disease subtype. For example, breast cancer has several subtypes (luminal, HER2-positive, triple-negative, etc.), and they respond to therapies very differently. Modern clinical trials are designed to test therapies in selected patient cohorts, including selection by disease subtype. It would be ridiculous if model design did not consider it.
Second, is the model developed from the matched tissue of origin? In the case of melanoma, UV-induced cutaneous melanoma, acral melanoma, melanocytic melanoma, mucosal melanoma, and uveal melanoma have different cells of origin. Everyone uses the mouse B16 and human A375 melanoma cell lines. What are their cells of origin? If we do not know the answer, how can we translate the results of tests on them?
Third, if we don't know the tissue of origin of a model such as B16, we need to characterize its genotypes and phenotypes, or at least figure out which signaling pathways and immunophenotypes it can represent. In fact, B16 cells have mutations in Gnas and a defective antigen presentation pathway. We can infer that B16 can represent melanomas with similar features.
These considerations apply to any model. It won't make sense to think that in vitro or in silico models do not need them.
6. My criteria for a useful cancer model
(1) Characterization: what are the "identity" features of this model? They can include genomic alterations, gene expression, epigenetic markers for the biological systems (in vitro or in vivo), or parameters in the computational models.
(2) Representation: Which disease, disease subtype, or subset of disease features does the model match? How can the similarity of the model to the real system be measured?
(3) Relevant phenotypes: Does the model exhibit phenotypes that respond to the procedures for testing the hypothesis? For example, metastasis, therapeutic resistance, immune suppression, etc.
(4) Model validation: Does the model allow periodic testing to measure its deviation from the initial state?
(5) Study outcome validation: Can the outcomes of the study using this model be compared to the data from the real system represented by the model?
Feel free to use these criteria when you review a manuscript or research proposal!