AI-Driven Precision Oncology
Precision oncology promises a shift from "one-size-fits-all" treatments to therapies tailored to individual cancer patients. Traditionally, this approach is costly and time-consuming, involving extensive evaluation of drugs on patient cancer cells in the lab. AI has emerged as a powerful tool to accelerate the process, using vast datasets of past drug-cell interactions to predict optimal treatments from a patient's unique biological profile. This shift has been enabled by large databases such as the Genomics of Drug Sensitivity in Cancer (GDSC) and the Cancer Cell Line Encyclopedia (CCLE), and by initiatives like the NCI-DREAM Drug Sensitivity Prediction Challenge 🤖
From Lab to AI: Testing New Grounds with Organoids
The Department for BioMedical Research of the University of Bern collaborates with clinicians to grow organoids—3D structures that self-organize from patient cancer cells—and to test various drugs on them. In this study, carried out together with researchers from IBM Research, we asked whether AI's success in predicting drug response for cell lines could translate to these organoids.
To explore this idea, we used an existing dataset containing genomic profiles of pancreatic cancer organoids and their drug responses. Using both simple models and state-of-the-art AI, we initially saw promising results. In fact, these results were suspiciously good 🕵️. This made us doubt that the models were actually learning from the genomic data. To test this, we replaced the gene expression values in our models with zeros. Surprisingly, the prediction quality stayed almost the same or even improved.
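The ablation itself is simple to sketch. The snippet below illustrates the idea in Python with scikit-learn and synthetic stand-in data (the real inputs were organoid gene expression profiles and drug features, which are not reproduced here): train one model on the full feature matrix, then zero out the gene expression block and train again.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in data: in the real study these would be organoid gene expression
# profiles plus drug features, with measured drug responses as targets.
n_samples, n_expr, n_drug = 500, 200, 30
X_expr = rng.normal(size=(n_samples, n_expr))   # gene expression block
X_drug = rng.normal(size=(n_samples, n_drug))   # drug descriptor block
y = X_drug @ rng.normal(size=n_drug) + 0.1 * rng.normal(size=n_samples)

X = np.hstack([X_expr, X_drug])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Model trained on the full feature matrix.
full = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("with expression   R2:", r2_score(y_te, full.predict(X_te)))

# Ablation: zero out the gene expression block and retrain. If performance
# barely changes, the model is not actually using the genomic features.
X_tr0, X_te0 = X_tr.copy(), X_te.copy()
X_tr0[:, :n_expr] = 0.0
X_te0[:, :n_expr] = 0.0
ablated = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr0, y_tr)
print("expression zeroed R2:", r2_score(y_te, ablated.predict(X_te0)))
```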
We tested this anomaly further with the GDSC data in a setting similar to the NCI-DREAM challenge. The results? Changing the genomic data had minimal impact. This discovery was alarming: the models were not learning from the genomic data at all; they were merely echoing the average drug responses from the training set. The more experiments we ran, the stronger our suspicion grew: maybe only the drug, and not the cell line, governed the drug response.
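For reference, the baseline that our models were effectively reproducing can be written in a few lines. This is a sketch, not our exact evaluation code; the column names (`drug_id`, `ic50`) are illustrative, not the actual GDSC schema.

```python
import pandas as pd

def drug_mean_baseline(train: pd.DataFrame, test: pd.DataFrame) -> pd.Series:
    """Ignore the cell line entirely: predict every test response as the
    training-set average response of the corresponding drug."""
    drug_means = train.groupby("drug_id")["ic50"].mean()
    global_mean = train["ic50"].mean()  # fallback for drugs unseen in training
    return test["drug_id"].map(drug_means).fillna(global_mean)
```

If a sophisticated model cannot beat this baseline, it is not exploiting the genomic features. But what do we mean by drug response exactly? 🤔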
Questioning the Status Quo
The most popular measures of drug response are the half-maximal inhibitory concentration (IC50) and the area under the dose-response curve (AUC). Essentially, they measure how effective a drug is at stopping a particular process, such as the growth of cancer cells. We computed the correlation of drug responses for every pair of cell lines/organoids and averaged the values for each database. The correlations proved to be high, confirming that the variation in drug ranking across cell lines was low❗
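As an illustration, the average pairwise correlation can be computed roughly as follows; again, the column names are assumptions rather than the actual database schema.

```python
import numpy as np
import pandas as pd

def mean_pairwise_correlation(responses: pd.DataFrame) -> float:
    """Average Pearson correlation between the drug-response profiles of all
    pairs of cell lines; a high value means drug rankings barely change
    from one cell line to another."""
    matrix = responses.pivot_table(index="cell_line", columns="drug_id", values="ic50")
    corr = matrix.T.corr(method="pearson").values  # cell line x cell line correlations
    mask = ~np.eye(len(corr), dtype=bool)          # drop the diagonal of 1s
    return float(np.nanmean(corr[mask]))
```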
We next analyzed the drugs identified as highly effective by IC50/AUC metrics and found that they were generally toxic. This dominance of toxic drugs under IC50/AUC has previously faced criticism. Moreover, the minimal variation in drug ranking rendered these datasets inadequate for crafting personalized predictions. Nevertheless, the vast majority of AI models for drug response prediction have been trained to predict IC50 😯
This raised critical questions: was the idea of personalized oncology models trained on these datasets merely an illusion? Or can we adopt another measure of drug response to make predictions meaningful? 🦸
Rethinking Measures
To minimize the influence of toxic drugs, we applied a z-score transformation to the IC50 and AUC values. This transformation shifted our focus from how many cancer cells a drug killed to how its effect on a particular cell line or organoid differed from its average effect. With this approach, cancer clinicians could potentially recommend treatments that are more effective for a specific patient while avoiding broadly toxic drugs.
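Concretely, the transformation standardizes each drug's responses across cell lines or organoids. A minimal sketch, assuming a long-format response table with illustrative column names:

```python
import pandas as pd

def zscore_per_drug(responses: pd.DataFrame, value_col: str = "ic50") -> pd.DataFrame:
    """For each drug, subtract its mean response across cell lines/organoids and
    divide by its standard deviation, so a broadly toxic drug no longer looks
    best everywhere and only deviations from its own average remain."""
    out = responses.copy()
    grouped = out.groupby("drug_id")[value_col]
    out[value_col + "_z"] = (out[value_col] - grouped.transform("mean")) / grouped.transform("std")
    return out
```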
When we analyzed the transformed data, the pairwise correlation of drug responses was approximately zero across all datasets. This indicated that the z-scored datasets could support personalized predictions. However, the success of our predictive models was mixed 🤨
Neural network models designed to predict drug responses for any drug, whether known or new, still struggled to learn from the genomic features and failed to make accurate predictions in the z-scored setting 👎. On the other hand, we successfully trained simple per-drug models that did make use of genomic features 👍
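The per-drug approach can be as simple as one regularized linear model per drug. Below is a sketch under the assumption that `expression` is a cell-line-by-gene table and `z_responses_by_drug` maps each drug to its z-scored responses indexed by cell line; these names and the choice of Ridge regression are illustrative.

```python
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

def fit_per_drug_models(expression, z_responses_by_drug, alpha=1.0):
    """Train one Ridge regression per drug on gene expression, reporting a
    cross-validated R2 for each. Inputs are assumed to be pandas objects
    sharing a cell-line index."""
    results = {}
    for drug_id, y in z_responses_by_drug.items():
        X = expression.loc[y.index]  # align cell lines with this drug's measurements
        scores = cross_val_score(Ridge(alpha=alpha), X, y, cv=5, scoring="r2")
        results[drug_id] = (scores.mean(), Ridge(alpha=alpha).fit(X, y))
    return results
```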
Looking Forward 👀
Our findings emphasize several crucial points:
- For nearly a decade, AI models have been trained and tested using drug response measures that do not support personalized predictions. To avoid similar issues in the future, it is crucial to evaluate simple baseline models, especially when this has not yet been done for the state-of-the-art methods.
- Z-scored drug response measures overcome the limitations of the current measures and could pave the way for personalized predictions.
- Personalized AI-based drug response prediction remains a challenging task. Pan-drug models that can handle unseen drugs still need to be improved before they can genuinely learn from genomic data. Meanwhile, simpler models that focus on individual drugs can be effective when enough training data is available for each drug.
This study calls for a fresh look at how we model drug responses in precision oncology. It highlights the necessity for innovations that genuinely exploit AI's potential.