Oncology has no shortage of data.
Cancer registries, electronic health records, imaging repositories, genomic databases, wearable technologies, and patient-reported outcomes generate more information today than at any other point in healthcare history. Yet despite this abundance, one problem remains unresolved: much of this evidence still struggles to become truly scalable, interoperable, and decision-ready.
Over the past decade, artificial intelligence (AI) has rapidly entered healthcare discussions, often accompanied by promises of faster diagnostics, predictive analytics, and personalized medicine. However, while algorithms continue to evolve, the underlying infrastructure supporting real-world evidence (RWE) generation often remains fragmented, inconsistently governed, and difficult to integrate across systems and stakeholders.
This tension became one of the main motivations behind my recent paper published in Discover Health Systems, where I explored how oncology registries could evolve from static repositories into scalable, AI-ready learning health systems.
The paper was shaped not only by regulatory guidance and scientific literature, but also by more than three decades of observing how evidence is generated, interpreted, challenged, and operationalized across the pharmaceutical and healthcare ecosystem. During this journey, one recurring pattern became increasingly visible: the real bottleneck is no longer the absence of data. It is the inability to connect, govern, harmonize, and operationalize data in ways that generate trustworthy and clinically meaningful evidence.
The challenge is especially acute in oncology.
Different hospitals collect information differently. Registries vary in governance models, coding standards, completeness, and interoperability capabilities. Molecular information may exist separately from imaging repositories. Patient-reported outcomes are often disconnected from routine clinical data. Electronic health records remain optimized primarily for operational and billing workflows rather than research-grade evidence generation.
As a result, healthcare systems frequently operate with large volumes of information but limited evidence integration.
At the same time, oncology registries remain among the most valuable and credible sources of real-world data. Unlike fragmented administrative databases or isolated electronic health record extracts, disease-specific registries are purpose-built around clinically meaningful outcomes, longitudinal follow-up, and structured governance. They provide the depth required to support external comparator arms, post-marketing safety studies, natural history analyses, and, increasingly, pragmatic clinical trials.
This is why I believe registries will continue to play a central role in the future of evidence generation, but only if they evolve.
One of the central ideas discussed in the paper is that registries should no longer be viewed as passive databases collecting historical information. Instead, they should become active learning infrastructures capable of continuously integrating clinical, molecular, imaging, and patient-generated data into scalable evidence ecosystems.
Achieving this transformation requires more than technology alone.
It requires governance.
It requires interoperability.
It requires incentives aligned around collaboration rather than ownership.
And perhaps most importantly, it requires trust.
The roadmap proposed in the paper therefore takes a phased approach to this transformation.
The first phase emphasizes foundational activation: improving data quality, strengthening governance frameworks, harmonizing coding systems, and maximizing the value of existing infrastructure rather than continuing to create new, disconnected datasets.
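As a minimal sketch of what two of these foundational steps might look like in practice, the Python below measures field completeness and maps site-specific diagnosis codes to a shared code. It is illustrative only and not taken from the paper: the field names, the local codes such as LUNG-01, and the mapping table are hypothetical, and the target codes are ICD-10 categories chosen purely as an example.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]

# Hypothetical list of fields a registry record is expected to carry.
REQUIRED_FIELDS = ["patient_id", "diagnosis_code", "diagnosis_date", "stage"]

# Hypothetical mapping from local site codes to a shared code
# (ICD-10 categories are used here purely for illustration).
LOCAL_TO_SHARED = {"LUNG-01": "C34", "BREAST-02": "C50"}


def field_completeness(records: List[Record]) -> Dict[str, float]:
    """Return the share of records with a non-empty value per required field."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in REQUIRED_FIELDS}
    return {
        field: sum(1 for r in records if r.get(field) not in (None, "")) / total
        for field in REQUIRED_FIELDS
    }


def harmonize_code(record: Record) -> Record:
    """Return a copy of the record with a known local code replaced."""
    out = dict(record)
    code = out.get("diagnosis_code")
    if code in LOCAL_TO_SHARED:
        out["diagnosis_code"] = LOCAL_TO_SHARED[code]
    return out


records = [
    {"patient_id": "p1", "diagnosis_code": "LUNG-01",
     "diagnosis_date": "2021-03-01", "stage": "II"},
    {"patient_id": "p2", "diagnosis_code": "C50",
     "diagnosis_date": "", "stage": None},
]
completeness = field_completeness(records)
harmonized = [harmonize_code(r) for r in records]
```

A report like this makes gaps visible before any new dataset is created, which is the point of starting with what already exists.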
The second phase focuses on interoperability and hybrid platform development. This includes federated analytics, linkage between registries and imaging or molecular datasets, integration of patient-reported outcomes, and alignment with standards such as the OMOP Common Data Model and HL7 FHIR.
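To make the idea of federated analytics concrete, here is a minimal sketch, assuming the simplest possible case: each site computes a local count and sum, and only those aggregates leave the site. The names SiteSummary, summarize_locally, and pooled_mean are hypothetical, as is the minimum cell size of 5. Real federated platforms add authentication, auditing, and stronger privacy safeguards that are not shown here.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

# Assumed disclosure threshold: sites with fewer patients share nothing.
MIN_CELL_SIZE = 5


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate that leaves a site; no patient-level rows are included."""
    n: int
    total: float


def summarize_locally(values: Sequence[float]) -> Optional[SiteSummary]:
    """Run inside a site: reduce patient-level values to a count and a sum."""
    if len(values) < MIN_CELL_SIZE:
        return None  # too few patients to share safely
    return SiteSummary(n=len(values), total=float(sum(values)))


def pooled_mean(summaries: List[Optional[SiteSummary]]) -> Optional[float]:
    """Run at the coordinator: combine site aggregates into one mean."""
    usable = [s for s in summaries if s is not None]
    n = sum(s.n for s in usable)
    if n == 0:
        return None
    return sum(s.total for s in usable) / n


# Illustrative values only (for example, months of follow-up per patient).
site_a = summarize_locally([10, 12, 14, 16, 18])
site_b = summarize_locally([20, 22, 24, 26, 28, 30])
site_c = summarize_locally([5, 6])  # below threshold, contributes nothing
result = pooled_mean([site_a, site_b, site_c])
```

The design choice worth noting is that the coordinator never sees individual records, so the pooled estimate can be produced while the data stay under each registry's own governance.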
The final phase introduces AI-ready evidence generation, where registries evolve into dynamic platforms capable of supporting predictive analytics, natural language processing, imaging analysis, and near real-time evidence generation for clinical, regulatory, and policy decision-making.
However, an important concern emerged while I was developing the paper.
AI alone will not solve fragmented evidence ecosystems.
In fact, poorly governed data may simply produce poorly governed algorithms.
If bias, incompleteness, inconsistent coding, or fragmented governance remain unresolved at the infrastructure level, AI risks amplifying existing weaknesses rather than correcting them. This is why governance, traceability, auditability, and ethical stewardship must evolve alongside analytical sophistication.
Encouragingly, several initiatives across Europe already demonstrate what this future could look like. Federated models, patient-centric registries, interoperability frameworks, and collaborative oncology platforms are beginning to show how scalable, learning-oriented evidence systems can operate while respecting privacy and maintaining scientific credibility.
Still, the transformation ahead is not only technical.
It is cultural.
Healthcare systems must move from isolated data ownership toward collaborative evidence ecosystems. Stakeholders, including regulators, academia, industry, clinicians, and patient organizations, must align around shared principles for transparency, interoperability, and long-term sustainability.
Ultimately, the future of oncology evidence will not depend on who owns the largest datasets.
It will depend on who can build trustworthy, interoperable, ethically grounded learning health systems capable of transforming data into meaningful patient benefit.
Perhaps the most important realization while writing this paper was that real-world evidence is no longer simply a methodological discussion. It is becoming part of the broader infrastructure shaping how healthcare systems learn, adapt, and make decisions in the age of AI.
And that transformation has only just begun.