Behind the Paper

How can we measure the evolution of health research? A data-driven approach across funding systems

We developed a scalable method to classify health research across 26,000+ projects and their publications, combining expert knowledge and machine learning. This allows us to track how funding priorities translate into research outputs across systems and over time.

Published in Research Data, Public Health, and Statistics

Mar 28, 2026

David Fajardo Ortiz

Lecturer, Universidad Nacional Autónoma de México

How can we measure what health research actually is?

Debates on research funding often rely on broad categories such as “basic” or “applied” science. But these distinctions are rarely measured in a systematic and comparable way.

In our recent study,¹ we developed a methodological framework to address this challenge. By combining large-scale text analysis with supervised machine learning, we analyzed more than 26,000 funded projects and their associated scientific publications across European and U.S. funding systems.

A conceptual framework for classifying research

At the core of our approach is a classification system grounded in two dimensions:
(1) the unit of analysis of research, from molecular and cellular mechanisms to population and health systems, and
(2) the orientation of research, from basic to applied.

This framework allows us to distinguish five levels of health research, ranging from basic biomedical science to health policy and management. Importantly, this is not just a keyword-based classification, but a conceptually grounded system aligned with how health research is understood in practice.

From expert knowledge to machine learning

To scale this classification to tens of thousands of projects, we used a supervised machine learning approach.

We first constructed a training set based on expert annotation. These manually classified examples were then used to train a Naïve Bayes classifier, implemented in KH Coder.
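To give a sense of what this step involves, here is a minimal sketch of a multinomial Naïve Bayes text classifier in plain Python. It is not the KH Coder implementation used in the study. The example texts, the two labels ("basic" and "policy"), and all function and class names are illustrative assumptions, and a real training set would be far larger and cover all five levels.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def log_scores(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical expert-annotated examples (not taken from the study).
train_texts = [
    "protein signalling pathways in cultured cells",
    "gene expression in mouse cell models",
    "cost effectiveness of national screening programmes",
    "health system financing and hospital management",
]
train_labels = ["basic", "basic", "policy", "policy"]

model = NaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("hospital financing reform")
```

Because the model is only a table of word counts per class, it is easy to inspect which terms push a text toward one level or another.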

The model was iteratively refined and validated, achieving around 82% agreement with expert classifications for projects and up to 95% accuracy for publications.
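One simple way to express agreement with expert classifications is the share of items on which the model and the experts assign the same label. The sketch below assumes that definition, which this post does not spell out. The labels and values are toy data, not results from the study.

```python
from collections import Counter


def percent_agreement(expert_labels, model_labels):
    """Percentage of items where the model label equals the expert label."""
    if len(expert_labels) != len(model_labels):
        raise ValueError("label lists must have the same length")
    if not expert_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(e == m for e, m in zip(expert_labels, model_labels))
    return 100.0 * matches / len(expert_labels)


# Toy labels for five items (hypothetical).
expert = ["basic", "basic", "clinical", "policy", "policy"]
predicted = ["basic", "clinical", "clinical", "policy", "policy"]

agreement = percent_agreement(expert, predicted)

# Counting (expert, model) pairs shows where the disagreements fall.
confusion = Counter(zip(expert, predicted))
```

Looking at the pair counts alongside the overall percentage helps show whether errors cluster between neighbouring levels.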

This approach offers both scalability and interpretability, two key requirements for policy-relevant analysis.

Linking projects to publications

A central innovation of the study is the integration of funding data with scientific outputs.

We linked funded projects from CORDIS and NIH RePORTER to their resulting publications.

This required addressing important differences between systems. While European projects can often be directly linked to publications, NIH data required a time-window approach due to the cumulative nature of funding and publication processes.
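A time-window link can be sketched as follows. This post does not give the exact rule, so the version below is an assumption: a publication that acknowledges a grant is attributed to that project only if it appeared between the project's start year and a fixed number of years after its end. The field names and the `lag_years` value are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    project_id: str
    start_year: int
    end_year: int


@dataclass(frozen=True)
class Publication:
    pub_id: str
    project_id: str  # grant acknowledged by the publication
    year: int


def link_within_window(projects, publications, lag_years=2):
    """Return (project_id, pub_id) pairs whose year falls inside the window."""
    by_id = {p.project_id: p for p in projects}
    links = []
    for pub in publications:
        project = by_id.get(pub.project_id)
        if project is None:
            continue  # publication cites a project outside the dataset
        if project.start_year <= pub.year <= project.end_year + lag_years:
            links.append((project.project_id, pub.pub_id))
    return links


# Hypothetical records for illustration.
projects = [Project("P1", 2010, 2014)]
publications = [
    Publication("A", "P1", 2012),  # inside the funding period
    Publication("B", "P1", 2016),  # inside the lag window
    Publication("C", "P1", 2019),  # too late to attribute
    Publication("D", "P2", 2012),  # unknown project
]

links = link_within_window(projects, publications)
```

The width of the window is a judgment call: a longer lag captures late publications but risks attributing work funded by a later award.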

By combining both datasets, we were able to compare not only what funding agencies aim to support, but what research is actually produced.

A multi-layered analytical strategy

Our methodology combines three complementary components (figure):

(1) Keyword-based content analysis
(2) Supervised classification
(3) Comparative analysis across funding mechanisms and time periods
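As an illustration of the first and third components, the sketch below computes, for each time period, the share of documents that mention a given keyword. The keywords, period labels, and texts are hypothetical, and the study's actual content analysis may work differently.

```python
import re
from collections import Counter, defaultdict


def keyword_shares(documents, keywords):
    """documents: iterable of (period, text) pairs.

    Returns {period: {keyword: share of documents containing it}}.
    """
    doc_totals = Counter()
    hits = defaultdict(Counter)
    for period, text in documents:
        doc_totals[period] += 1
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        for keyword in keywords:
            if keyword in tokens:
                hits[period][keyword] += 1
    return {
        period: {kw: hits[period][kw] / total for kw in keywords}
        for period, total in doc_totals.items()
    }


# Hypothetical project abstracts grouped by funding period.
documents = [
    ("2008-2013", "molecular mechanisms of cardiac cells"),
    ("2008-2013", "patient outcomes in cardiac care"),
    ("2014-2020", "patient reported outcomes and health policy"),
    ("2014-2020", "patient pathways across health systems"),
]

shares = keyword_shares(documents, ["molecular", "patient"])
```

Comparing such shares across periods gives a quick check that can be set against the output of the supervised classifier.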

The convergence of these approaches increases robustness and allows us to detect consistent patterns across different types of data.

Why this matters

This methodological framework moves beyond descriptive analyses of funding trends. It provides a way to empirically assess how policy priorities are translated into research activity and outputs.

More broadly, it opens new possibilities for studying how research systems evolve—and how funding shapes the direction of science (figure).

Figure. Conceptual and analytical workflow for the classification of health research.
The figure illustrates the integration of funding data and scientific publications through a common text-mining and supervised classification framework. Projects and publications are classified into five levels of research, enabling comparison between funding priorities and research outputs.

Reference

1. David Fajardo-Ortiz, Bart Thijs, Wolfgang Glänzel, Karin R. Sipido; Evolution of public funding for collaborative health research towards higher-level patient-oriented research. Quantitative Science Studies 2026; doi: https://doi.org/10.1162/QSS.a.472

David Fajardo Ortiz (He/Him)

Lecturer, Universidad Nacional Autónoma de México

My research sits at the intersection of biomedical and health sciences, the science of science, and complex systems. I use scientometric and data science approaches to examine questions of equity in health research and technological development.

I obtained my PhD in Health Policy and Management from the Universidad Nacional Autónoma de México (UNAM) in 2016, where I was awarded the “Alfonso Caso” Medal for the top graduate of my cohort. I subsequently taught at the UNAM Faculty of Medicine, delivering courses in public health, health information systems, and complex epidemiological systems.

I was awarded a Humboldt Foundation postdoctoral fellowship to conduct research in Berlin, Germany, focusing on the structure, dynamics, and funding of research on CRISPR genome editing technologies. I later held a second postdoctoral position at KU Leuven (Belgium), where I investigated the evolution of health research funding in the European Union and the United States.

I am currently a lecturer at UNAM, where I continue to study the global organization of health research, with a particular focus on the Global South and the implications for science policy and equity.

My work has informed international policy discussions; according to the Overton Database, my research on CRISPR has been cited in policy documents by institutions including the European Parliament, the Food and Agriculture Organization of the United Nations, and the United Nations Conference on Trade and Development.
