AI in Healthcare: Standardized Reporting for Reproducibility, Validity, and Clinical Impact

A Guide to Medical AI Reporting Guidelines

Artificial intelligence (AI) holds transformative potential in healthcare by enabling enhanced disease diagnosis, risk prediction, and treatment optimization. Transparent and comprehensive reporting of AI studies is critical to ensure scientific reproducibility, clinical validity, and public trust. The possible risks of flawed or incomplete reporting include:

  • Overstated performance due to technical shortcomings like data leakage or unrepresentative datasets
  • Inability to independently validate findings or assess clinical utility
  • Challenges developing and updating best practices as the field evolves
  • Lack of insight into ethical risks, bias, or limitations

The landscape of medical AI reporting guidelines

Our systematic review published in Communications Medicine indicates that there is currently no universal, high-quality reporting standard for studies applying AI to medical data and use cases. Instead, 12 of the 26 reporting guidelines included in our review address specific medical fields (e.g., the CLEAR guideline for medical imaging research), and 20 of the 26 target preclinical or translational research rather than prospective clinical evaluations of AI-based models. Additionally, the rigor of the development processes ranged from comprehensive multi-stakeholder consensus approaches to expert-led efforts without clear stakeholder involvement.

Compared with the smaller number of clinical trial guidelines, guidelines for early-phase preclinical work more often had a narrow subspecialty focus and were developed without comprehensive consensus procedures. The lack of universal, high-quality guidelines observed in this review may contribute to findings that only a fraction of published AI studies in medicine fully adhere to reporting best practices.

Figure 1: The landscape of medical AI reporting guidelines. Comprehensive guidelines are based on a structured, consensus-based, methodical development approach involving multiple experts and relevant stakeholders with details on the exact protocol. Collaborative guidelines are (presumably) developed using a formal consensus procedure involving multiple experts, but provide no details on the exact protocol or methodological structure. Expert-led guidelines are not developed through a consensus-based procedure, do not involve relevant stakeholders, or do not clearly describe the development procedure.

Universal Recommendations Across Guidelines 

Despite this heterogeneity, several guideline items were recommended by at least 50% of all guidelines, or by at least 50% of those developed with rigorous consensus processes. These "universal" components include:

  • Clearly describing the clinical prediction problem and rationale
  • Specifying the data sources, types, and preprocessing steps
  • Detailing the type of predictive model and its training/validation
  • Reporting model performance metrics and interpretation
  • Discussing limitations, clinical implications, and avenues for real-world translation

These universal items could serve as a baseline for responsible reporting of predictive clinical AI studies in cases where no high-quality guidelines exist for a specific use case.
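As an illustration of how such a baseline could be operationalized, the universal items might be encoded as a simple machine-checkable checklist. This is a minimal sketch; the item names, the `missing_items` helper, and the structure are hypothetical and not taken from any published guideline.

```python
# Hypothetical sketch: the universal reporting items as a checklist that
# can flag which components a manuscript has not yet documented.
# Item names are illustrative, not from any published guideline.

UNIVERSAL_ITEMS = [
    "clinical_prediction_problem_and_rationale",
    "data_sources_types_preprocessing",
    "model_type_training_validation",
    "performance_metrics_and_interpretation",
    "limitations_and_clinical_implications",
]

def missing_items(reported: set[str]) -> list[str]:
    """Return the universal items a study has not yet reported."""
    return [item for item in UNIVERSAL_ITEMS if item not in reported]

# Example: a manuscript that so far documents only data and model details
reported = {"data_sources_types_preprocessing", "model_type_training_validation"}
print(missing_items(reported))
```

A tool like this could sit in a journal's submission pipeline as a lightweight pre-review check, though real adherence assessment would of course require human judgment of each item's content.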

Figure 2: Universal components of studies on predictive clinical AI models. Items recommended by at least 50% of all guidelines or 50% of guidelines with a specified systematic development process were considered universal components of studies on predictive clinical AI models.

Towards Robust, Adaptive Guidelines 

As the clinical AI ecosystem rapidly evolves, guidelines must remain dynamic to adequately regulate new data types, model architectures, and use cases. Looking ahead, a challenge in future guideline development will be to balance the need to continuously update guidelines with the resources and time needed to conduct consensus procedures involving key stakeholders. Ensuring high reporting standards through appropriately rigorous, living guidelines will be crucial for maintaining scientific integrity and translating AI's potential into clinically validated tools that improve patient outcomes. Researchers, journals, medical societies, and regulatory bodies all have a role to play in aligning on and enforcing such standards as AI applications progress towards real-world implementation.
