Integrating Clinical Guidelines and AI to Improve Respiratory Failure Management

Explore the Research

Enhancing predictive modeling for respiratory support with LLM-driven guideline adherence - Critical Care

Background: Optimal respiratory support selection between high-flow nasal cannula (HFNC) and noninvasive ventilation (NIV) for intensive care units (ICU) patients at risk of invasive mechanical ventilation (IMV) remains unclear, particularly in cases not represented in prior clinical trials. We previously developed RepFlow-CFR, a deep counterfactual model estimating individualized treatment effects (ITE) of HFNC versus NIV. However, interpretability and guideline alignment remain challenges for clinical adoption. This study describes the development and integration of a clinical guideline-driven LLM to enhance deep counterfactual model recommendations for NIV versus HFNC in patients at high-risk for invasive mechanical ventilation.

Methods: We enhanced RepFlow-CFR by incorporating a large language model (LLM, Claude 3.5 Sonnet) to enforce clinical guideline adherence and generate explainable treatment recommendations. The LLM was configured in a HIPAA-compliant AWS environment and prompted using structured patient data, clinical notes, and formal guideline criteria. Recommendations from RepFlow-CFR and LLM were compared to actual treatment decisions to assess concordance. We evaluated IMV and mortality/hospice rates across concordant and discordant groups. Additionally, we conducted a structured chart review of 20 cases to assess the clinical validity and safety of LLM-driven recommendations.

Results: Among 1,261 ICU encounters, treatments concordant with LLM-enhanced recommendations were associated with lower IMV rates. For the HFNC recommendation, IMV occurred in 46/188 (24.47%) when care was concordant versus 9/17 (52.94%) when discordant, corresponding to a 97.33% relative risk increase when discordant. Concordance was also associated with reduced mortality or hospice discharge (odds ratio 0.670, p = 0.046). In a 20-case chart review, 19/20 (95%) LLM recommendations aligned with clinical guidelines and physicians agreed with 13/20 (65%) final recommendations. Errors were noted in 11/20 cases, most rated low or moderate risk; 2/20 were judged as potentially causing severe harm.

Conclusions: Integrating LLMs for guideline enforcement improves the interpretability and clinical alignment of counterfactual models in respiratory support decision-making. This hybrid framework not only enhances concordance with real-world practice but may also improve patient outcomes. Future work will refine contraindication detection and expand validation to prospective clinical trials.

Why we did this study:

Acute respiratory failure is a common condition among hospitalized patients and a leading reason for ICU admission. As respiratory failure progresses, clinicians escalate through a continuum of respiratory support modalities. Randomized controlled trials have shown that, in specific populations, the choice of respiratory support modality, high-flow nasal cannula (HFNC) versus noninvasive ventilation (NIV), may reduce progression to intubation and invasive mechanical ventilation (IMV) [1]. However, prior trials used narrow inclusion criteria, limiting generalizability and failing to capture individualized treatment effects. In real-world practice, patients are often complex, and the choice between HFNC and NIV is often nuanced.

Our group previously developed an algorithm (RepFlow-CFR) that estimates patient-specific outcomes under HFNC and NIV, adjusting for confounders, to predict which respiratory support modality is likely to reduce an individual patient’s risk of progressing to IMV [2-4]. Although such tools could support clinical decision making, barriers to implementation still exist. One important barrier is clinician acceptance and trust. Model outputs often appear to function like a “black box”, which makes interpretation difficult. Because clinical practice guidelines continue to shape physician decision-making, any discordance between algorithmic recommendations and guideline-based standards of care hinders adoption.
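As a rough illustration of what an individualized treatment effect means in this setting, the sketch below takes two predicted IMV risks for one patient, one under each modality, and prefers the modality with the lower predicted risk. The class and function names, the `margin` threshold, and the example risks are hypothetical and chosen only for illustration; this is not the RepFlow-CFR implementation.

```python
from dataclasses import dataclass


@dataclass
class CounterfactualRisk:
    """Predicted probability of progressing to IMV under each modality.

    Values are hypothetical model outputs, not results from the study.
    """
    risk_hfnc: float
    risk_niv: float


def preferred_modality(pred: CounterfactualRisk, margin: float = 0.05) -> str:
    """Return the modality with the lower predicted IMV risk.

    `margin` is an assumed threshold: if the two risks differ by less
    than this amount, the sketch reports no clear preference.
    """
    # Individualized treatment effect as a risk difference (NIV minus HFNC).
    ite = pred.risk_niv - pred.risk_hfnc
    if abs(ite) < margin:
        return "no clear preference"
    # A positive difference means NIV carries the higher predicted risk.
    return "HFNC" if ite > 0 else "NIV"


# Hypothetical patient: 20% predicted IMV risk on HFNC, 40% on NIV.
print(preferred_modality(CounterfactualRisk(risk_hfnc=0.20, risk_niv=0.40)))
```

A real counterfactual model must also adjust for confounding, since sicker patients may be systematically assigned to one modality; the sketch only shows how two adjusted estimates would be compared.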

What we did:

To address this challenge, we integrated a large language model (LLM) into our decision pipeline as a guideline-aware, explainable reasoning layer.  Using clinician notes and clinical data, we designed the LLM to interpret and refine the RepFlow-CFR predictions through the lens of established, physician-accepted practice guidelines.

We provided the LLM with summarized indications and contraindications for HFNC and NIV based on the ERS and ATS guidelines [5,6], the RepFlow-CFR model outputs, and relevant patient data and clinical notes. The LLM then generated a guideline-aligned, explainable recommendation favoring HFNC, NIV, or neither.
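The inputs described above could be assembled into a single prompt along the following lines. This is a minimal sketch under stated assumptions: the function names, field names, prompt wording, and the "Recommendation:" reply format are all hypothetical, the guideline text is a placeholder rather than actual guideline content, and no specific LLM API is called.

```python
import json

ALLOWED = ("HFNC", "NIV", "neither")


def build_prompt(guideline_summary: str, model_output: dict,
                 patient_data: dict, notes: str) -> str:
    """Combine guideline criteria, model output, and patient context (hypothetical layout)."""
    return "\n\n".join([
        "Guideline indications and contraindications:\n" + guideline_summary,
        "Counterfactual model output:\n" + json.dumps(model_output, indent=2),
        "Structured patient data:\n" + json.dumps(patient_data, indent=2),
        "Clinical notes:\n" + notes,
        "Reply with a first line 'Recommendation: <HFNC, NIV, or neither>' "
        "followed by a guideline-based rationale.",
    ])


def parse_recommendation(reply: str) -> str:
    """Extract one of the three allowed answers from the model reply."""
    for line in reply.splitlines():
        if line.lower().startswith("recommendation:"):
            answer = line.split(":", 1)[1].strip().lower()
            for option in ALLOWED:
                if answer == option.lower():
                    return option
    raise ValueError("no valid recommendation found in reply")


prompt = build_prompt(
    guideline_summary="<summarized criteria go here>",    # placeholder text
    model_output={"risk_hfnc": 0.20, "risk_niv": 0.40},   # hypothetical values
    patient_data={"spo2": 91, "resp_rate": 28},           # hypothetical fields
    notes="<clinician note text>",                        # placeholder text
)
print(parse_recommendation("Recommendation: HFNC\nRationale: ..."))
```

Restricting the reply to three allowed answers and rejecting anything else keeps the downstream concordance analysis from silently counting malformed outputs.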

We then evaluated concordance between (1) the RepFlow-CFR recommendation and the patient’s actual treatment and (2) the LLM recommendation and the patient’s actual treatment, and assessed outcomes when recommendations were concordant versus discordant. As a secondary goal, a subset of the LLM recommendations was reviewed by three expert critical care physicians. Physicians assessed each LLM recommendation for accuracy, potential harm, clinical judgement, reasoning, and comprehension.
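The concordance comparison can be sketched as follows: each encounter is labeled concordant when the recommended modality matches the treatment actually given, and the IMV rate is then computed per group. The record fields and the four example encounters are hypothetical, and the sketch assumes encounters with a "neither" recommendation have already been handled separately; the study's actual analysis may differ.

```python
from collections import defaultdict


def imv_rate_by_concordance(encounters):
    """Return the IMV rate for concordant and discordant encounters.

    Each encounter is a dict with hypothetical keys:
    'recommended' and 'actual' (modality names) and 'imv' (bool).
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [IMV events, total]
    for enc in encounters:
        group = ("concordant" if enc["recommended"] == enc["actual"]
                 else "discordant")
        counts[group][0] += int(enc["imv"])
        counts[group][1] += 1
    return {group: events / total
            for group, (events, total) in counts.items()}


# Four made-up encounters, for illustration only.
encounters = [
    {"recommended": "HFNC", "actual": "HFNC", "imv": False},
    {"recommended": "HFNC", "actual": "HFNC", "imv": True},
    {"recommended": "HFNC", "actual": "NIV", "imv": True},
    {"recommended": "NIV", "actual": "NIV", "imv": False},
]
rates = imv_rate_by_concordance(encounters)
print(rates)
```

Because this comparison is observational, a difference in rates between the two groups shows association rather than proof that following the recommendation caused the better outcome.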

What we found:

Patients whose actual treatment aligned with the deep counterfactual model and the LLM recommendations had better outcomes, including lower rates of IMV. In contrast, when the treatment deviated from the model-recommended modality, the likelihood of progression to IMV was significantly higher.

Chart review showed that the LLM’s treatment recommendations were largely consistent with clinical guidelines. However, several cases revealed incorrect reasoning or data retrieval errors. Agreement among the reviewing physicians with the LLM’s final recommendation ranged from 65% to 85%.

Why this is important:

This study demonstrates an approach that pairs a deep counterfactual inference model with a guideline-aware LLM to generate individualized patient recommendations for HFNC vs NIV in patients with acute respiratory failure. This framework addresses a key barrier to clinical AI implementation: establishing physician trust by ensuring adherence to accepted standards of care through guideline integration. Incorporating a guideline-constrained, explainable layer enhances transparency, safety, and alignment with clinical practice. 

Ultimately, this approach advances the responsible integration of AI into healthcare by showing how deep learning models can be paired with guideline-based reasoning systems to support clinician decision-making in a safe and interpretable way.

Literature Cited

1. Frat, J. P. et al. High-flow oxygen through nasal cannula in acute hypoxemic respiratory failure. N Engl J Med 372, 2185-2196 (2015). https://doi.org/10.1056/NEJMoa1503326

2. Lam, J. Y. et al. Development, deployment, and continuous monitoring of a machine learning model to predict respiratory failure in critically ill patients. JAMIA Open 7, ooae141 (2024). https://doi.org/10.1093/jamiaopen/ooae141

3. Shashikumar, S. P. et al. Development and Prospective Validation of a Deep Learning Algorithm for Predicting Need for Mechanical Ventilation. Chest (2020). https://doi.org/10.1016/j.chest.2020.12.009

4. Lu, X. et al. Comparing High-Flow Nasal Cannula and Non-Invasive Ventilation in Critical Care: Insights from Deep Counterfactual Inference. Res Sq (2025). https://doi.org/10.21203/rs.3.rs-7230866/v1

5. Rochwerg, B. et al. Official ERS/ATS clinical practice guidelines: noninvasive ventilation for acute respiratory failure. Eur Respir J 50 (2017). https://doi.org/10.1183/13993003.02426-2016

6. Oczkowski, S. et al. ERS clinical practice guidelines: high-flow nasal cannula in acute respiratory failure. Eur Respir J 59 (2022). https://doi.org/10.1183/13993003.01574-2021
