COMPOSER-LLM: Giving AI the Power to Understand the Story Behind Sepsis and Save Lives
Why This Research Is a Step Forward
Sepsis is a formidable foe in modern medicine. It’s not a specific disease, but a life-threatening organ dysfunction caused by the body’s haywire response to an infection. Globally, it affects millions and is a leading cause of hospital deaths and staggering healthcare costs. We’ve known for a while that catching sepsis early and starting treatment fast can save lives and improve patient outcomes [1]. Many hospitals already use predictive AI models based on 'structured' data like lab results and vital signs. However, these systems often miss crucial clues hidden in 'unstructured' clinical notes – the detailed narratives written by doctors and nurses. We saw an opportunity here: what if we could teach an AI to understand the nuances in these notes to make sepsis prediction even more accurate? This became the driving question behind our work.
What We Did: Teaming Up AI Brains Amidst Real-World Hurdles
Our journey led us to create COMPOSER-LLM. The core idea was to combine our existing sepsis prediction model, COMPOSER [2], which itself has shown success in reducing sepsis mortality at UC San Diego Health [1], with the power of a Large Language Model (LLM) that excels at understanding human language. Think of it as giving our sepsis detective a new partner who can read between the lines.
The research was primarily computational. However, bringing such a system to life isn't as simple as plugging in an open-source AI framework. While these tools are more accessible than ever, there's often an 'illusion of simplicity'. Building and scaling clinical AI, especially for critical, time-sensitive conditions like sepsis, demands rigorous validation, a robust and secure infrastructure, and seamless integration into existing, often complex, clinical workflows [3-5]. Healthcare data itself is a challenge – it can be fragmented, inconsistent, and incomplete. Our LLM had to be adept at extracting meaningful information from this imperfect data.
A key part of COMPOSER-LLM is its ability to kick in when the initial prediction is a bit fuzzy – what we call 'high-uncertainty' cases. In these situations, the LLM analyzes clinical notes (nursing assessments, physician progress notes, radiology reports, and so on), looking for signs and symptoms related not only to sepsis but also to conditions that can mimic it, such as cardiogenic shock, cirrhosis, and GI hemorrhage. This built-in ability to perform differential diagnosis (DDx) significantly improved the model's ability to confirm or rule out sepsis.
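To make the gating idea concrete, here is a minimal Python sketch, not the actual COMPOSER-LLM implementation: the functions composer_score and llm_review_notes, the uncertainty thresholds, and the data shapes are hypothetical placeholders standing in for the real model and LLM call.

```python
# Minimal sketch of the high-uncertainty gating described above.
# All names and thresholds are illustrative, not the production system.

from dataclasses import dataclass

@dataclass
class Prediction:
    alert: bool
    source: str  # "composer" or "composer+llm"

def composer_score(vitals_and_labs: dict) -> float:
    """Placeholder for the structured-data sepsis risk score (0-1)."""
    return 0.5  # illustrative constant

def llm_review_notes(notes: list[str]) -> dict:
    """Placeholder for an LLM call that extracts sepsis signs/symptoms
    and evidence for mimicking conditions from clinical notes."""
    return {"sepsis_evidence": True, "mimic_evidence": False}

def composer_llm(vitals_and_labs: dict, notes: list[str],
                 low: float = 0.4, high: float = 0.6) -> Prediction:
    score = composer_score(vitals_and_labs)
    # Confident predictions bypass the LLM entirely (fast, cheap path).
    if score >= high:
        return Prediction(alert=True, source="composer")
    if score <= low:
        return Prediction(alert=False, source="composer")
    # High-uncertainty band: ask the LLM to read the unstructured notes
    # and weigh sepsis against its common mimics (the DDx step).
    findings = llm_review_notes(notes)
    alert = findings["sepsis_evidence"] and not findings["mimic_evidence"]
    return Prediction(alert=alert, source="composer+llm")

print(composer_llm({"lactate": 2.1}, ["Pt febrile, tachycardic ..."]))
```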
We put COMPOSER-LLM through its paces with both retrospective data and a prospective "silent mode" deployment, watching how it performed with real-time, sometimes incomplete, clinical notes – a crucial step, as prospective validation is key to ensuring models work in the real world, not just on historical data. This interdisciplinary effort brought together experts in biomedical informatics, emergency medicine, critical care, and more, all focused on refining this tool.
What We Found: A Smarter Sentry Against Sepsis
The results were genuinely exciting! COMPOSER-LLM significantly outperformed the standalone COMPOSER model. On a large retrospective dataset of patient encounters, it achieved a sensitivity of 72.1% and a positive predictive value (PPV) of 52.9%, with an overall F1 score of 61.0%. Crucially, it also reduced the number of false alarms compared with COMPOSER alone.
When we deployed COMPOSER-LLM across emergency departments (EDs) in a real-world setting, the model maintained strong predictive performance, with a sensitivity of ~70% and a PPV of ~58% (as opposed to the 10-15% PPV of commercially available models!). Furthermore, when clinicians reviewed cases where COMPOSER-LLM raised an alarm for a patient who didn’t ultimately meet the full Sepsis-3 criteria (a 'false positive' by strict definition), they found that about 62% of these patients had a suspected or confirmed bacterial infection at the time of the alert, raising the model's 'effective' PPV above 80% and reinforcing its clinical utility. Finally, in 83.1% of false positives, the patient's actual diagnosis appeared in the model's predicted differential list.
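For readers who want to check the arithmetic: the reported F1 score is the harmonic mean of sensitivity and PPV, and the 'effective' PPV follows from combining the deployment PPV with the chart-review finding. A quick back-of-the-envelope check using the rounded figures quoted above (so the results are approximate):

```python
# Back-of-the-envelope checks using the rounded figures quoted above.

# Retrospective performance: F1 is the harmonic mean of sensitivity and PPV.
sensitivity, ppv = 0.721, 0.529
f1 = 2 * sensitivity * ppv / (sensitivity + ppv)
print(f"F1 ≈ {f1:.3f}")  # ≈ 0.610, i.e. 61.0%

# Prospective deployment: if ~58% of alerts met Sepsis-3 criteria and ~62%
# of the remaining "false" alerts still had a suspected or confirmed
# bacterial infection, the clinically meaningful ("effective") PPV is roughly:
ppv_prospective, infected_among_fp = 0.58, 0.62
effective_ppv = ppv_prospective + (1 - ppv_prospective) * infected_among_fp
print(f"Effective PPV ≈ {effective_ppv:.2f}")  # ≈ 0.84, i.e. above 80%
```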
Broader Horizons: The Future of LLMs and the AI Adoption Maze
Our study shows that LLMs hold considerable promise for making clinical prediction tools smarter, especially by tapping into the rich, contextual goldmine of unstructured clinical notes. For conditions like sepsis, where timely intervention is paramount, even a small improvement in prediction accuracy and speed can make a big difference.
In the emergency department, AI systems like COMPOSER-LLM counter cognitive biases by integrating a DDx process into predictive modeling. This mitigates 'anchoring bias' by compelling the AI to evaluate a broader range of potential conditions, rather than fixating on an initial suspicion. It also curtails 'automation bias,' as the AI's internal DDx assessment ensures possibilities like sepsis aren't prematurely dismissed, even if an initial alert is uncertain. This evolution of AI from simple binary predictions to delivering more nuanced, DDx-informed alerts marks a breakthrough in healthcare analytics, fostering comprehensive, context-aware clinical decision support and enhancing patient safety.
The COMPOSER-LLM system represents a significant advancement in clinical AI. It demonstrates how Large Language Models can be powerfully augmented with analytical tools, such as differential diagnosis (DDx) calculators, and effectively grounded in real-world clinical data—for instance, by using Retrieval Augmented Generation (RAG) to deeply understand patient notes. This innovative approach, designed for seamless integration with Electronic Health Records (EHRs) via interoperability standards like FHIR, is key to fostering more insightful and context-aware clinical decision support, paving the way for more sophisticated AI assistance in healthcare. One of the key aspects of our design is that the computationally intensive LLM is only called upon in those tricky, high-uncertainty cases, making the system more efficient and cost-effective.
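To illustrate what FHIR-grounded retrieval for an LLM prompt can look like in practice, here is a hedged Python sketch; it is not our production pipeline. The FHIR server URL, the patient ID, and the call_llm helper are hypothetical, and a real deployment would add authentication, error handling, note chunking, and privacy safeguards.

```python
# Sketch: pull a patient's recent clinical notes over standard FHIR R4
# and assemble a retrieval-augmented prompt for an LLM (the DDx step).
# Endpoint, patient ID, and call_llm() are placeholders, not real services.

import base64
import requests

FHIR_BASE = "https://fhir.example-hospital.org/R4"   # hypothetical server

def fetch_recent_notes(patient_id: str, max_notes: int = 5) -> list[str]:
    """Retrieve the most recent clinical notes as plain text using the
    standard FHIR DocumentReference search."""
    resp = requests.get(
        f"{FHIR_BASE}/DocumentReference",
        params={"patient": patient_id, "_sort": "-date", "_count": max_notes},
        timeout=30,
    )
    resp.raise_for_status()
    notes = []
    for entry in resp.json().get("entry", []):
        for content in entry["resource"].get("content", []):
            data = content.get("attachment", {}).get("data")
            if data:  # inline base64-encoded note text
                notes.append(base64.b64decode(data).decode("utf-8", "ignore"))
    return notes

def build_prompt(notes: list[str]) -> str:
    """Assemble a retrieval-augmented prompt asking for sepsis evidence
    and the most likely alternative diagnoses."""
    context = "\n\n---\n\n".join(notes)
    return (
        "Using only the clinical notes below, list evidence for or against "
        "sepsis and name the most likely alternative diagnoses.\n\n" + context
    )

# prompt = build_prompt(fetch_recent_notes("12345"))
# answer = call_llm(prompt)   # call_llm is a placeholder for an LLM client
```

Grounding the prompt in the patient's own chart in this way is what keeps the LLM's reasoning tied to the actual clinical picture rather than to generic medical knowledge.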
However, the journey from a promising model to a widely adopted clinical tool is fraught with challenges that go far beyond the code. As hospitals and healthcare systems look to leverage AI, they face the classic tension between ‘build in-house’ and ‘buy from a vendor’. Building in-house, as we did with the foundational COMPOSER model (supported by over $10M in NIH grant funding over 5 years), offers immense customization and control [6]. But it requires substantial, sustained investment in expertise, infrastructure, and ongoing maintenance – resources that can be scarce. On the other hand, the allure of "free" or bundled AI tools within larger EHR packages can be misleading, sometimes hiding significant implementation costs in terms of staff time and effort, and may lack the robust validation or post-implementation monitoring needed to ensure they are truly effective and safe. The Epic Sepsis Model experience, for example, highlighted how a seemingly low-cost solution could underperform and consume valuable resources [7].
Ultimately, we believe that by carefully developing AI tools like COMPOSER-LLM, prospectively deploying and validating them in clinical practice, and generating high-quality evidence of their life-saving impact, we can navigate these challenges. This research is a step towards a future where AI works seamlessly alongside healthcare professionals, providing them with deeper insights and more time to focus on what matters most: their patients.
References:
[1] Boussina A, Shashikumar SP, Malhotra A, Owens RL, El-Kareh R, Longhurst CA, Quintero K, Donahue A, Chan TC, Nemati S, Wardi G. Impact of a deep learning sepsis prediction model on quality of care and survival. npj Digital Medicine. 2024 Jan 23;7(1):14.
[2] Shashikumar SP, Wardi G, Malhotra A, Nemati S. Artificial intelligence sepsis prediction algorithm learns to say “I don’t know”. npj Digital Medicine. 2021 Sep 9;4(1):134.
[3] Boussina A, Shashikumar S, Amrollahi F, Pour H, Hogarth M, Nemati S. Development & deployment of a real-time healthcare predictive analytics platform. In: 2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC); 2023 Jul 24. p. 1-4. IEEE.
[4] Wardi G, Owens R, Josef C, Malhotra A, Longhurst C, Nemati S. Bringing the promise of artificial intelligence to critical care: what the experience with sepsis analytics can teach us. Critical Care Medicine. 2023 Aug 1;51(8):985-91.
[5] Kwong JC, Nickel GC, Wang SC, Kvedar JC. Integrating artificial intelligence into healthcare systems: more than just the algorithm. npj Digital Medicine. 2024 Mar 1;7(1):52.
[6] Wardi G, Longhurst CA. Unreasonable effectiveness of training AI models locally. BMJ Quality & Safety. 2025 May 9.
[7] Wong A, Otles E, Donnelly JP, Krumm A, McCullough J, DeTroyer-Cooley O, Pestrue J, Phillips M, Konye J, Penoza C, Ghous M. External validation of a widely implemented proprietary sepsis prediction model in hospitalized patients. JAMA Internal Medicine. 2021 Aug 1;181(8):1065-70.