Behind the Paper

Toward trustworthy medical AI via leveraging foundation models

Published in Healthcare & Nursing, Biomedical Research, and General & Internal Medicine

May 10, 2024

Chanwoo Kim

Graduate Research Assistant, University of Washington

In recent years, medical artificial intelligence (AI) has advanced rapidly, demonstrating capabilities that once seemed out of reach^1-4. In dermatology in particular, AI models can now analyze photos of skin lesions, which are easily taken with smartphone cameras, and determine with high accuracy whether a lesion is melanoma^5-8. With appropriate validation through clinical trials, these models have the potential to help triage patients, alleviate physicians’ workloads, and expand access to care.

Yet the path toward widespread adoption of medical AI in clinical settings faces a significant obstacle: the opaque “black-box” nature of these models, which provide diagnoses without explaining the rationale behind them. For medical AI to be deployed safely in clinical settings, accurate predictions are not enough. It must reveal the ‘why’ and ‘how’ behind its decisions, ideally in terms that medical professionals can understand^9. Unfortunately, current explainable AI techniques, such as saliency maps, primarily identify which low-level features, such as input pixels or image regions, matter to the model’s prediction. They do not say what those regions mean clinically. We need a fundamentally different approach that converts image pixels into semantically meaningful, clinically relevant “concepts,” such as “darker pigmentation” and “asymmetric” for a melanoma-detecting AI model. However, achieving this level of transparency requires medical datasets richly annotated with these concepts, and such datasets are very hard to obtain^10.

Overview of the MONET framework. (a) We develop the MONET model using a vast amount of medical data. (b) Automatic concept annotation by MONET enables (c) data auditing, (d) model auditing, and (e) developing inherently interpretable models.

To address this challenge, we turned to recent advances in AI. “Foundation models” (i.e., AI models trained on vast datasets and thereby equipped with versatile abilities) have shown remarkable capabilities in automatically recognizing and annotating human-understandable concepts. To develop a foundation model for dermatology, we leveraged the collective knowledge of the medical community, as captured in publicly available medical literature and medical textbooks. The resulting model, MONET (Medical cONcept rETriever), can richly annotate medical images with semantically meaningful medical concepts. We showed that MONET’s automatic concept annotation can significantly improve the transparency of the medical AI development pipeline at every stage, whether auditing large-scale training data, scrutinizing models, or monitoring them after deployment, because errors and biases in the data and models can be explained in human-understandable terms.

For instance, we used MONET to audit the ISIC dataset, the largest collection of dermatology images, which includes over 70k dermoscopic images commonly used to train dermatology AI models. The audit revealed that data sources within ISIC differ in how concepts correlate with benign or malignant categories. This insight is crucial for understanding which factors affect the transferability of medical AI models across sites. Such data auditing at scale is usually not feasible because concept labels are lacking.
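To make the data-auditing idea concrete, here is a minimal sketch of how concept annotations could be used to compare data sources. It is an illustration under stated assumptions, not the method used in the paper: the function name `concept_malignancy_gap`, the 0.5 threshold, the site names, and the toy scores are all hypothetical. It assumes each image already has a concept score (for example, for “darker pigmentation”) produced by some automatic annotator.

```python
from collections import defaultdict
from statistics import mean


def concept_malignancy_gap(records, threshold=0.5):
    """Per data source, return P(malignant | concept present) minus
    P(malignant | concept absent).

    `records` is an iterable of (source, concept_score, is_malignant).
    The threshold that turns a score into "present" is a hypothetical choice.
    """
    # source -> {True: labels where concept is present, False: where absent}
    groups = defaultdict(lambda: {True: [], False: []})
    for source, score, malignant in records:
        groups[source][score >= threshold].append(1 if malignant else 0)

    gaps = {}
    for source, by_presence in groups.items():
        present, absent = by_presence[True], by_presence[False]
        # The gap is undefined if a source has no images on one side.
        gaps[source] = mean(present) - mean(absent) if present and absent else None
    return gaps


# Toy, made-up records for one concept across two hypothetical sites.
records = [
    ("site_a", 0.9, True), ("site_a", 0.8, True),
    ("site_a", 0.1, False), ("site_a", 0.2, False),
    ("site_b", 0.9, False), ("site_b", 0.7, True),
    ("site_b", 0.1, True), ("site_b", 0.3, False),
]

for source, gap in sorted(concept_malignancy_gap(records).items()):
    print(source, gap)
```

In this toy data the concept tracks malignancy closely at one site and not at all at the other. A difference of that kind between sources is what could make a model trained at one site transfer poorly to another.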

Our approach is unique in that we use foundation models to enhance the trustworthiness of traditional medical AI models rather than using them directly for diagnostic tasks. While traditional medical AI devices based on supervised learning are relatively well-established and regulated under FDA guidelines, medical AI devices based on foundation models still have considerable development ahead of them. Given this, we leverage the new capabilities of foundation models to enhance the utility of existing AI models. Our approach enables us to inspect these AI models through the lens of medically relevant concepts, thereby facilitating their trustworthy deployment in clinical settings. The approach is also not tied to dermatology: in principle, it is applicable across medical tasks. The success we've achieved in dermatology serves as a blueprint for potential applications in radiology, ophthalmology, and beyond. Our research marks a pivotal step towards a future where AI and medical professionals work together in a symbiotic relationship built on trust, driving healthcare innovation forward.

References

1. Singhal, K. et al. Large language models encode clinical knowledge. Nature 620, 172–180 (2023).
2. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature 619, 357–362 (2023).
3. Dorr, D. A., Adams, L. & Embí, P. Harnessing the Promise of Artificial Intelligence Responsibly. JAMA 329, 1347–1348 (2023).
4. Rajpurkar, P. & Lungren, M. P. The Current and Future State of AI Interpretation of Medical Images. N. Engl. J. Med. 388, 1981–1990 (2023).
5. Jones, O. T. et al. Artificial intelligence and machine learning algorithms for early detection of skin cancer in community and primary care settings: a systematic review. Lancet Digit. Health 4, e466–e476 (2022).
6. Esteva, A. et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 542, 115–118 (2017).
7. Vodrahalli, K. et al. TrueImage: A Machine Learning Algorithm to Improve the Quality of Telehealth Photos. in Biocomputing 2021 220–231 (World Scientific, 2020). doi:10.1142/9789811232701_0021.
8. Omiye, J. A., Gui, H., Daneshjou, R., Cai, Z. R. & Muralidharan, V. Principles, applications, and future of artificial intelligence in dermatology. Front. Med. 10, (2023).
9. DeGrave, A. J., Cai, Z. R., Janizek, J. D., Daneshjou, R. & Lee, S.-I. Auditing the inference processes of medical-image classifiers by leveraging generative AI and the expertise of physicians. Nat. Biomed. Eng. (2023) doi:10.1038/s41551-023-01160-9.
10. Daneshjou, R., Yuksekgonul, M., Cai, Z. R., Novoa, R. A. & Zou, J. SkinCon: A skin disease dataset densely annotated by domain experts for fine-grained debugging and analysis. in (2022).
