Med-BERT: Pre-trained Embedding for Structured EHR

Behind the paper: Rasmy et al: Med-BERT: pre-trained contextualized embeddings on large-scale structured electronic health records for disease prediction

Published in Healthcare & Nursing


Frankly, structured electronic health records (EHRs) are not deep learning's favorite data modality. At least not yet.

One of the reasons is that, unlike images and natural language, where large training corpora are freely available, very large collections of EHRs are inaccessible to most. This hampers predictive modeling for individual hospitals, which can often access only their own samples, small by deep-learning standards.

Our work addresses this issue by adapting the popular NLP framework BERT to EHRs. BERT is a transformer model pre-trained as a masked autoencoder on a very large corpus. While pre-training is difficult and expensive, the pre-trained model can then be fine-tuned to deliver state-of-the-art results on a variety of natural language tasks, even with a smaller data set and relatively easy, cheap computation. However, the BERT pre-train-then-fine-tune methodology had not yet been convincingly demonstrated on structured EHR data.
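To make the masked-autoencoder idea concrete for EHRs, the sketch below prepares a masked-language-model training example from a patient's sequence of diagnosis codes, following the standard BERT 80/10/10 masking recipe. This is an illustrative sketch, not the paper's implementation; the function name, the example ICD codes, and the tiny vocabulary are all assumptions for demonstration.

```python
import random

MASK = "[MASK]"

def mask_visit_sequence(codes, mask_prob=0.15, vocab=None, rng=None):
    """BERT-style masked-LM input preparation for a code sequence.

    Each position is selected with probability mask_prob; a selected
    code is replaced by [MASK] 80% of the time, by a random code from
    the vocabulary 10% of the time, and kept unchanged 10% of the time.
    Returns (masked_codes, labels), where labels holds the original
    code at selected positions and None elsewhere.
    """
    rng = rng or random.Random()
    vocab = vocab or sorted(set(codes))
    masked, labels = [], []
    for code in codes:
        if rng.random() < mask_prob:
            labels.append(code)  # the model must predict this code
            r = rng.random()
            if r < 0.8:
                masked.append(MASK)
            elif r < 0.9:
                masked.append(rng.choice(vocab))  # random replacement
            else:
                masked.append(code)  # kept as-is but still predicted
        else:
            labels.append(None)
            masked.append(code)
    return masked, labels

# Example: one patient's visit history as ICD-10 codes (illustrative only)
visit = ["E11.9", "I10", "N18.3", "I50.9", "E78.5"]
masked, labels = mask_visit_sequence(visit, mask_prob=0.5,
                                     rng=random.Random(0))
```

During pre-training, the transformer sees `masked` and is trained to recover the codes recorded in `labels`, which is what lets it learn code co-occurrence structure from unlabeled records.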

In this work, we developed Med-BERT, a BERT model for structured EHR data. We pre-trained a 17-million-parameter transformer (small by NLP standards, but quite big for structured EHR) on a data set of 28 million patients. We showed that Med-BERT substantially improves prediction accuracy, boosting the area under the receiver operating characteristic curve (AUC) by 1.21-6.14% on the tasks we tested. In particular, pre-trained Med-BERT boosts performance most for small fine-tuning training sets, equivalent to enlarging the training set roughly tenfold.
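The headline metric here is AUC, which has a simple probabilistic reading: the chance that a randomly chosen positive case receives a higher risk score than a randomly chosen negative one. A minimal rank-based computation (a generic sketch, not code from the paper) makes that reading explicit:

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney rank statistic:
    the fraction of positive/negative pairs in which the positive
    case is scored higher, counting ties as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# A perfect ranking: every positive outscores every negative.
auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])  # -> 1.0
```

Under this reading, a 1-6 percentage-point AUC gain means the fine-tuned model correctly ranks a positive case above a negative one in 1-6 more pairs out of every 100.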

Predicting risk of heart failure in diabetes patients in a Cerner data set.

Overall, Med-BERT proves the concept that the BERT methodology is applicable to structured EHR data. Sharing pre-trained Med-BERT models will benefit disease-prediction studies with small local training sets, reduce data-collection expenses, and accelerate the pace of artificial-intelligence-aided healthcare. Our work is available at doi: 10.1038/s41746-021-00455-y.


