Artificial Intelligence for Direct-To-Physician Reporting of Ambulatory Electrocardiography - Behind the paper

Last week we had drinks to celebrate the end of the Deep Rhythm Artificial Intelligence for autonoMous Analysis of RhyThm INvestigatIon (DRAI MARTINI) study, which is out this week in Nature Medicine after around four years of work.

The process that led to this paper started with a need to analyse ECG data from another study quickly and reliably, and an idea that this could perhaps be done with artificial intelligence (AI). We thought it could, but we would need help to develop a model for it. And come to think of it, why couldn’t patient data be analysed that way too? 

The analysis of ambulatory ECGs is grueling work. The human heart beats 80,000-100,000 times per day. Scanning and annotating all of those beats to determine whether a patient has a heart rhythm disorder that requires treatment is a highly repetitive and sometimes difficult task, prone to errors. ECG technicians, the healthcare workers with special training to do this, are in short supply, and patients across the world experience lack of access and long waiting times before they get results from their examinations. One can easily understand that for someone who is worried about a rhythm disorder that could lead to fainting, stroke, or worse, this wait for results can be difficult. In some cases, patients may not receive treatment in time to prevent adverse outcomes.

If we could record and analyse longer ECGs in more patients – something that is hard to do and expensive – we might be able to prevent the consequences of rhythm disorders, in particular strokes due to atrial fibrillation, but ECG technician shortages and cost factors hinder this. Devices with the capability to transmit data continuously are relatively inexpensive. Using AI, the ECG results could be analysed and delivered in near-real time, potentially saving lives when life-threatening arrhythmias are identified. Overall, the case for replacing humans with AI for analysis of ambulatory ECG data is very strong.

Other groups have reported on AI models with accuracy similar to cardiologists for arrhythmia classification, but these models aren’t being used to report directly to physicians. Since cardiologists are generally more skilled than technicians at analysing ECGs, one could reasonably ask why not. Our view was that none of those papers had shown what would happen if you actually did report directly to physicians.

What a physician needs to know when deciding whether to trust a diagnostic AI model is pretty simple: “How likely is it that this report is wrong?” For an ambulatory ECG recording, both the technicians and the AI could make errors in two ways. They could miss a real arrhythmia – a false negative. This could be terrible, because these mistakes may never be discovered. The other way for a model to err is to flag something harmless – a false positive.

When we set out to design the algorithm, we decided how we thought a direct-to-physician model should behave. False negatives are by far the most important errors, and these should be minimal. But does it really matter if you find five episodes of a particular arrhythmia when there are actually seven? We didn’t think so, since the patient would receive an appropriate diagnosis and treatment regardless. Does it really matter if the model finds something but calls it something else that the physician would also need to see? Not really; the patient would be diagnosed in this case too. Too many false positives would be a nuisance and reduce the usability of the model, but the benefit of AI-only reporting is so large that we would tolerate some, because false positives are generally not dangerous – they can be corrected by the cardiologist who receives a report.

The R&D team at MEDICALgorithmics in Warsaw wrote the custom code and trained an AI model that performs this way – DeepRhythmAI. The model has a complex architecture, comprising both wide-context and signal detail components, and convolutional neural networks and transformer networks.

We also needed a different design from previous studies, one that could report model performance in absolute numbers corresponding to the probability of a false negative in a patient population, not as a proportion of a certain number of ECG strips.

First, this requires a defined population. We used ECG data that had been analysed according to current usual care, by 167 technicians in clinical practice, and ran a separate, independent analysis with the AI, resulting in >200,000 days of ECG from >14,000 unselected patients who had been referred for monitoring for any reason. Each beat was analysed by both the technicians and the AI.

Second, we needed to invent a design that enabled us to report false negatives for both the AI model and the technicians. To err is human, after all, and technician analyses may not be perfect. So we had to have a gold standard comparator, and for arrhythmia diagnostics this is a consensus panel consisting of three cardiologists. We knew that we needed to recruit a lot of cardiologists to do this, because we intended to ask them to annotate the data beat-to-beat, first independently, and then to agree on every single beat, for a large number of arrhythmia episodes, to make sure that variations in presentation were captured. This takes quite a bit of time, so we asked almost everyone we knew who we thought could do it well, and, to our delight, surprisingly many of them said yes. To facilitate this work we developed a tool that could be used to annotate the data, and tested it extensively, sometimes meeting late into the night.
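To make the two-stage panel process concrete, here is a minimal sketch of how beats with conflicting independent annotations might be flagged for panel discussion. The function name and the beat labels ("N", "S", "V") are hypothetical illustrations, not the study's actual annotation tool.

```python
def beats_needing_consensus(annotations):
    """Return indices of beats on which the annotators disagree.

    `annotations` holds one list of beat labels per annotator; all lists
    must cover the same beats in the same order. (Hypothetical data layout.)
    """
    n_beats = len(annotations[0])
    if any(len(labels) != n_beats for labels in annotations):
        raise ValueError("all annotators must label the same beats")
    # zip(*annotations) groups the labels beat by beat across annotators.
    return [
        i for i, labels in enumerate(zip(*annotations))
        if len(set(labels)) > 1
    ]


# Three cardiologists label the same four-beat strip independently.
panel = [
    ["N", "N", "V", "N"],
    ["N", "S", "V", "N"],
    ["N", "N", "V", "N"],
]
to_discuss = beats_needing_consensus(panel)
```

In this toy strip only the second beat (index 1) would go to the panel, since the other beats already have unanimous labels.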

In order to have an unbiased assessment of both the AI model and the technician work, we selected random ECG strips, and made sure that we had a sufficient number of arrhythmias of each type. To do this, we used custom-written code that searched through each recording and extracted a random event of each arrhythmia type, until a maximum of 500 episodes of each type had been found. These >5,000 strips were presented to the panels. The panel members annotated each ECG strip first on their own, and then resolved any conflict, including discrepancies in single-beat annotations, in the panel. These beat-to-beat annotations took them >30 hours of work each, or a total of >1,500 hours, a truly impressive effort on their part. Some minor complaints were heard, but most reported having fun, and really interesting discussions. When we compared the cardiologist panels to the AI model and the technician annotations, the results came out very heavily in favor of using DeepRhythmAI for patient safety – the rate of missed diagnoses was 2.3/1000 patients with AI and 39.4/1000 patients with technician analysis. The AI model was estimated to lead to a moderate increase in false positive findings, 2.6 times more, or one every six 14-day recordings compared to one every 14 recordings for technicians.
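The episode selection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's code: the data layout (each recording as a dictionary from arrhythmia type to a list of episode identifiers), the function name, and the choice to visit recordings in shuffled order are all hypothetical.

```python
import random


def sample_episodes(recordings, max_per_type=500, seed=0):
    """Pick one random episode per arrhythmia type per recording,
    stopping for a type once `max_per_type` episodes have been collected.

    `recordings` is a list of dicts: arrhythmia type -> list of episode ids.
    (Assumed layout for illustration only.)
    """
    rng = random.Random(seed)
    order = list(recordings)
    rng.shuffle(order)  # assumption: recordings are visited in random order
    selected = {}
    for recording in order:
        for arrhythmia_type, episodes in recording.items():
            bucket = selected.setdefault(arrhythmia_type, [])
            if episodes and len(bucket) < max_per_type:
                bucket.append(rng.choice(episodes))
    return selected


# Toy example with a cap of 2 episodes per type.
toy_recordings = [
    {"AF": ["af-1", "af-2"], "VT": ["vt-1"]},
    {"AF": ["af-3"]},
    {"AF": ["af-4"]},
]
toy_sample = sample_episodes(toy_recordings, max_per_type=2)
```

Taking at most one episode of each type per recording keeps any single patient from dominating the sample, and the cap ensures that common and rare arrhythmias are both represented.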

In our opinion, the methodology we designed for this study is robust for answering our question of what would happen if we replaced technicians with DeepRhythmAI. We would see huge reductions in missed diagnoses of critical arrhythmias – as many as 17 times fewer. We estimate that this would come at a modest cost of increased false positive findings for physicians to review and reject.
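The "17 times fewer" figure follows directly from the two missed-diagnosis rates reported above. A small sketch of that arithmetic, with a hypothetical helper name, using only the rates stated in this post:

```python
def compare_missed_rates(ai_per_1000, technician_per_1000):
    """Return (rate ratio, absolute reduction per 1000 patients)."""
    ratio = technician_per_1000 / ai_per_1000
    absolute_reduction = technician_per_1000 - ai_per_1000
    return ratio, absolute_reduction


# Missed diagnoses per 1000 patients, as reported in the post.
ratio, reduction = compare_missed_rates(ai_per_1000=2.3, technician_per_1000=39.4)
print(f"ratio: {ratio:.1f}x, absolute reduction: {reduction:.1f} per 1000 patients")
```

In absolute terms, this corresponds to roughly 37 fewer patients with a missed diagnosis per 1000 patients monitored.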

All in all, with DeepRhythmAI, there is a clear path to direct-to-physician reporting of ambulatory ECG results, which we hope will lead to lower costs, more monitoring, faster diagnoses and improved patient outcomes due to reduction of errors.

/Linda S Johnson, Alexandra Måneheim, Alexander Benz and Jeffrey S Healey
