A generalizable deep learning regression model for automated glaucoma screening from fundus images

G-RISK: Generalizable glaucoma detection for real-world impact
Published in Healthcare & Nursing

Detecting glaucoma, a leading cause of irreversible blindness, has been a significant challenge for classification models trained on fundus images. While many AI models demonstrate impressive performance on internal test sets, they often struggle to generalize when applied to external datasets. Varying glaucoma prevalence, differences in fundus camera technology, and discrepancies in the glaucoma ground truth all contribute to this performance drop. In our study, we addressed these limitations and confirmed the effectiveness of a regression network for glaucoma referral (G-RISK) in diverse and challenging settings.

Our deep learning approach shifted from binary labels (glaucoma present or absent) to a regression-based framework that uses cup-disc ratio values estimated by clinicians during ophthalmoscopy as the training target. This enables the model to learn and predict the severity of glaucoma-induced damage on a continuous scale. Because the output is a continuous severity score rather than a yes/no label, the model provides a more granular assessment of the extent of damage and the progression of the disease.
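To make the shift from binary labels to a continuous target concrete, here is a minimal sketch. It is not the G-RISK network: a one-feature least-squares fit stands in for the deep learning model, and the feature values, cup-disc ratios, and referral thresholds are all made up for illustration.

```python
# Hypothetical toy data: one made-up image feature per eye, paired with the
# clinician-estimated cup-disc ratio (CDR) that serves as the regression target.
features = [1.0, 2.0, 3.0, 4.0]
cdr_labels = [0.3, 0.4, 0.5, 0.6]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept.

    This is only a stand-in for a neural network trained with a regression loss.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Return a continuous risk score (a predicted CDR) for one feature value."""
    slope, intercept = model
    return slope * x + intercept


def refer(score, threshold):
    """Turn the continuous score into a referral decision at a chosen threshold."""
    return score >= threshold


model = fit_line(features, cdr_labels)
risk = predict(model, 5.0)
print(f"predicted CDR: {risk:.2f}")
print("refer at threshold 0.65:", refer(risk, 0.65))
print("refer at threshold 0.80:", refer(risk, 0.80))
```

The point of the sketch is the last step: a continuous score can be thresholded at different operating points after training, whereas a binary label fixes the decision boundary in the training data itself.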

To test generalizability, we pooled labeled fundus images from thirteen data sources. These included two large population cohorts, namely the Australian Blue Mountains Eye Study (BMES) and the German Gutenberg Health Study (GHS), as well as eleven publicly available datasets (AIROGS, ORIGA, REFUGE1, LAG, ODIR, REFUGE2, GAMMA, RIM-ONEr3, RIM-ONE DL, ACRIMA, PAPILA). To keep the input data consistent and to minimize data shift, we developed a standardized image processing strategy that generates 30° disc-centered images from the original data (see inserted illustration). In total, the model was tested on 149,455 images.
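The geometry behind a 30° disc-centered crop can be sketched as follows. This is an illustration under stated assumptions, not the study's actual pipeline: it assumes the optic disc center is already known, that the camera's field of view (in degrees) and its diameter in pixels are given, and that pixels map to degrees roughly uniformly across the image. The function name and parameters are hypothetical.

```python
TARGET_FOV_DEG = 30.0  # field of view of the standardized crop


def disc_centered_box(disc_xy, fov_diameter_px, fov_deg, target_deg=TARGET_FOV_DEG):
    """Return (left, top, right, bottom) of a square crop centered on the disc.

    Assumes a uniform pixels-per-degree scale, which is only an approximation
    for real fundus optics. The box may extend past the image border; a real
    pipeline would then need to pad the image before cropping.
    """
    if fov_deg < target_deg:
        raise ValueError("source field of view is smaller than the target crop")
    pixels_per_degree = fov_diameter_px / fov_deg
    side = round(pixels_per_degree * target_deg)
    half = side // 2
    cx, cy = disc_xy
    left, top = cx - half, cy - half
    return left, top, left + side, top + side


# Example: a 45° image whose circular field of view spans 1800 pixels,
# with the optic disc detected at pixel (900, 700).
box = disc_centered_box(disc_xy=(900, 700), fov_diameter_px=1800, fov_deg=45.0)
print(box)
```

After cropping, every image would be resized to the same resolution, so that images from cameras with different fields of view show the optic disc at a comparable scale.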

The results were highly promising and demonstrate the excellent generalizability of the glaucoma risk regression model. On the participant level, the areas under the receiver operating characteristic curve (AUC) for the BMES and GHS population cohorts were 0.976 [95% CI: 0.967-0.986] and 0.984 [95% CI: 0.980-0.991], respectively. At a fixed specificity of 95%, the sensitivities were 87.3% and 90.3%, surpassing the minimum criterion of 85% sensitivity recommended by Prevent Blindness America.

Moreover, the AUC values achieved on the eleven publicly available datasets ranged from 0.854 to 0.988, further highlighting the robustness and effectiveness of our model across different data sources. These results have significant implications for real-world glaucoma detection, as they confirm that a glaucoma risk regression model trained on homogeneous data from a single tertiary referral center can be successfully applied to a wide range of scenarios.

Implementing our model in real-world settings, including resource-limited ones, would have several potential benefits. Firstly, it enables early detection and referral of individuals at risk of glaucoma, allowing for timely interventions and potentially preventing vision loss. Secondly, it gives healthcare providers a reliable screening tool, facilitating targeted interventions and optimizing resource allocation. Finally, the model's compatibility with standard fundus image processing techniques eases integration into existing healthcare systems and reduces the need for significant infrastructure investments.

By surpassing the minimum criteria recommended by Prevent Blindness America, this model has the potential to improve patient outcomes and reduce the burden of glaucoma-related vision loss. However, further validation through prospective cohort studies is warranted to ensure the model's reliability and assess its performance in real-world settings, be it in the clinic or in the general population.

In conclusion, our study demonstrates the exceptional generalizability of the glaucoma risk regression model, which offers a promising solution for accurate glaucoma detection. By leveraging diverse datasets and addressing the challenges of data shifts, our research has the potential to make a tangible impact on real-life glaucoma diagnosis, benefiting both patients and healthcare providers.

  • npj Digital Medicine

    An online open-access journal dedicated to publishing research in all aspects of digital medicine, including the clinical application and implementation of digital and mobile technologies, virtual healthcare, and novel applications of artificial intelligence and informatics.
