From Simple Labels to Descriptive Sentences in Leukemia Image Classification

In many leukemia classification studies, each image is assigned to only one class such as AML, CML, ALL, or CLL. In this work, we used short descriptive sentences instead of simple labels to give more meaningful information to the AI model.

The idea behind this paper is simple. In traditional leukemia classification, each cell image is assigned to a single class such as AML, CML, ALL, or CLL. I found this approach limiting, because a single label cannot describe everything a medical image shows.

So we tried a different approach: instead of assigning only a class label, we paired each image with a short descriptive sentence.

For instance:

AML → “Large immature blood cells with abnormal structure.”

CML → “High number of abnormal white blood cells in blood cells.”

ALL → “Fast-growing immature lymphocyte cells in the blood.”

CLL → “Small abnormal lymphocyte cells with slow progression.”

Other descriptions followed the same pattern.
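As a rough illustration of the labeling step, the sketch below turns class-labeled images into image-sentence pairs using the four example descriptions above. The function name `to_image_text_pairs` and the file names are hypothetical and are not taken from the paper's code.

```python
# Class-to-sentence mapping, using the example descriptions quoted in this post.
CLASS_DESCRIPTIONS = {
    "AML": "Large immature blood cells with abnormal structure.",
    "CML": "High number of abnormal white blood cells in blood cells.",
    "ALL": "Fast-growing immature lymphocyte cells in the blood.",
    "CLL": "Small abnormal lymphocyte cells with slow progression.",
}


def to_image_text_pairs(labeled_images):
    """Turn (image_path, class_label) records into (image_path, sentence) pairs."""
    pairs = []
    for image_path, label in labeled_images:
        if label not in CLASS_DESCRIPTIONS:
            raise KeyError(f"No description defined for class {label!r}")
        pairs.append((image_path, CLASS_DESCRIPTIONS[label]))
    return pairs


# Hypothetical file names, for illustration only.
samples = [("cell_001.png", "AML"), ("cell_002.png", "CLL")]
pairs = to_image_text_pairs(samples)
```

Keeping the mapping in one place makes it easy to revise a description later without touching the image labels.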

In this work, we used both text-to-image and image-to-text learning, so the model learned the relation between medical images and written descriptions in both directions at the same time. We also used GAN-based methods to balance the dataset, because some classes had fewer images than others.
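This post does not state which training objective was used. One common way to learn in both directions at once is a symmetric contrastive loss, and the sketch below is only an assumption along those lines, not the paper's implementation. The embeddings are plain lists of floats standing in for encoder outputs, and the name `symmetric_contrastive_loss` and the temperature of 0.07 are illustrative choices.

```python
import math


def _normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def _cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Average of the image-to-text and text-to-image losses for paired embeddings."""
    images = [_normalize(v) for v in image_embs]
    texts = [_normalize(v) for v in text_embs]
    # Cosine similarity between every image and every sentence, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    logits_transposed = [list(column) for column in zip(*logits)]
    image_to_text = _cross_entropy_diagonal(logits)
    text_to_image = _cross_entropy_diagonal(logits_transposed)
    return 0.5 * (image_to_text + text_to_image)


# Toy embeddings: matched pairs should give a lower loss than swapped pairs.
aligned = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

One caveat with this kind of loss: when several images in a batch share the same class sentence, the loss treats those matching sentences as negatives, so batches may need to be built with that in mind.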

One challenge in this project was creating meaningful text descriptions that were simple but still useful for the model. Another challenge was the limited and unbalanced medical dataset. Training vision-language models also needed careful tuning and many experiments.

The model was evaluated using the metrics discussed in the paper, and the final results were around 70%. This is not a very high accuracy. However, the method is more practical and informative than traditional classification systems, which only assign simple labels like 0, 1, 2, or 3.

I think the interesting part of this project was trying to make AI understand medical images in a more human-like way using language. This project showed that combining text and image understanding can open new directions in medical AI research.

In the future, this idea can be improved using larger datasets, better text descriptions, and stronger multimodal models. It may also help doctors better understand AI decisions instead of only seeing simple class outputs.

Link to the paper: https://link.springer.com/article/10.1007/s44163-026-01239-7

Other research I contributed to:

1- Advanced deep learning framework for automated hematological malignancy classification: Integrating FCMAE V2-WPAT with ACDB-GAN for enhanced leukemia subtype detection

https://doi.org/10.1007/s11760-025-05096-2

2- Constructive learning: A high-performance framework for fetal head circumference estimation

https://doi.org/10.1007/s11227-026-08293-z

3- Multimodal vision-language framework for text-guided leukemia classification using advanced deep learning architectures

https://doi.org/10.1007/s44163-026-01239-7

4- Advancing WBC classification: A hybrid ConvNeXtV2–Swin Transformer framework with R3GAN data balancing and CLAHE preprocessing

https://doi.org/10.1007/s10278-025-01740-y
