Behind the Paper

From Simple Labels to Descriptive Sentences in Leukemia Image Classification

In many leukemia classification studies, each image is assigned to only one class such as AML, CML, ALL, or CLL. In this work, we used short descriptive sentences instead of simple labels to give more meaningful information to the AI model.

The idea behind this paper was simple. In traditional leukemia classification, each cell image usually receives a single class label, such as AML, CML, ALL, or CLL. I found this limiting, because a single label cannot describe everything a medical image shows.

So we tried a different approach. Instead of assigning only a class label, we paired each image with a short descriptive sentence.

For instance:

AML → “Large immature blood cells with abnormal structure.”

CML → “High number of abnormal white blood cells in blood cells.”

ALL → “Fast-growing immature lymphocyte cells in the blood.”

CLL → “Small abnormal lymphocyte cells with slow progression.”

We also used other similar sentences like these.
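The pairing above can be sketched in a few lines of Python. This is only an illustration, not the code used in the paper: the sentences are copied from the examples above, while the names `CLASS_DESCRIPTIONS` and `to_image_text_pairs` and the file paths are hypothetical.

```python
# Sentences copied from the examples in this post.
# The dictionary and function names are hypothetical, chosen for illustration.
CLASS_DESCRIPTIONS = {
    "AML": "Large immature blood cells with abnormal structure.",
    "CML": "High number of abnormal white blood cells in blood cells.",
    "ALL": "Fast-growing immature lymphocyte cells in the blood.",
    "CLL": "Small abnormal lymphocyte cells with slow progression.",
}


def to_image_text_pairs(samples):
    """Turn (image_path, class_label) pairs into (image_path, sentence) pairs."""
    return [(path, CLASS_DESCRIPTIONS[label]) for path, label in samples]


if __name__ == "__main__":
    # Hypothetical file names, used only to show the shape of the output.
    labeled = [("cell_001.png", "AML"), ("cell_002.png", "CLL")]
    for path, sentence in to_image_text_pairs(labeled):
        print(path, "->", sentence)
```

With this kind of mapping, the training data keeps the same images but the target becomes a sentence rather than an integer index.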

In this work, we used both text-to-image and image-to-text learning, so the model tried to learn the relation between medical images and written descriptions at the same time. We also used GAN-based methods to help balance the dataset, because some classes had fewer images than others.
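One common way to train in both directions at once is a symmetric contrastive objective, where each image should match its own sentence and each sentence should match its own image. The sketch below assumes that kind of objective. This post does not state the exact loss, so the function names, the temperature value of 0.07, and the toy embeddings are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cross_entropy_row(logits, target_index):
    """Softmax cross-entropy for one row, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target_index]


def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Average of image-to-text and text-to-image cross-entropy.

    Row k of image_embs is assumed to be paired with row k of text_embs.
    The temperature is an assumed value, not one taken from the paper.
    """
    n = len(image_embs)
    sim = [
        [cosine_similarity(img, txt) / temperature for txt in text_embs]
        for img in image_embs
    ]
    # Image-to-text: each image should pick its own sentence (rows of sim).
    image_to_text = sum(cross_entropy_row(sim[k], k) for k in range(n)) / n
    # Text-to-image: each sentence should pick its own image (columns of sim).
    text_to_image = sum(
        cross_entropy_row([sim[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return 0.5 * (image_to_text + text_to_image)


if __name__ == "__main__":
    # Toy 2-D embeddings, invented for the example.
    images = [[1.0, 0.0], [0.0, 1.0]]
    matched_texts = [[1.0, 0.0], [0.0, 1.0]]
    swapped_texts = [[0.0, 1.0], [1.0, 0.0]]
    print(symmetric_contrastive_loss(images, matched_texts))
    print(symmetric_contrastive_loss(images, swapped_texts))
```

One caveat with this sketch: when several images in a batch share the same class sentence, a plain contrastive loss treats those correct matches as negatives, so a real setup would need to handle repeated sentences in some way.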

One challenge in this project was creating meaningful text descriptions that were simple but still useful for the model. Another challenge was the limited and unbalanced medical dataset. Training vision-language models also needed careful tuning and many experiments.
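To make the imbalance problem concrete, the sketch below counts how many synthetic images each class would need to reach the size of the largest class. It only shows the bookkeeping and does not include a GAN. The function name and the class counts are invented for illustration and are not the dataset statistics from the paper.

```python
from collections import Counter


def synthetic_images_needed(labels):
    """Return how many extra images each class needs to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


if __name__ == "__main__":
    # Made-up class counts, used only to demonstrate the calculation.
    labels = ["AML"] * 5 + ["CML"] * 3 + ["ALL"] * 5 + ["CLL"] * 1
    print(synthetic_images_needed(labels))
```

In a GAN-based balancing setup, numbers like these would tell the generator how many images to produce for each minority class.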

We evaluated the model using the metrics discussed in the paper. The final results were in the 70% range, which is not perfect or extremely high accuracy. However, we believe the method is more practical and informative than traditional classification systems that only output simple labels like 0, 1, 2, or 3.

I think the interesting part of this project was trying to make AI understand medical images in a more human-like way using language. This project showed that combining text and image understanding can open new directions in medical AI research.

In the future, this idea can be improved using larger datasets, better text descriptions, and stronger multimodal models. It may also help doctors better understand AI decisions instead of only seeing simple class outputs.

Link to the paper: https://link.springer.com/article/10.1007/s44163-026-01239-7

Links to other research I contributed to:

1- Advanced deep learning framework for automated hematological malignancy classification: Integrating FCMAE V2-WPAT with ACDB-GAN for enhanced leukemia subtype detection

https://doi.org/10.1007/s11760-025-05096-2

2- Constructive learning: A high-performance framework for fetal head circumference estimation

https://doi.org/10.1007/s11227-026-08293-z

3- Multimodal vision-language framework for text-guided leukemia classification using advanced deep learning architectures

https://doi.org/10.1007/s44163-026-01239-7

4- Advancing WBC classification: A hybrid ConvNeXtV2–Swin Transformer framework with R3GAN data balancing and CLAHE preprocessing

https://doi.org/10.1007/s10278-025-01740-y