DERM12345: A Large, Multisource Dermatoscopic Skin Lesion Dataset with 40 Subclasses

We introduce a large-scale, richly annotated dermatoscopic image dataset of 12,345 images across 40 skin lesion subclasses, designed to enhance AI-based diagnostics, support clinical insights, and advance research in dermatology.

In this blog post, we share the story behind our recently published paper in Nature Scientific Data, titled “DERM12345: A Large, Multisource Dermatoscopic Skin Lesion Dataset with 40 Subclasses.” This post walks through our journey, from the initial idea to the final publication, highlighting the motivations, unexpected turns, and challenges we encountered while building one of the most detailed dermatoscopic datasets to date.

What led us to pursue this study?

In dermatology, diagnosing a skin lesion isn’t just about looking at a single image — it’s about context, experience, and a structured clinical decision-making process. Dermatologists often rely on a hierarchical approach: Is the lesion melanocytic or not? Benign or malignant? Which subtype could it be?

Higher-level classes, such as distinguishing between melanocytic and non-melanocytic lesions or benign versus malignant categories, play a critical role in AI-based skin lesion analysis. They mirror the stepwise decision-making process used by dermatologists and help AI models learn more efficiently by providing a structured learning path. This hierarchical approach not only improves classification accuracy but also supports more interpretable and clinically relevant predictions, making AI tools safer and more reliable in real-world medical settings.

As researchers working at the intersection of clinical dermatology and AI, we saw an opportunity to replicate this clinical reasoning in a machine-learning-compatible format. Existing public datasets were valuable, but they didn’t reflect the way decisions are made in practice. Many lacked subclass annotations or the nuanced distinctions that affect real-world diagnoses.

So, we created our own: a large-scale dataset annotated using a three-level taxonomy tree, carefully designed to support both human learning and AI training. Our goal was simple — to bridge the gap between the way clinicians think and the way algorithms learn.
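To make the idea of a three-level taxonomy concrete, here is a minimal sketch of how such a tree could be stored and queried. The label strings and the helper names (`TAXONOMY`, `path`, `children`) are illustrative assumptions; they are not the dataset's actual label names or file format.

```python
# Hypothetical subset of a three-level taxonomy:
# subclass -> (level 1: melanocytic or not, level 2: benign or malignant).
# These strings are illustrative, not the dataset's real labels.
TAXONOMY = {
    "blue nevus": ("melanocytic", "benign"),
    "melanoma": ("melanocytic", "malignant"),
    "seborrheic keratosis": ("nonmelanocytic", "benign"),
    "basal cell carcinoma": ("nonmelanocytic", "malignant"),
}


def path(subclass):
    """Return the full root-to-leaf path for a subclass."""
    level1, level2 = TAXONOMY[subclass]
    return (level1, level2, subclass)


def children(level1, level2):
    """List the subclasses that sit under a given pair of higher-level labels."""
    return sorted(
        name for name, parents in TAXONOMY.items() if parents == (level1, level2)
    )


print(path("melanoma"))
print(children("nonmelanocytic", "benign"))
```

Storing only the leaf-to-parent mapping keeps the structure easy to revise, which matters when the taxonomy changes during annotation.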

What surprised us along the way?

We began with a clear taxonomy structure, aiming to standardize annotation across the dataset. But as we dove into labeling, we quickly realized that theory and practice don’t always align. Some categories were too broad, others too specific — and certain distinctions didn’t translate well between clinical and AI perspectives.

This led to a dynamic, iterative process: updating the taxonomy during annotation, balancing clinical accuracy with computational utility. It reminded us how different — and yet complementary — the worlds of medicine and AI can be. What started as a rigid tree became a living structure, shaped by both domain knowledge and dataset realities.

What did we do?

This study was all about data: collecting it, structuring it, and making it meaningful.

We gathered 12,345 high-resolution dermatoscopic images from three dermatology centers in Türkiye, using a mix of professional dermatoscopy systems and mobile-based tools. These images were collected over more than a decade and include a diverse set of lesions across 40 subclasses.

After collection, we focused on preprocessing, standardization, and, most importantly, annotation. Two expert dermatologists reviewed the images, using our evolving taxonomy to label each case. All malignant cases were biopsy-confirmed, and benign cases were verified through follow-up records.

To make the dataset more useful for the community, we also included image embeddings from Google’s Derm Foundation Model — providing a ready-to-use feature representation for researchers. And we uploaded the full dataset to the ISIC archive, ensuring long-term open access and visibility.
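As one example of what precomputed embeddings make possible, the sketch below labels a query image by finding its most similar labeled embedding under cosine similarity. The three-dimensional vectors, the label strings, and the function names are toy assumptions for illustration; real embeddings are much longer, and how they are stored and loaded is not shown here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_label(query, labeled):
    """Return the label of the stored embedding most similar to the query."""
    best_vector, best_label = max(labeled, key=lambda item: cosine(query, item[0]))
    return best_label


# Toy stand-ins for (embedding, label) pairs; values are made up.
labeled = [
    ([1.0, 0.0, 0.0], "melanoma"),
    ([0.0, 1.0, 0.0], "blue nevus"),
]

print(nearest_label([0.9, 0.1, 0.0], labeled))
```

A nearest-neighbor lookup like this needs no training, so it is a quick way to check whether the embeddings separate the classes of interest before investing in a larger model.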

What are the broader implications?

DERM12345 isn’t just another image dataset — it’s a foundation for building smarter, more clinically aligned AI systems.

By offering a structured hierarchy of skin lesions, the dataset encourages the development of models that reflect the way dermatologists actually work. Hierarchical classification — where models learn not just what a lesion is, but how it relates to broader diagnostic categories — is a promising direction for making AI more interpretable and clinically reliable.
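One simple way to connect subclass predictions to the broader categories is to sum the predicted probabilities of all subclasses under each higher-level label. The sketch below assumes a flat classifier that outputs one probability per subclass; the `PARENTS` mapping, the label strings, and the probabilities are hypothetical and only illustrate the roll-up, not a method described in the paper.

```python
from collections import defaultdict

# Hypothetical mapping: subclass -> (melanocytic or not, benign or malignant).
PARENTS = {
    "blue nevus": ("melanocytic", "benign"),
    "melanoma": ("melanocytic", "malignant"),
    "seborrheic keratosis": ("nonmelanocytic", "benign"),
    "basal cell carcinoma": ("nonmelanocytic", "malignant"),
}


def roll_up(subclass_probs):
    """Aggregate subclass probabilities into the two higher-level views."""
    level1 = defaultdict(float)
    level2 = defaultdict(float)
    for subclass, prob in subclass_probs.items():
        lineage, behavior = PARENTS[subclass]
        level1[lineage] += prob
        level2[behavior] += prob
    return dict(level1), dict(level2)


# Made-up model output for a single image.
probs = {
    "blue nevus": 0.1,
    "melanoma": 0.6,
    "seborrheic keratosis": 0.2,
    "basal cell carcinoma": 0.1,
}
level1, level2 = roll_up(probs)
print(level1)
print(level2)
```

Even when the model is unsure between two subclasses, the rolled-up scores can still give a confident answer to the coarser questions, such as benign versus malignant.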

Another key contribution is the representation of the Turkish population, which has historically been underrepresented in open-access dermatological datasets. This makes DERM12345 a valuable tool for evaluating how well existing AI models generalize to this demographic. Researchers can use the dataset to test algorithm performance on data from Türkiye and potentially identify population-specific limitations, biases, or misclassifications — a crucial step toward more inclusive, fair, and globally applicable diagnostic tools.
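A first step in such an evaluation could be to compute per-class recall for an existing model on the dataset, which shows which lesion types it misses most often. The sketch below assumes you already have (true label, predicted label) pairs; the pairs and label strings shown are made up for illustration.

```python
from collections import Counter


def per_class_recall(pairs):
    """Return recall per true class from (true_label, predicted_label) pairs."""
    totals = Counter()
    correct = Counter()
    for true_label, predicted_label in pairs:
        totals[true_label] += 1
        if true_label == predicted_label:
            correct[true_label] += 1
    return {label: correct[label] / totals[label] for label in totals}


# Made-up predictions for illustration only.
pairs = [
    ("melanoma", "melanoma"),
    ("melanoma", "blue nevus"),
    ("blue nevus", "blue nevus"),
]
recall = per_class_recall(pairs)
print(recall)
```

Reporting recall per class rather than a single accuracy figure keeps rare subclasses from being hidden by common ones.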

Additionally, the inclusion of rare and visually similar lesion types provides a robust challenge for model development, while the dataset’s public availability through the ISIC archive supports reproducibility, benchmarking, and ongoing research in dermatology and medical AI.

We hope this work helps move the field toward AI tools that are not just accurate, but trustworthy, explainable, and clinically meaningful.
