Skin Tone Analysis for Representation in Educational Materials (STAR-ED)

Introducing an open-source AI tool that automatically, scalably, and accurately reports the distribution of skin types in a digital file of a medical textbook to flag inequities in the training of healthcare professionals.

In medicine, the term “zebra” refers to a rare disease, because doctors are taught to assume common diagnoses, “horses”, rather than more exotic ones. As a result of this training, patients with a rare disease often go undiagnosed for long periods and suffer without proper treatment. But this West-centric analogy is completely backwards in Kenya, where some of us live. Zebras are common and horses are rare!


What does this have to do with our paper, “Skin Tone Analysis for Representation in Educational Materials (STAR-ED) Using Machine Learning”, appearing today in npj Digital Medicine? Everything, in fact.

Medical students and doctors around the world are taught about skin diseases using textbooks, presentation slides, and other educational materials that show manifestations of the diseases on light-colored skin. Not every skin disease appears the same on light skin as it does on dark skin; for example, basal cell carcinoma is pink and pearly on light skin while it can be pigmented and shiny on dark skin. Psoriasis presents as salmon-colored plaques on light skin, but can be brown or purple on dark skin. If medical education does not represent all skin tones, doctors trained using these materials are less likely to properly diagnose and treat a patient with dark skin who comes to them with a skin condition. In fact, with skin cancer, this often leads to diagnostic delays, meaning that patients are diagnosed at a more advanced and harder-to-treat stage of disease. Even though people with dark skin far outnumber people with light skin globally, the choices textbook authors make shape how physicians learn to recognize skin disease across diverse skin tones.

In our work, we address this source of inequity by developing and open-sourcing an artificial intelligence (AI) tool that automatically, scalably, and accurately reports the distribution of skin types in a digital file (such as a PDF) of a medical textbook. The tool operates in a fully end-to-end manner, with no manual effort involved, and can flag a low ratio of images portraying dark skin. We envision professional societies and licensing boards setting a minimum standard on that ratio for authors and publishers to adhere to, thereby helping improve health equity.
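To make the idea of flagging a low ratio concrete, here is a minimal sketch of the final reporting step only. It assumes each figure has already been assigned a skin-tone label by some upstream model. The label names ("light", "dark"), the function names, and the 0.25 threshold are hypothetical choices for illustration; they are not taken from STAR-ED or from any published standard.

```python
from collections import Counter


def skin_tone_distribution(labels):
    """Return the fraction of images carrying each skin-tone label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def flag_low_representation(labels, dark_label="dark", min_ratio=0.25):
    """Return (dark-skin ratio, flagged) for a list of per-image labels.

    min_ratio is a hypothetical minimum standard, not a published one.
    """
    ratio = skin_tone_distribution(labels).get(dark_label, 0.0)
    return ratio, ratio < min_ratio


# Hypothetical textbook: 9 images of light skin, 1 image of dark skin.
labels = ["light"] * 9 + ["dark"] * 1
ratio, flagged = flag_low_representation(labels)
print(f"dark-skin ratio: {ratio:.2f}, flagged: {flagged}")
```

In this sketch the threshold is a parameter, so a professional society or licensing board could set it to whatever minimum standard it adopts.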

Putting STAR-ED together required several different components and skill sets from the authorship team. The dermatologists on the team identified the problem and manually labeled a large number of figures from textbooks to serve as both a baseline and training data for the AI. The AI scientists enhanced a corpus conversion service to pick out the figures from the full document, and improved upon machine learning models they had previously created for finding non-diseased patches of skin in images and estimating their tone.

Block diagram of STAR-ED methodology.

This work is a great example of AI serving not to replace human doctors but to improve their training and make them more effective. The STAR-ED strategy can be applied to understand the representation of other demographic sub-populations in the academic materials of different fields within and outside medicine. Non-imagery information sources, such as body text, captions, and tables, can also be used for further representation analysis.
