Bias in Dermatological Datasets: A Critical Analysis of the Underrepresentation of Dark Skin Tones in Melanoma Classification Images

Zumpano, Ester; Vocaturo, Eugenio; Caroprese, Luciano
2025

Abstract

Cutaneous melanoma presents a profound healthcare challenge, particularly for individuals with darker skin tones, for whom late diagnosis significantly increases mortality rates. Despite remarkable advancements in artificial intelligence for medical diagnostics, current dermatological image classification systems suffer from a critical ethical and methodological limitation: severe underrepresentation of diverse skin tones in training datasets. This research uses MultiExCam, our novel multi-approach explainable architecture, to quantitatively demonstrate the systemic bias in melanoma detection across different skin tones. Our contributions are threefold: first, we comprehensively analyze major dermatological image repositories, documenting the severe underrepresentation of Fitzpatrick skin types V-VI across all datasets examined; second, we introduce Pipsqueak, a meticulously curated dataset of melanocytic lesions in darker skin tones, which demonstrates the profound scarcity of diverse representation in existing resources; and third, through empirical validation, we quantify the performance disparities that emerge when models trained predominantly on light-skin images are applied to darker skin tones, revealing accuracy drops that could translate to potentially fatal clinical consequences. This work provides crucial evidence for the urgent need to develop more inclusive diagnostic technologies that can effectively serve all populations, regardless of skin tone, and challenges the field to prioritize deliberate collection of diverse dermatological data.
Istituto di Nanotecnologia - NANOTEC - Sede Secondaria Rende (CS)
ISBN: 9783031975530; 9783031975547
Color Perception; Melanoma; Race and Ethnicity; Skin models; Skin cancer; Vitiligo;
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/554378