Underrepresentation of Dark Skin Tone in Skin Lesion Datasets: The Role of the Explainable Techniques in Assessing the Bias

Zumpano, Ester; Vocaturo, Eugenio; Caroprese, Luciano
2025

Abstract

Advanced artificial intelligence models for skin lesion classification often suffer from performance disparities when applied to images of patients with darker skin tones, largely due to the underrepresentation of dark skin tone images in training datasets. In this study, we investigate this issue by evaluating a previously proposed explainable framework, MultiExCAM, trained on the widely used ISIC2018 dataset. We test its performance on Pipsqueak, a previously proposed dataset composed of skin lesion images from patients with dark skin tones. As expected, we observe a significant drop in classification performance when the model is applied to Pipsqueak. To better understand the source of these failures, we employ explainable artificial intelligence techniques to visualize and analyze the model's decision-making process on both datasets. Our results highlight clear differences in attention patterns and decision rationale, revealing how the lack of dark skin tone representation in the training data leads to poor generalization and biased behavior. This work emphasizes the critical role of explainable analysis in exposing and understanding model bias in clinical applications, and the necessity of inclusive datasets for fair and reliable skin lesion classification.
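
MultiExCAM itself is not reproduced in this record; purely as orientation, the sketch below shows a minimal Grad-CAM computation in PyTorch, the class-activation-map family on which such attention-pattern analyses are typically built. The ResNet-50 backbone, the choice of layer4 as target layer, and the grad_cam helper are illustrative assumptions, not the authors' implementation.

    # Minimal Grad-CAM sketch (PyTorch). Illustrative only: the backbone,
    # target layer, and helper names are assumptions, not the authors' code.
    import torch
    import torch.nn.functional as F
    from torchvision import models

    model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
    activations, gradients = {}, {}

    def fwd_hook(module, inputs, output):
        activations["feat"] = output.detach()

    def bwd_hook(module, grad_input, grad_output):
        gradients["feat"] = grad_output[0].detach()

    # Hook the last convolutional block, where features are still spatial.
    model.layer4.register_forward_hook(fwd_hook)
    model.layer4.register_full_backward_hook(bwd_hook)

    def grad_cam(image_batch, class_idx=None):
        """Return per-image heatmaps in [0, 1] showing where the model looked."""
        scores = model(image_batch)                   # (N, num_classes)
        if class_idx is None:
            class_idx = scores.argmax(dim=1)          # explain the predicted class
        class_idx = torch.as_tensor(class_idx, device=scores.device).view(-1, 1)
        model.zero_grad()
        scores.gather(1, class_idx).sum().backward()
        # Channel weights = global-average-pooled gradients; CAM = weighted sum.
        weights = gradients["feat"].mean(dim=(2, 3), keepdim=True)
        cam = F.relu((weights * activations["feat"]).sum(dim=1, keepdim=True))
        cam = F.interpolate(cam, size=image_batch.shape[2:], mode="bilinear",
                            align_corners=False)
        lo = cam.amin(dim=(2, 3), keepdim=True)
        hi = cam.amax(dim=(2, 3), keepdim=True)
        return (cam - lo) / (hi - lo + 1e-8)

Overlaying such heatmaps on lesion images from both datasets is the kind of side-by-side inspection the abstract describes for comparing attention patterns across skin tones.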
Istituto di Nanotecnologia - NANOTEC - Sede Secondaria Rende (CS)
ISBN: 9783032057266; 9783032057273
Keywords: Melanoma Classification; Dataset Bias; Explainable AI; Skin Tone Diversity
Files in this product:
ADBIS2025_Melanoma.pdf (Adobe PDF, 1.24 MB)
Access: authorized users only; a copy may be requested
License: Public domain
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/554380