Breast Cancer Malignancy Prediction through Explainable Models based on a Multimodal Signature of Features

Carmelo Militello (second author)
Salvatore Vitabile (last author)
2025

Abstract

Breast cancer classification from ultrasound imaging is challenging because of the inherent noise in ultrasound images. In the reporting process, the radiologist assesses the lesions in the images according to the Breast Imaging-Reporting and Data System (BI-RADS). This work investigates whether medical knowledge, represented by the BI-RADS information and augmented by pixel-based quantitative features, can improve breast cancer classification. Machine learning classifiers, including XGBoost, Random Forest, and Support Vector Machine, were trained with an intelligible multimodal signature composed of BI-RADS and radiomic features. Exploiting the intrinsic interpretability of the model input, the work aims to obtain an explainable predictive model using post-hoc explanation methods. A proprietary dataset of 237 B-mode ultrasound scans was acquired, and a total of 103 radiomic features were extracted. Before training the classifiers, a pipeline for selecting an informative and non-redundant signature was implemented. Training used 10-fold cross-validation repeated 20 times on 80% of the dataset, and the best model in terms of accuracy was selected for testing on the remaining 20%. Our results show that the medical knowledge represented by the BI-RADS information is enhanced by the use of radiomic features. XGBoost was the best model, with an AUROC of 0.977 ± 0.029 in the training phase and 0.956 in the test phase. In addition, the global explanation implemented with the SHAP method, which exploits the intelligibility of the radiomic features, allowed us to confirm some important model findings.
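The validation protocol described in the abstract (an 80%/20% hold-out, then 10-fold cross-validation repeated 20 times on the training portion) can be sketched as follows. This is a minimal illustration and not the authors' code. The class balance of the 237 scans, the use of stratification, the random seeds, and the function names `stratified_holdout` and `repeated_stratified_kfold` are all assumptions introduced here.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=0.2, seed=0):
    """Split sample indices into train/test, preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test.extend(idxs[:n_test])
        train.extend(idxs[n_test:])
    return sorted(train), sorted(test)


def repeated_stratified_kfold(indices, labels, n_splits=10, n_repeats=20, seed=0):
    """Yield (train_idx, val_idx) pairs: n_splits folds, reshuffled n_repeats times."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        by_class = defaultdict(list)
        for idx in indices:
            by_class[labels[idx]].append(idx)
        offset = 0
        for idxs in by_class.values():
            rng.shuffle(idxs)
            # Deal each class round-robin so every fold keeps the class mix.
            for pos, idx in enumerate(idxs):
                folds[(offset + pos) % n_splits].append(idx)
            offset += len(idxs)
        for k in range(n_splits):
            val = sorted(folds[k])
            train = sorted(i for j, f in enumerate(folds) if j != k for i in f)
            yield train, val


# Hypothetical labels: 237 scans as in the abstract; the 100/137 class
# balance is invented for illustration and is NOT reported in the source.
labels = [1] * 100 + [0] * 137
train_idx, test_idx = stratified_holdout(labels)
splits = list(repeated_stratified_kfold(train_idx, labels))
```

Each of the resulting train/validation pairs would be used to fit and score one candidate classifier, and the mean and standard deviation of the scores across pairs give figures of the form reported for the training phase. In practice, scikit-learn's `RepeatedStratifiedKFold` provides the same kind of splitting.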
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR - Sede Secondaria Palermo
ISBN: 9783031907135, 9783031907142
Keywords: Explainable model, Interpretable features, Breast cancer, Multimodal signature, Clinical features, Radiomic features
Files in this item:
Breast Cancer Malignancy Prediction.pdf (authorized users only)
Description: manuscript
Type: Publisher's version (PDF)
License: Not public - private/restricted access
Size: 606.75 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/543941