Development and Validation of an AI-driven Mammographic Breast Density Classification Tool Based on Radiologist Consensus

Castiglioni I;
2022

Abstract

Mammographic breast density (BD) is commonly assessed visually using the four-category scale of the Breast Imaging Reporting and Data System (BI-RADS). To overcome the inter- and intraobserver variability of visual assessment, the authors retrospectively developed and externally validated a software tool for BD classification based on convolutional neural networks, using mammograms obtained between 2017 and 2020. The tool was trained using the majority BD category determined by seven board-certified radiologists who independently visually assessed 760 mediolateral oblique (MLO) images in 380 women (mean age, 57 years ± 6 [SD]) from center 1; this process mimicked training from a consensus of several human readers. External validation of the model was performed by the three radiologists whose BD assessment was closest to the majority (consensus) of the initial seven, on a dataset of 384 MLO images in 197 women (mean age, 56 years ± 13) obtained from center 2. The model achieved an accuracy of 89.3% in distinguishing BI-RADS a or b (nondense breasts) versus c or d (dense breasts) categories, with an agreement of 90.4% (178 of 197 mammograms) and a reliability of 0.807 (Cohen κ) compared with the mode of the three readers. This study demonstrates the accuracy and reliability of fully automated software for BD classification.
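The agreement statistics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the reader and model labels below are hypothetical toy data, the helper names are invented for illustration, and the tie-breaking rule (choose the denser category) is an assumption the abstract does not specify. The sketch takes the mode of several readers' BI-RADS categories, collapses categories into nondense (a or b) versus dense (c or d), and computes percent agreement and Cohen κ between a model and the reader consensus.

```python
from collections import Counter

def majority_category(labels):
    """Most frequent BI-RADS category among readers.
    Ties go to the denser category (an assumption, not from the study)."""
    counts = Counter(labels)
    top = max(counts.values())
    return max(cat for cat, n in counts.items() if n == top)

def is_dense(category):
    """BI-RADS c or d counts as dense; a or b as nondense."""
    return category in ("c", "d")

def percent_agreement(a, b):
    """Fraction of cases on which two raters give the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Cohen kappa: observed agreement corrected for chance agreement.
    Undefined when chance agreement equals 1."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    p_chance = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical toy data: three readers per mammogram, one model label each.
reader_labels = [
    ["a", "b", "b"], ["c", "c", "d"], ["b", "c", "c"],
    ["d", "d", "c"], ["a", "a", "b"], ["b", "b", "c"],
]
model_labels = ["b", "c", "b", "d", "a", "c"]

consensus = [majority_category(r) for r in reader_labels]
reference_dense = [is_dense(c) for c in consensus]
model_dense = [is_dense(c) for c in model_labels]

agreement = percent_agreement(reference_dense, model_dense)
kappa = cohen_kappa(reference_dense, model_dense)
print(f"agreement = {agreement:.3f}, kappa = {kappa:.3f}")

# The reported agreement follows directly from the counts in the abstract.
reported_agreement = round(178 / 197 * 100, 1)
```

Cohen κ is reported alongside raw agreement because raw agreement alone can look high when one category dominates; κ subtracts the agreement expected by chance from the marginal label frequencies.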
Istituto di Bioimmagini e Fisiologia Molecolare - IBFM
Breast; Convolutional Neural Network (CNN); Deep Learning Algorithms; Machine Learning Algorithms; Mammography.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/448813