
BRACS: A Dataset for BReAst Carcinoma Subtyping in H&E Histology Images

N Brancati;D Riccio;G De Pietro;M Frucci
2022

Abstract

Breast cancer is the most commonly diagnosed cancer and registers the highest number of deaths for women. Advances in diagnostic activities combined with large-scale screening policies have significantly lowered the mortality rates for breast cancer patients. However, the manual inspection of tissue slides by pathologists is cumbersome, time-consuming, and subject to significant inter- and intra-observer variability. Recently, the advent of whole-slide scanning systems has empowered the rapid digitization of pathology slides, and enabled the development of Artificial Intelligence (AI)-assisted digital workflows. However, AI techniques, especially Deep Learning (DL), require a large amount of high-quality annotated data to learn from. Constructing such task-specific datasets poses several challenges, such as data-acquisition level constraints, time-consuming and expensive annotations, and anonymization of patient information. In this paper, we introduce the BReAst Carcinoma Subtyping (BRACS) dataset, a large cohort of annotated Hematoxylin & Eosin (H&E)-stained images to advance AI development in the automatic characterization of breast lesions. BRACS contains 547 Whole-Slide Images (WSIs), and 4539 Regions of Interest (RoIs) extracted from the WSIs. Each WSI and its respective RoIs are annotated by the consensus of three board-certified pathologists into different lesion categories. Specifically, BRACS includes three lesion types, i.e., benign, malignant and atypical, which are further subtyped into seven categories. It is, to the best of our knowledge, the largest annotated dataset for breast cancer subtyping at both the WSI and RoI levels. Further, by including the understudied atypical lesions, BRACS offers a unique opportunity for leveraging AI to better understand their characteristics. We encourage AI practitioners to develop and evaluate novel algorithms on the BRACS dataset to further breast cancer diagnosis and patient care.
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
Histology Images
breast cancer
classificat
image dataset


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/429182