Uncertainty quantification in multi‐class segmentation: Comparison between Bayesian and non‐Bayesian approaches in a clinical perspective

Scalco, Elisa; Pozzi, Silvia; Rizzo, Giovanna; Lanzarone, Ettore

doi:10.1002/mp.17189

Background: Automatic segmentation techniques based on Convolutional Neural Networks (CNNs) are widely adopted to identify structures of interest in medical images, as they are not time consuming and not subject to high intra- and inter-operator variability. However, their adoption in clinical practice is slowed by several factors, including the difficulty of accurately quantifying their uncertainty.

Purpose: This work evaluates the uncertainty quantification provided by two Bayesian and two non-Bayesian approaches for a multi-class segmentation problem, and compares the risk propensity of these approaches, considering CT images of patients affected by renal cancer (RC).

Methods: Four uncertainty quantification approaches were implemented, all based on a benchmark CNN currently employed in medical image segmentation: two Bayesian CNNs with different regularizations, Dropout (BDR) and DropConnect (BDC); an ensemble method (Ens); and a test-time augmentation (TTA) method. They were compared in terms of segmentation accuracy, using the Dice score; uncertainty quantification, using the ratio of correct-certain pixels (RCC) and the ratio of incorrect-uncertain pixels (RIU); and agreement with respect to inter-observer variability in manual segmentation. They were trained on the dataset of the Kidney and Kidney Tumor Segmentation Challenge launched in 2021 (KiTS21), which provides multi-class segmentations of kidney, RC, and cyst for 300 CT volumes, and were tested on this and two other public renal CT datasets.

Results: Segmentation accuracy differed widely across the structures of interest for all approaches, with average Dice scores of 0.92, 0.58, and 0.21 for kidney, tumor, and cyst, respectively. In terms of uncertainty, TTA produced the highest uncertainty, followed by Ens and BDC, whereas BDR produced the lowest and was less effective than the other approaches at minimizing the number of incorrect-certain pixels. Large differences across the three structures were again seen in RCC and RIU. These metrics were associated with different risk propensities: BDR was the most risk-taking approach, providing higher accuracy in its predictions but not always assigning uncertainty to incorrectly segmented regions. The other three approaches were more conservative, producing large uncertainty regions, with the drawback of also raising alerts on correctly segmented areas. Finally, the analysis of inter-observer segmentation variability showed significant variation among the four approaches on the external dataset, with BDR reporting the lowest agreement (Dice = 0.82) and TTA the highest (Dice = 0.94).

Conclusions: Our results highlight the importance of quantifying segmentation uncertainty and show that decision-makers can choose the approach that best matches the risk propensity required by the application and their policy.
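To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of a per-class Dice score and of the RCC and RIU ratios. The abstract does not give the exact formulas, so the definitions used here are assumptions based on one common convention: RCC is the fraction of certain pixels that are correct, and RIU is the fraction of incorrect pixels that are flagged as uncertain. The function names, the label encoding (0 = background, 1 = kidney, 2 = tumor), and the toy data are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence, Tuple


def dice_score(pred: Sequence[int], truth: Sequence[int], label: int) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) for a single class label."""
    p = [x == label for x in pred]
    t = [x == label for x in truth]
    overlap = sum(a and b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    # Assumed convention: Dice is 1.0 when the class is absent from both masks.
    return 1.0 if total == 0 else 2.0 * overlap / total


def rcc_riu(
    pred: Sequence[int], truth: Sequence[int], uncertain: Sequence[bool]
) -> Tuple[float, float]:
    """Assumed definitions (not confirmed by the abstract):
    RCC = n_cc / (n_cc + n_ic), the share of certain pixels that are correct.
    RIU = n_iu / (n_iu + n_ic), the share of incorrect pixels flagged uncertain.
    """
    n_cc = n_ic = n_iu = 0
    for p, t, u in zip(pred, truth, uncertain):
        correct = p == t
        if correct and not u:
            n_cc += 1  # correct and certain
        elif not correct and not u:
            n_ic += 1  # incorrect and certain (the risky case)
        elif not correct and u:
            n_iu += 1  # incorrect and uncertain
    rcc = n_cc / (n_cc + n_ic) if (n_cc + n_ic) else math.nan
    riu = n_iu / (n_iu + n_ic) if (n_iu + n_ic) else math.nan
    return rcc, riu


# Toy flattened label maps; the uncertainty mask would come from thresholding
# a per-pixel uncertainty estimate, which is outside the scope of this sketch.
pred = [1, 1, 0, 2, 0, 1]
truth = [1, 0, 0, 2, 2, 1]
uncertain = [False, True, False, False, False, False]

kidney_dice = dice_score(pred, truth, label=1)
rcc, riu = rcc_riu(pred, truth, uncertain)
print(f"Dice(kidney)={kidney_dice:.2f}  RCC={rcc:.2f}  RIU={riu:.2f}")
```

Under these assumed definitions, a risk-taking method tends to leave more incorrect pixels marked as certain, which lowers both ratios, whereas a conservative method that marks large regions as uncertain raises RIU at the cost of flagging many correct pixels.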

	Year
	
				2024
			
	Organizational units
	
				Istituto di Tecnologie Biomediche - ITB
Istituto di Sistemi e Tecnologie Industriali Intelligenti per il Manifatturiero Avanzato - STIIMA (ex ITIA)
			
	Keywords
	
				Bayesian convolutional neural network
Monte Carlo dropout and dropconnect
multi‐class segmentation
risk propensity degree
uncertainty quantification
			
	Publication type
	
				01.01 Journal article

Files in this record:

File	Type	License	Size	Format
Scalco - 2024 - Medical Physics - Uncertainty quantification in multi‐class segmentation Comparison between Bayesian and.pdf (open access)	Publisher's version (PDF)	Creative Commons	1.96 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/485563

CNR Institutional Research Information System
