Optimizing medical image segmentation using a priori knowledge in attention mechanism-enriched convolutional neural networks

Buongiorno, Rossana; Colantonio, Sara; Germanese, Danila; Ducange, Pietro

In recent years, medical image segmentation has shifted markedly, driven by the convergence of Deep Learning (DL) and medical imaging technologies. This convergence has led to significant progress and has fundamentally altered how medical image analysis is approached. DL methods, notably Convolutional Neural Networks (CNNs), have played a pivotal role in this transformation. They facilitate the automatic extraction of features from raw image data, achieving high levels of accuracy and sensitivity. However, despite these advances, persistent challenges such as computational demands, data quality and availability, interpretability, and model generalization hinder the broad adoption of DL models in clinical environments. Moreover, while CNNs can autonomously extract and analyze image features at a good level of detail, they often struggle to identify image regions whose complexity challenges even the human eye. To address these issues, attention and recurrence mechanisms have been introduced. The former enhance the network's ability to focus on relevant regions of the image while ignoring irrelevant background, whereas the latter model long-range dependencies between different areas of the image to obtain broader contextual information. The first part of this doctoral thesis examines attention and recurrence mechanisms to determine their efficacy in binary medical image segmentation. Specifically, the objective was to identify the mechanism that strikes the best balance between resource utilization, data availability, and segmentation accuracy for the given problem. The results of this analysis show that attention mechanisms improve segmentation accuracy by dynamically adjusting the weights assigned to different image regions, and that they optimize data requirements.
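The region-weighting behaviour of an attention mechanism described above can be illustrated with a minimal, single-channel sketch of an additive attention gate in the style commonly associated with Attention UNet. This is a toy under stated assumptions, not the implementation used in the thesis: the function name `attention_gate` and the scalar parameters `w_x`, `w_g`, `bias`, `psi`, and `psi_bias` are hypothetical stand-ins for learned 1x1 convolutions, and both inputs are assumed to already share the same spatial size.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def attention_gate(x, g, w_x=1.0, w_g=1.0, bias=0.0, psi=1.0, psi_bias=0.0):
    """Toy single-channel additive attention gate (all weights are assumptions).

    x: skip-connection feature map, a 2D list of floats.
    g: gating signal from the coarser decoder level, same shape as x.
    Returns (gated, alpha), where alpha holds per-pixel attention
    coefficients in (0, 1) and gated is x scaled element-wise by alpha.
    """
    alpha = [
        [
            # ReLU on the combined signal, then a sigmoid to get a weight.
            sigmoid(psi * max(0.0, w_x * xv + w_g * gv + bias) + psi_bias)
            for xv, gv in zip(x_row, g_row)
        ]
        for x_row, g_row in zip(x, g)
    ]
    gated = [
        [xv * a for xv, a in zip(x_row, a_row)]
        for x_row, a_row in zip(x, alpha)
    ]
    return gated, alpha
```

With a uniform feature map and a gating signal that is strong in only one pixel, the coefficient for that pixel is larger than for the others, so background responses are attenuated before they reach the decoder.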
However, effectively directing a CNN's attention remained challenging in scenarios that require a clear and precise differentiation between subtle variations crucial for accurate diagnoses. These challenges formed the basis for the second part of the thesis, which explores the integration of spatial priors into CNN architectures, specifically within a UNet-based framework enriched with the attention mechanism, namely the Attention UNet. More precisely, by incorporating prior knowledge about the spatial location of the objects to be segmented, the proposed approach aims to enhance CNN effectiveness in the segmentation task. A new framework, called SPI-net, was designed for this purpose. SPI-net features an Attention UNet as a backbone, an upstream block that obtains the spatial prior, and an additional novel branch with long skip connections that injects nuanced, context-aware information into the decoding pathway of the network. This improves the network's understanding of underlying structures and enhances segmentation accuracy. The experimental application and evaluation of SPI-net focused on the segmentation of COVID-19 infections, leveraging prior knowledge of the disease's spatial location to guide CNN attention. The results demonstrate the efficacy of SPI-net in accurately delineating disease patterns, outperforming traditional segmentation approaches. The comparative analysis highlights the limitations of conventional pre-processing operations and emphasizes the importance of integrating spatial priors into CNN architectures. Overall, this research contributes to the advancement of medical image segmentation by implicitly incorporating prior knowledge into CNNs, offering insights and empirical evidence to enhance segmentation accuracy and interpretability. The findings extend beyond COVID-19 segmentation, offering a promising framework for various medical imaging applications and contributing to the evolution of CNNs as reliable tools in healthcare diagnostics.
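The abstract does not specify how the spatial prior reaches the decoding pathway, so the following is only one hypothetical way such an injection could look, not SPI-net itself. It assumes the prior is a full-resolution map with values in [0, 1], that it is brought to the decoder's resolution by repeated 2x2 average pooling, and that it is appended to the decoder features as an extra channel. The names `avg_pool2x2` and `inject_prior` are illustrative.

```python
def avg_pool2x2(m):
    """Halve a 2D map (even height and width) by averaging 2x2 blocks."""
    return [
        [
            (m[i][j] + m[i][j + 1] + m[i + 1][j] + m[i + 1][j + 1]) / 4.0
            for j in range(0, len(m[0]), 2)
        ]
        for i in range(0, len(m), 2)
    ]


def inject_prior(decoder_features, prior, levels_down):
    """Append a downsampled spatial prior as an extra decoder channel.

    decoder_features: list of channels, each a 2D list of floats.
    prior: full-resolution 2D map in [0, 1] (assumed form of the prior).
    levels_down: number of 2x2 poolings needed to match the decoder level.
    Returns a new list of channels; the inputs are left unmodified.
    """
    p = prior
    for _ in range(levels_down):
        p = avg_pool2x2(p)
    ref = decoder_features[0]
    if len(p) != len(ref) or len(p[0]) != len(ref[0]):
        raise ValueError("prior and decoder features differ in spatial size")
    # Channel-wise concatenation: the prior travels alongside the features.
    return decoder_features + [p]
```

Concatenation is chosen here only because it leaves the existing feature channels untouched; element-wise weighting of the features by the prior would be an equally plausible assumption.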

Optimizing medical image segmentation using a priori knowledge in attention mechanism-enriched convolutional neural networks / Buongiorno, Rossana; Colantonio, Sara; Germanese, Danila; Ducange, Pietro. - ELECTRONIC. - (2024).