Multimodality Image Registration With Modality Distillation
Vivone, Gemine;
2025
Abstract
Multimodal image registration aims to spatially align images from different modalities at the pixel level. However, the nonlinear relationship between the radiation intensities recorded by different imaging modalities makes high-accuracy multimodal registration a significant challenge. The presence of both global transformations (i.e., large-scale rigid affine transformations) and local distortions (i.e., small-scale nonrigid deformations) between paired images further complicates the registration process. This article addresses the challenge posed by modality differences through modality distillation: a teacher (i.e., a homomodal image registration model) is trained to guide the student (i.e., a multimodal image registration model). In addition, the method aligns large-scale rigid and small-scale nonrigid deformations simultaneously by predicting the deformation flow from both global and local features, thereby achieving high-precision registration. Furthermore, the method incorporates a deformation mask during training to mitigate the negative impact on model performance of the black edges that appear in the registration results. Experimental results demonstrate that the proposed method delivers state-of-the-art registration accuracy across various multimodal datasets, and ablation studies confirm the effectiveness of each component.
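The role of the deformation mask can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not specify how the global and local predictions are combined or how the mask is built, so the sketch assumes that a global affine flow and a local residual flow are simply added, and that the mask marks output pixels whose sampling location falls inside the source image. The function names `affine_flow`, `warp_with_mask` and `masked_l1` are hypothetical, and nearest-neighbour sampling is used only to keep the example short.

```python
import numpy as np


def affine_flow(theta, h, w):
    """Dense displacement field (dy, dx) induced by a 2x3 affine matrix.

    Assumption: theta maps output pixel coordinates (x, y) to the source
    coordinates to sample from (backward warping).
    """
    ys, xs = np.meshgrid(np.arange(h, dtype=float),
                         np.arange(w, dtype=float), indexing="ij")
    src_x = theta[0, 0] * xs + theta[0, 1] * ys + theta[0, 2]
    src_y = theta[1, 0] * xs + theta[1, 1] * ys + theta[1, 2]
    return np.stack([src_y - ys, src_x - xs], axis=0)


def warp_with_mask(image, flow):
    """Backward-warp a 2-D image with nearest-neighbour sampling.

    Returns the warped image and a boolean mask that is False wherever the
    sampling location lies outside the source (the "black edge" pixels).
    """
    h, w = image.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.rint(ys + flow[0]).astype(int)
    src_x = np.rint(xs + flow[1]).astype(int)
    mask = (src_y >= 0) & (src_y < h) & (src_x >= 0) & (src_x < w)
    warped = np.zeros_like(image)
    warped[mask] = image[src_y[mask], src_x[mask]]
    return warped, mask


def masked_l1(a, b, mask):
    """Mean absolute difference computed over valid (mask == True) pixels only."""
    denom = max(int(mask.sum()), 1)
    return float(np.abs(a - b)[mask].sum() / denom)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    moving = rng.random((8, 8))
    # Hypothetical global prediction: sample 2 pixels to the right.
    theta = np.array([[1.0, 0.0, 2.0],
                      [0.0, 1.0, 0.0]])
    global_flow = affine_flow(theta, 8, 8)
    local_flow = np.zeros((2, 8, 8))      # placeholder for a nonrigid residual
    flow = global_flow + local_flow       # assumed combination rule
    warped, mask = warp_with_mask(moving, flow)
    fixed = rng.random((8, 8))
    print("valid pixels:", int(mask.sum()))
    print("masked loss:", masked_l1(warped, fixed, mask))
    print("unmasked loss:", masked_l1(warped, fixed, np.ones_like(mask)))
```

In this sketch the pixels outside the mask are filled with zeros by the warp, so an unmasked loss would penalize the model for border content it cannot recover. Restricting the loss to the valid region removes that term.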


