Cross-Modal Information Aggregation Network With Feature Enhancement for Pansharpening
Vivone, Gemine
2026
Abstract
Pansharpening is a vital process that aims to obtain high-resolution multispectral (HRMS) images by fusing panchromatic (PAN) and low-resolution multispectral (MS) images. With the advancement of deep learning (DL), data-driven pansharpening methods have been developed extensively, demonstrating superior performance compared to traditional approaches. However, most current DL-based studies still struggle to effectively preserve spectral properties and adequately capture spatial details, and they fail to comprehensively integrate complementary information across modalities, leading to suboptimal results. To address these challenges, we propose an innovative cross-modal information aggregation network (CMIAN) with feature enhancement (FE) for pansharpening. The CMIAN comprises three core components: an FE module that enhances the feature representation of both modalities through a simplify-and-enhance approach, a cross-modal feature aggregation module that aggregates intramodal features based on the characteristic differences between MS and PAN images, and a cross-modal information reconstruction module that adaptively balances large-scale features and local details of PAN images and reconstructs the final image to yield desirable pansharpening results. Experiments on the QuickBird, WorldView-2, and WorldView-3 datasets demonstrate the effectiveness and superiority of our proposed CMIAN. On the WorldView-3 dataset, for instance, CMIAN outperforms the second-best method by 5.95% in mean peak signal-to-noise ratio (PSNR) and 9.41% in spectral angle mapper (SAM).
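The three-stage design named in the abstract (FE, then cross-modal aggregation, then cross-modal reconstruction) can be sketched as a simple data-flow composition. The following is a hypothetical illustration only: the function names, the averaging used for "aggregation," and the alpha-weighted blend used for "reconstruction" are placeholder assumptions, not the authors' learned CNN modules.

```python
# Hypothetical data-flow sketch of the CMIAN pipeline from the abstract.
# Nested lists stand in for feature maps; each stage is a toy placeholder
# for a learned network module.

def feature_enhancement(feat):
    """Stand-in for FE: 'simplify' the features, then 're-enhance' them."""
    simplified = [[v * 0.5 for v in row] for row in feat]
    return [[v * 2.0 for v in row] for row in simplified]

def cross_modal_aggregation(ms_feat, pan_feat):
    """Stand-in for aggregation: merge per-pixel features of both modalities."""
    return [[(m + p) / 2 for m, p in zip(mr, pr)]
            for mr, pr in zip(ms_feat, pan_feat)]

def reconstruction(fused, pan_feat, alpha=0.5):
    """Stand-in for reconstruction: balance fused features against PAN detail."""
    return [[alpha * f + (1 - alpha) * p for f, p in zip(fr, pr)]
            for fr, pr in zip(fused, pan_feat)]

def cmian_forward(ms, pan):
    # FE on both modalities, aggregate across modalities, then reconstruct.
    ms_f = feature_enhancement(ms)
    pan_f = feature_enhancement(pan)
    fused = cross_modal_aggregation(ms_f, pan_f)
    return reconstruction(fused, pan_f)
```

In the real network each placeholder would be a stack of convolutional layers trained end to end; the sketch only shows how the three modules compose.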


