CycleMamba: Cycle-Consistent Learning for Aerial Visible-to-Infrared Image Translation
Vivone, Gemine
2026
Abstract
The rapid advancement of deep neural networks (DNNs) has driven substantial progress in image-to-image translation, yielding numerous sophisticated methods. However, most existing methods suffer not only from the inherent pixel-level spatial misalignment caused by divergent imaging perspectives, but also from local geometric distortion and structural incoherence stemming from inadequate cross-modal feature alignment. To address these issues, we propose CycleMamba, a cycle-consistent learning framework for aerial visible-to-infrared image translation that enforces geometric constraints and semantic-space alignment through globally aware bidirectional transformation, thereby alleviating pixel-level misalignment and structural distortion. Specifically, inspired by the selective structured state-space model (Mamba), we construct a bidirectional cross-modal translation network built on Multi-Granularity U-shaped Translators (MGUTs), which integrates Mamba's long-range modeling with the local feature extraction strengths of CNNs. To stabilize cycle-consistency learning, a dual-stage progressive training mechanism is developed for visible-infrared-visible translation. Additionally, to strengthen cross-modal feature alignment and structural preservation, cycle-consistency constraints are combined with structural-similarity and semantic-consistency losses to reduce spatial and semantic misalignment and improve fidelity. Comparative experiments against state-of-the-art methods are conducted on three public datasets, and the results demonstrate that CycleMamba achieves superior translation performance. Extensive ablation studies further verify the effectiveness of the proposed components. The code will be available at https://github.com/xzhichaox/CycleMamba.
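The combined objective outlined in the abstract (cycle-consistency constraints working together with a structural-similarity term) can be sketched as follows. This is a minimal NumPy illustration only, not the authors' implementation: the loss weights `lambda_cyc` and `lambda_ssim`, and the use of a single global (non-windowed) SSIM over images in [0, 1], are assumptions made for clarity; the paper's actual losses (including the semantic-consistency term, which requires a feature extractor) may differ.

```python
import numpy as np

def ssim_global(x, y, c1=0.01**2, c2=0.03**2):
    # Global (single-window) SSIM for images scaled to [0, 1].
    # A simplification of the usual windowed SSIM, used here for brevity.
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx**2 + my**2 + c1) * (vx + vy + c2))

def cycle_objective(vis, vis_rec, ir, ir_rec,
                    lambda_cyc=10.0, lambda_ssim=1.0):
    """Hypothetical combined loss: L1 cycle-consistency on both
    translation directions plus an SSIM-based structure term.
    vis_rec / ir_rec are the round-trip reconstructions
    (visible -> infrared -> visible, and vice versa)."""
    l_cyc = np.abs(vis - vis_rec).mean() + np.abs(ir - ir_rec).mean()
    l_ssim = (1 - ssim_global(vis, vis_rec)) + (1 - ssim_global(ir, ir_rec))
    return lambda_cyc * l_cyc + lambda_ssim * l_ssim
```

As a sanity check, perfect round-trip reconstructions drive both terms to zero, so the objective vanishes exactly when the bidirectional mapping is lossless; any spatial or structural deviation in the reconstructions raises the loss.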


