CNR Institutional Research Information System

Spatio-temporal 3D reconstruction from frame sequences and feature points

Federico G.; Carrara F.; Amato G.; Di Benedetto M.

2024

Abstract

Reconstructing a large real environment is a fundamental task for promoting eXtended Reality adoption in industrial and entertainment fields. However, the short range of depth cameras, the sparsity of LiDAR sensors, and the huge computational cost of Structure-from-Motion pipelines prevent scene replication in near real time. To overcome these limitations, we introduce a spatio-temporal diffusion neural architecture, a generative AI technique that fuses temporal information (i.e., a short, temporally ordered list of color photographs, such as sparse frames of a video stream) with an approximate spatial resemblance of the explored environment. Our aim is to modify an existing 3D diffusion neural model to produce a Signed Distance Field volume from which a 3D mesh representation can be extracted. Our results show that the hallucination approach of diffusion models is an effective methodology when fast reconstruction is a crucial target.

Year	2024
Organizational units	Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
ISBN	979-8-4007-1794-9
Keywords	Denoising diffusion probabilistic model, Signed distance field, 3D reconstruction, Video reconstruction, Deep Learning, Machine Learning, Artificial Intelligence
Appears in types	04.01 Contribution in conference proceedings

Files in this item:

File	Access	Description	Type	License	Size	Format
3672406.3672415.pdf	Open access	Spatio-Temporal 3D Reconstruction from Frame Sequences and Feature Points	Publisher's version (PDF)	Creative Commons	904.44 kB	Adobe PDF
Spatio_Temporal_3D_Reconstruction_from_Frame_Sequences_and_Feature_Points__TAPS_.pdf	Open access	This is the Author Accepted Manuscript (postprint) version of the following paper: Federico G. et al., “Spatio-Temporal 3D Reconstruction from Frame Sequences and Feature Points”, 2024, peer-reviewed and accepted for publication in “IMXw '24: Proceedings of the 2024 ACM International Conference on Interactive Media Experiences Workshops”, DOI: 10.1145/3672406.3672415.	Post-print document	Other license type	984.09 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/493141
