Urban Sprawl Monitoring by VHR Images Using Active Contour Loss and Improved U-Net with Mix Transformer Encoders

Colosi F.; Malinverni E. S.
2025

Abstract

Monitoring urban expansion is crucial for sustainable urban planning and cultural heritage management. This paper proposes an approach for the semantic segmentation of very-high-resolution (VHR) satellite imagery to detect changes in urban sprawl around Chan Chan, a UNESCO World Heritage Site in Peru. The study explores the effectiveness of combining Mix Transformer (MiT) encoders with U-Net architectures to improve feature extraction and spatial context understanding in VHR satellite imagery. The integration of active contour loss functions further enhances the model's ability to delineate complex urban boundaries, addressing the challenges posed by the heterogeneous landscape surrounding the archaeological complex. The results demonstrate that the proposed approach achieves accurate semantic segmentation on images of the study area from different years. Quantitatively, the U-Net-scse model with an MiTB5 encoder outperformed SegFormer and FT-UNet-Former, with Intersection over Union (IoU) scores of 0.8288 on OpenEarthMap and 0.6743 on Chan Chan images. Qualitative analysis showed the model's effectiveness in segmenting buildings across diverse urban and rural environments in Peru. Using this approach to monitor urban expansion over time can help managers make informed decisions aimed at preserving cultural heritage and promoting sustainable urban development.
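The IoU scores quoted in the abstract measure the overlap between predicted and reference building masks. As a rough illustration only, the sketch below computes IoU for a pair of small binary masks in plain Python. The function name `binary_iou`, the toy masks, and the convention for two empty masks are assumptions made for this example; it is not the authors' evaluation code.

```python
def binary_iou(pred, target):
    """Intersection over Union for two binary masks (nested lists of 0/1).

    Hypothetical helper for illustration; both masks must have the same shape.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            # A pixel counts toward the intersection if both masks mark it.
            if p and t:
                intersection += 1
            # A pixel counts toward the union if either mask marks it.
            if p or t:
                union += 1
    # Assumed convention: two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Toy 2x3 masks: 1 = building pixel, 0 = background.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

iou = binary_iou(pred, target)  # 2 shared pixels / 4 pixels in the union
```

On real imagery the same ratio would typically be accumulated over all pixels of a test set, per class, rather than computed on a single small mask.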
Istituto di Scienze del Patrimonio Culturale - ISPC
building footprints
neural network
semantic segmentation
OpenEarthMap dataset
transformers
Chan Chan


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/547101