Building the optimal hybrid spatial Data-Driven Model: Balancing accuracy and complexity

Emanuele Barca (first author; Methodology);
Maria Clementina Caputo (second-to-last author; Data Curation);
Rita Masciale (last author; Supervision)
2025

Abstract

Mapping environmental variables is crucial for natural resource management. Researchers have continually advanced this field with modern techniques such as the Integrated Nested Laplace Approximation (INLA), Deep Learning (DL), and Graph Neural Networks (GNN). While effective, these models pose a significant challenge because of their black-box nature, which obscures how the final maps are generated from raw data. Recent theoretical breakthroughs have shown that white/grey-box models can achieve the same level of accuracy as these advanced techniques, debunking the belief that complex models are necessarily the most accurate. Based on these findings, we have developed a methodology that employs a series of statistical tests and data analytics to identify essential features hidden in spatial data and thereby determine the white/grey-box predictive model that best approximates the underlying spatial processes. The methodology profiles the model that best adapts to the data and aids in selecting the simplest model that achieves the desired accuracy, functioning much like a recommender system for model selection. Furthermore, the set of permissible models includes only regression-like ones, which makes the data's contribution to map construction clear, and the methodology can be applied to a wide range of datasets. By reducing complexity, this approach enhances the transparency of the model's results. An application to a real-world dataset demonstrates the methodology's ability to produce highly accurate results.
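The idea of selecting the simplest model that achieves the desired accuracy can be illustrated with a minimal sketch. This is not the authors' methodology: the statistical tests described in the abstract are not reproduced, and polynomial trend surfaces fitted by least squares are used here only as a stand-in for a family of regression-like candidates ordered by complexity. All names (`design_matrix`, `cv_rmse`, `recommend_degree`, `target_rmse`) and the synthetic data are hypothetical.

```python
import numpy as np

def design_matrix(xy, degree):
    """Polynomial trend-surface terms x^i * y^j with i + j <= degree."""
    x, y = xy[:, 0], xy[:, 1]
    cols = [x**i * y**j
            for i in range(degree + 1)
            for j in range(degree + 1 - i)]
    return np.column_stack(cols)

def cv_rmse(xy, z, degree, k=5, seed=0):
    """k-fold cross-validated RMSE of a least-squares trend surface."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(z))
    sq_err = []
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        beta, *_ = np.linalg.lstsq(
            design_matrix(xy[train], degree), z[train], rcond=None)
        pred = design_matrix(xy[test], degree) @ beta
        sq_err.append((z[test] - pred) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(sq_err))))

def recommend_degree(xy, z, target_rmse, max_degree=3):
    """Return (degree, rmse) of the simplest candidate meeting target_rmse.

    Candidates are tried from least to most complex; if none reaches the
    target, the candidate with the lowest cross-validated RMSE is returned.
    """
    scores = []
    for degree in range(max_degree + 1):
        rmse = cv_rmse(xy, z, degree)
        scores.append((degree, rmse))
        if rmse <= target_rmse:
            return degree, rmse
    return min(scores, key=lambda s: s[1])

# Synthetic example (assumed data): a linear spatial trend plus small noise.
rng = np.random.default_rng(42)
xy = rng.uniform(0, 1, size=(200, 2))
z = 2 + 3 * xy[:, 0] - xy[:, 1] + rng.normal(0, 0.1, 200)

degree, rmse = recommend_degree(xy, z, target_rmse=0.2)
print(degree, round(rmse, 3))
```

Trying candidates in order of increasing complexity and stopping at the first one that meets the accuracy target is one simple way to encode the preference for the simplest adequate model; a stricter procedure could also compare neighbouring candidates before settling.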
Istituto per le Tecnologie della Costruzione - ITC - Sede Secondaria Bari
Fundamental spatial assumption
Hybrid model
Recommender system
Spatial trend
Stationary stochastic model
Files in this item:
1-s2.0-S1569843225001256-main_compressed.pdf
Access: open access
Type: publisher's version (PDF)
License: Creative Commons
Size: 842.49 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/541642