CNR Institutional Research Information System

A benchmark dataset and workflow for landslide susceptibility zonation

Massimiliano Alvioli; Marco Loche; Liesbet Jacobs; Carlos H. Grohmann; Minu Treesa Abraham; Kunal Gupta; Neelima Satyam; Gianvito Scaringi; Txomin Bornaetxea; Mauro Rossi; Ivan Marchesini; Luigi Lombardo; Mateo Moreno; Stefan Steger; Corrado A. S. Camera; Greta Bajni; Guruh Samodra; Erwin Eko Wahyudi; Nanang Susyanto; Marko Sinčić; Sanja Bernat Gazibara; Flavius Sirbu; Jewgenij Torizin; Nick Schüßler; Benjamin B. Mirus; Jacob B. Woodard; Héctor Aguilera; Jhonatan Rivera-Rivera

2024

Abstract

Landslide susceptibility describes the spatial likelihood of landslide occurrence in a specific geographical area and is a relevant tool for mitigating the impact of landslides worldwide. As such, it is the subject of countless scientific studies. Many methods exist for generating a susceptibility map, most of them falling under the definition of statistical or machine learning methods. These models try to solve a classification problem: given a collection of spatial variables, whose combinations are associated with landslide presence or absence, a model is trained, tested on its ability to reproduce the target outcome, and eventually applied to unseen data. Unlike many fields of science that use machine learning for specific tasks, landslide susceptibility has no reference data for assessing the performance of a given method. Here, we propose a benchmark dataset consisting of 7360 slope units encompassing an area in Central Italy. Using the dataset, we tried to answer two open questions in landslide research: (1) what effect does human variability have on the creation of susceptibility models; (2) how can we develop a reproducible workflow that allows meaningful model comparisons within the landslide susceptibility research community. With these questions in mind, we released a preliminary version of the dataset, along with a “call for collaboration” aimed at collecting different calculations based on the proposed data, leaving freedom of implementation to the respondents. Contributions differed in many respects, including classification methods, use of predictors, implementation of training/validation, and performance assessment. That feedback suggested refining the initial dataset and constraining the implementation workflow. This resulted in a final benchmark dataset and in landslide susceptibility maps obtained with many classification methods.
Values of the area under the receiver operating characteristic curve obtained with the final benchmark dataset were rather similar, as an effect of the constraints on training, cross-validation, and use of data. Brier scores, in contrast, show larger variability, which we ascribe to differences in the models' predictive abilities. Correlation plots show similarities between the results of different methods applied by the same group, which we ascribe to a residual implementation dependence.
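
As an illustration of how the two metrics named above can disagree, the following is a minimal sketch in plain Python, not the code used in the study. The labels and probabilities are hypothetical toy values, and the function names `roc_auc` and `brier_score` are chosen here for illustration. The area under the ROC curve depends only on how units are ranked, whereas the Brier score also depends on the probability values themselves, so two models with identical rankings can share the same AUC and still differ in Brier score.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive unit receives
    the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(labels, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Hypothetical toy data: 1 = landslide present in the unit, 0 = absent.
labels = [0, 0, 1, 0, 1, 1]

# Hypothetical model A: reasonably confident probabilities.
model_a = [0.10, 0.40, 0.35, 0.20, 0.80, 0.90]

# Hypothetical model B: same ranking as A (strictly increasing transform),
# but every probability is squeezed toward 0.5.
model_b = [0.45 + 0.1 * p for p in model_a]

auc_a, auc_b = roc_auc(labels, model_a), roc_auc(labels, model_b)
brier_a, brier_b = brier_score(labels, model_a), brier_score(labels, model_b)

print(f"AUC   A={auc_a:.3f}  B={auc_b:.3f}")
print(f"Brier A={brier_a:.3f}  B={brier_b:.3f}")
```

In this toy case both models obtain the same AUC, while model B has the larger (worse) Brier score because its probabilities are less informative. This is only one possible mechanism behind a spread in Brier scores among models with similar AUC values.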

Year: 2024

Organizational unit: Istituto di Ricerca per la Protezione Idrogeologica - IRPI

Keywords: Geomorphometry; Slope units; Spatial analysis; Machine learning; Statistical modeling; Benchmark dataset; Geomorphological mapping

Appears in types: 01.01 Journal article

Files in this record:

File	Access	Type	License	Size	Format
EarthSciRev-258_104927_2024_low.pdf	Open access	Publisher's version (PDF)	Creative Commons	2.21 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/512721
