CNR Institutional Research Information System

Assessing the effectiveness of alternative landslide partitioning in machine learning methods for landslide prediction in the complex Himalayan terrain

Muhammad Tayyib Riaz;Muhammad Basharat;Maria Teresa Brunetti

2022

Abstract

Several devastating landslides have occurred in the NW Himalayas, prompting researchers to improve landslide susceptibility modelling (LSM) methodologies. This research analyzes the effectiveness of alternative landslide partitioning techniques on LSM in the landslide-prone district of Muzaffarabad, Pakistan. We developed an inventory of 961 landslides and divided it in the traditional way into training (672; 70%) and testing (289; 30%) samples. The training samples (672) were processed with the Average Nearest Neighbour Index (ANNI) method to estimate the spatial pattern of the landslides. The resulting ANNI ratio of 0.672 confirms that the landslide distribution pattern is clustered in the complex Himalayan terrain of Muzaffarabad. Among the 672 training landslides, the majority (529; 79%) depict cluster behaviour, while 189 landslides (21%) depict random behaviour. To evaluate the effectiveness of landslide cluster samples in prediction, five machine learning algorithms (MLAs), namely K-Nearest Neighbour (KNN), Naïve Bayes (NB), Random Forest (RF), Extreme Gradient Boosting (XGBoost) and Logistic Regression (LR), were run with the proposed cluster (529) and traditional random (672) training samples along with 17 geo-environmental factors. The testing samples (289; 30%) separated at the initial stage were kept the same for both approaches to check the models' effectiveness. The area under the receiver operating characteristic curve (AUC-ROC), sensitivity, specificity, Kappa index and accuracy (ACC) were used to evaluate the MLAs' performance. The alternative partitioning technique (cluster) shows the highest predictive power, with AUC-ROC values ranging from 0.86 to 0.96, Kappa index from 0.60 to 0.76 and ACC from 0.83 to 0.90. Conversely, the random partitioning approach performs less well, with AUC-ROC values ranging from 0.83 to 0.95, Kappa index from 0.49 to 0.70 and ACC from 0.80 to 0.87.
In comparison, the RF model based on cluster sampling outperforms the other models and their counterparts. With the proposed cluster training samples, the RF model achieved the highest accuracy (0.902), AUC (0.962) and Kappa index (0.755), followed by XGBoost with ACC (0.885), AUC (0.95) and Kappa index (0.724). In contrast, the traditional random training samples yield a comparatively low ACC for RF (0.868) and XGBoost (0.862). These results confirm that cluster training sampling performs well in obtaining reliable and precise LSMs for complex Himalayan terrain. Although random landslide partitioning for training datasets is seldom utilized in LSM, this study highlights that cluster partitioning for landslide training datasets might be a realistic and reliable approach.
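The abstract relies on the ANNI ratio to decide whether landslides are clustered. As a rough illustration only, the sketch below computes the standard average nearest neighbour ratio: the observed mean nearest-neighbour distance divided by the distance expected for a random pattern of the same density, 0.5 / sqrt(n / A). A ratio below 1 suggests clustering and a ratio above 1 suggests dispersion. The record does not say which software or study-area definition the authors used, so the function name, the sample coordinates and the area values here are assumptions, not the authors' workflow.

```python
import math
import random


def average_nearest_neighbour_index(points, area):
    """Observed mean nearest-neighbour distance divided by the mean
    distance expected for a random pattern with the same density.

    `points` is a list of (x, y) tuples in projected units and `area`
    is the study-area size in the same units squared (both are
    illustrative inputs, not values from the paper).
    """
    n = len(points)
    if n < 2 or area <= 0:
        raise ValueError("need at least two points and a positive area")

    total = 0.0
    for i, (xi, yi) in enumerate(points):
        # Brute-force search for the nearest other point; fine for a
        # sketch, a spatial index would be preferable for large inventories.
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )

    observed = total / n
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected


# Regular 10 x 10 grid with unit spacing inside a 100-unit area:
# every point has a neighbour at distance 1, so the ratio is 2.0 (dispersed).
grid = [(float(x), float(y)) for x in range(10) for y in range(10)]
dispersed_ratio = average_nearest_neighbour_index(grid, area=100.0)

# Synthetic points packed into a 1 x 1 patch of a much larger study area:
# neighbours are far closer than a random pattern would give, so the ratio is below 1.
rng = random.Random(0)
packed = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(100)]
clustered_ratio = average_nearest_neighbour_index(packed, area=10_000.0)

print(f"dispersed grid ratio: {dispersed_ratio:.3f}")
print(f"clustered patch ratio: {clustered_ratio:.3f}")
```

Under this definition, the reported ratio of 0.672 falls below 1, which is consistent with the clustered pattern described in the abstract.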

Year: 2022

Organizational units: Istituto di Ricerca per la Protezione Idrogeologica - IRPI

Keywords: prediction performance; landslide partitioning; average nearest neighbour index; random forest; machine learning; Muzaffarabad

Appears in types: 01.01 Journal article

Files in this item:

File	Size	Format
prod_469123-doc_189854.pdf (authorized users only; Description: Article; Type: Publisher's version (PDF); License: Creative Commons)	5.89 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/419173
