Trade-Off Between Interpretability and Accuracy: How Can XAI Build Trust in Track Geometry Predictive Maintenance?
Renò V.; Cardellicchio A.; Nitti M.
2026
Abstract
Machine learning (ML) offers promising capabilities for predicting rail infrastructure failures, enabling a shift from diagnostic to prognostic railway maintenance. However, the real-world adoption of high-performing ML models in safety-critical domains such as railway systems hinges on their trustworthiness, particularly their interpretability and transparency. Through a case study in track geometry management, this work explores the trade-off between accuracy and interpretability in predicting track alignment failures by comparing six ML classifiers: Logistic Regression, Random Forest, Gradient Boosting, XGBoost, Support Vector Machine (SVM), and a Neural Network (NN). The models were trained on railway defect datasets using features such as operating speed, train traffic, total gross tonnage, and defect length. Performance was evaluated with recall as the primary metric, given the high cost of false negatives in rail safety contexts. The SVM and NN models achieved the highest recall (0.704 and 0.734, respectively), but at the cost of lower interpretability. To address this, post-hoc Explainable AI (XAI) techniques, including SHAP and LIME, were applied. These methods enhance both local and global interpretability, support model transparency and stakeholder trust, and bridge the gap between predictive performance and decision-making needs. While XAI is increasingly applied in other sectors, its use in asset management, and in railway predictive maintenance in particular, remains limited. This work fills that gap by demonstrating how XAI can foster more informed and confident adoption of ML models in rail infrastructure management. These explainability techniques help domain experts and end users understand why a model produced a specific result and which factors most influenced that decision, while also supporting data scientists and developers in refining model performance. For instance, feature refinement guided by SHAP improved SVM recall from 0.704 to 0.716.
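
As a rough illustration of the workflow the abstract describes (training an SVM on track-geometry features, scoring it by recall, and ranking features by mean absolute SHAP value to guide feature refinement), the sketch below uses a synthetic stand-in dataset. The feature names are taken from the abstract; the data, model settings, and KernelSHAP setup are illustrative assumptions, not the authors' pipeline.

```python
# Minimal sketch, not the paper's code: SVM recall evaluation plus a
# post-hoc, model-agnostic SHAP ranking of features. The dataset is a
# synthetic stand-in; only the feature names come from the abstract.
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

feature_names = ["operating_speed", "train_traffic",
                 "total_gross_tonnage", "defect_length"]

# Synthetic, imbalanced stand-in for the railway defect dataset.
X, y = make_classification(n_samples=1000, n_features=4, n_informative=3,
                           n_redundant=0, weights=[0.7, 0.3], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

scaler = StandardScaler().fit(X_train)
svm = SVC(probability=True, random_state=0).fit(
    scaler.transform(X_train), y_train)

# Recall is the primary metric: a false negative (a missed alignment
# failure) is far costlier than a false alarm in rail safety.
recall = recall_score(y_test, svm.predict(scaler.transform(X_test)))
print(f"baseline recall: {recall:.3f}")

# KernelSHAP on the probability of the "failure" class; explaining a
# single output keeps the returned SHAP array two-dimensional.
background = shap.sample(scaler.transform(X_train), 50, random_state=0)
explainer = shap.KernelExplainer(
    lambda data: svm.predict_proba(data)[:, 1], background)
# Explain a capped subset of the test set to keep runtime reasonable.
shap_values = explainer.shap_values(scaler.transform(X_test[:100]))

# Global importance = mean absolute SHAP value per feature; a ranking
# of this kind could drive the SHAP-guided feature refinement the
# abstract reports (SVM recall improving from 0.704 to 0.716).
importance = np.abs(shap_values).mean(axis=0)
for name, score in sorted(zip(feature_names, importance),
                          key=lambda t: -t[1]):
    print(f"{name}: {score:.4f}")
```

LIME would play the complementary local role here, explaining individual predictions rather than the global ranking shown above.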
| File | License | Size | Format |
|---|---|---|---|
| 978-3-032-10762-6_7.pdf (authorized users only) | Not public - private/restricted access | 2 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


