CNR Institutional Research Information System

Post-hoc explainability is essential for understanding black-box machine learning models. Surrogate-based techniques are widely used for local and global model-agnostic explanations but have significant limitations. Local surrogates capture non-linearities but are computationally expensive and sensitive to parameters, while global surrogates are more efficient but struggle with complex local behaviors. In this paper, we present ILLUME, a flexible and interpretable framework grounded in representation learning, that can be integrated with various surrogate models to provide explanations for any black-box classifier. Specifically, our approach combines a globally trained surrogate with instance-specific linear transformations learned with a meta-encoder to generate both local and global explanations. Through extensive empirical evaluations, we demonstrate the effectiveness of ILLUME in producing feature attributions and decision rules that are not only accurate but also robust and computationally efficient, thus providing a unified explanation framework that effectively addresses the limitations of traditional surrogate methods.

Explanations go linear: post-hoc explainability for tabular data with interpretable meta-encoding

Piaggesi S.;Guidotti R.;Giannotti F.;Pedreschi D.

2025

Abstract

Post-hoc explainability is essential for understanding black-box machine learning models. Surrogate-based techniques are widely used for local and global model-agnostic explanations but have significant limitations. Local surrogates capture non-linearities but are computationally expensive and sensitive to parameters, while global surrogates are more efficient but struggle with complex local behaviors. In this paper, we present ILLUME, a flexible and interpretable framework grounded in representation learning, that can be integrated with various surrogate models to provide explanations for any black-box classifier. Specifically, our approach combines a globally trained surrogate with instance-specific linear transformations learned with a meta-encoder to generate both local and global explanations. Through extensive empirical evaluations, we demonstrate the effectiveness of ILLUME in producing feature attributions and decision rules that are not only accurate but also robust and computationally efficient, thus providing a unified explanation framework that effectively addresses the limitations of traditional surrogate methods.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2025
			
	Strutture organizzative
	
				Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
			
	Codice ISBN
	
				979-8-3315-9599-9
			
	Parole chiave
	
				Decision trees
Feature learning
Linear models
			
	Appare nelle tipologie:
	
				04.01 Contributo in Atti di convegno

File in questo prodotto:

File	Dimensione	Formato
Piaggesi et al_Explanations_Go_Linear_Post-Hoc_Explainability_VOR.pdf solo utenti autorizzati Descrizione: Explanations Go Linear: Post-Hoc Explainability for Tabular Data with Interpretable Meta-Encoding Tipologia: Versione Editoriale (PDF) Licenza: NON PUBBLICO - Accesso privato/ristretto Dimensione 667.02 kB Formato Adobe PDF Visualizza/Apri Richiedi una copia	667.02 kB	Adobe PDF	Visualizza/Apri Richiedi una copia
Piaggesi et al_Explanations_Go_Linear_Post-Hoc_Explainability_postprint.pdf accesso aperto Descrizione: Explanations Go Linear: Post-hoc Explainability for Tabular Data with Interpretable Meta-Encoding Tipologia: Documento in Post-print Licenza: Creative commons Dimensione 1.19 MB Formato Adobe PDF Visualizza/Apri	1.19 MB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/582246

Citazioni

ND

0

0

social impact