Region-aware minimal counterfactual rules for model-agnostic explainable classification

Guidotti R.;
2025

Abstract

The increasing demand for transparency in machine learning has spurred the development of techniques that provide faithful explanations for complex black-box models. In this work, we introduce RaMiCo (Region Aware Minimal Counterfactual Rules), a model-agnostic method that extracts global counterfactual rules by mining instances from diverse regions of the input space. RaMiCo focuses on single-feature substitutions to generate minimal and region-aware rules that encapsulate the overall decision-making process of the target model. These global rules can be further localised to specific input instances, enabling users to obtain tailored explanations for individual predictions. Comprehensive experiments on multiple benchmark datasets demonstrate that RaMiCo achieves competitive fidelity in replicating black-box behaviour and exhibits high coverage in capturing the intrinsic structure of white-box classifiers. RaMiCo supports the development of trustworthy and secure machine learning systems by providing transparent, human-understandable explanations in the form of concise global rules. This design enables users to verify and inspect the model’s decision logic, reducing the risk of hidden biases, unintended behaviours, or adversarial exploitation. These features make RaMiCo particularly suitable for applications where the reliability, safety, and verifiability of automated decisions are essential.
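The single-feature substitution idea described in the abstract can be sketched in a few lines. The function below is a hypothetical minimal illustration of that general idea, not the RaMiCo algorithm: for each feature of an instance, it tries candidate replacement values and keeps the single-feature changes that flip the model's prediction. The function name, the toy model, and the candidate values are all invented for this example.

```python
def single_feature_counterfactuals(x, predict, candidate_values):
    """Return single-feature substitutions that flip the model's prediction.

    x                -- the instance to explain (a sequence of feature values)
    predict          -- a black-box classifier mapping an instance to a label
    candidate_values -- per-feature lists of candidate replacement values
    """
    original = predict(x)
    rules = []
    for i, candidates in enumerate(candidate_values):
        for v in candidates:
            if v == x[i]:
                continue  # no substitution, skip
            x_cf = list(x)
            x_cf[i] = v  # change exactly one feature
            if predict(x_cf) != original:
                # record (feature index, old value, new value, new label)
                rules.append((i, x[i], v, predict(x_cf)))
    return rules

# Toy black-box model (hypothetical): predicts 1 iff the two features sum above 10.
predict = lambda x: int(x[0] + x[1] > 10)
x = [4, 3]                  # predicted 0
cands = [[4, 9], [3, 8]]    # candidate substitution values per feature
print(single_feature_counterfactuals(x, predict, cands))
# → [(0, 4, 9, 1), (1, 3, 8, 1)]
```

Each returned tuple is a minimal counterfactual in the loosest sense: one feature change that alters the outcome. The actual method additionally mines instances from diverse regions of the input space and aggregates such substitutions into global, region-aware rules.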
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Keywords

Counterfactual explanation
Explainable artificial intelligence
Model-agnostic explanations
Region-aware rule extraction
Rule-based explanations
Files for this product:

File: Guidotti et al_ML-2025.pdf (authorised users only)
Description: Region-aware Minimal Counterfactual Rules for Model-agnostic Explainable Classification
Type: Published version (PDF)
Licence: Non-public - private/restricted access
Size: 1.79 MB, Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/557834
Citations: Scopus: 0; Web of Science: 0