
Black box explanation by learning image exemplars in the latent feature space

Guidotti R;
2020

Abstract

We present an approach to explain the decisions of black box models for image classification. While using the black box to label images, our explanation method exploits the latent feature space learned through an adversarial autoencoder. The proposed method first generates exemplar images in the latent feature space and learns a decision tree classifier. Then, it selects and decodes exemplars respecting local decision rules. Finally, it visualizes them in a manner that shows to the user how the exemplars can be modified to either stay within their class, or to become counter-factuals by "morphing" into another class. Since we focus on black box decision systems for image classification, the explanation obtained from the exemplars also provides a saliency map highlighting the areas of the image that contribute to its classification, and areas of the image that push it into another class. We present the results of an experimental evaluation on three datasets and two black box models. Besides providing the most useful and interpretable explanations, we show that the proposed method outperforms existing explainers in terms of fidelity, relevance, coherence, and stability.
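The pipeline described in the abstract — sample exemplars around the instance in the autoencoder's latent space, label them with the black box, fit an interpretable surrogate, and split the exemplars into same-class and counterfactual sets — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `encode`/`decode` stubs stand in for the trained adversarial autoencoder, `black_box` is a toy classifier, and a depth-1 decision stump replaces the decision tree.

```python
import random

# Hypothetical stand-ins: in the real method these would be the trained
# adversarial autoencoder and the opaque image classifier being explained.
def encode(image):      # image -> latent vector (identity stub here)
    return list(image)

def decode(z):          # latent vector -> image (identity stub here)
    return list(z)

def black_box(image):   # toy classifier: class depends on the first feature
    return 1 if image[0] > 0.5 else 0

# 1. Generate exemplars in the latent neighbourhood of the instance,
#    labelling each decoded exemplar with the black box.
random.seed(0)
z0 = encode([0.8, 0.3])
neighbours = [[v + random.gauss(0, 0.3) for v in z0] for _ in range(200)]
labels = [black_box(decode(z)) for z in neighbours]

# 2. Fit an interpretable local surrogate. A depth-1 decision stump
#    stands in for the decision tree learned in the paper.
def best_stump(points, labels):
    best = None
    for dim in range(len(points[0])):
        for thr in sorted({p[dim] for p in points}):
            acc = sum((p[dim] > thr) == bool(y)
                      for p, y in zip(points, labels)) / len(points)
            if best is None or acc > best[2]:
                best = (dim, thr, acc)
    return best

dim, thr, acc = best_stump(neighbours, labels)

# 3. Exemplars satisfy the local rule for the instance's class;
#    counterfactuals are decoded neighbours that cross into another class.
target = black_box(decode(z0))
exemplars = [decode(z) for z, y in zip(neighbours, labels) if y == target]
counterfactuals = [decode(z) for z, y in zip(neighbours, labels) if y != target]
```

Contrasting the decoded `exemplars` against the `counterfactuals` is what lets the method visualize how an image can be "morphed" into another class, and hence derive a saliency map.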
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
English
Brefeld U., Fromont E., Hotho A., Knobbe A., Maathuis M., Robardet C.
Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2019
189-205
978-3-030-46149-2
https://www.springerprofessional.de/en/black-box-explanation-by-learning-image-exemplars-in-the-latent-/17945630
Yes, but type not specified
16-20 September, 2019
Wurzburg, Germany
Explainable AI
Adversarial autoencoder
Image exemplars
4
partially_open
Guidotti, R; Monreale, A; Matwin, S; Pedreschi, D
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings
   SoBigData Research Infrastructure
   SoBigData
   H2020
   654024

   Big Data for Mobility Tracking Knowledge Extraction in Urban Areas
   Track and Know
   H2020
   780754

   PROmoting integrity in the use of RESearch results
   PRO-RES
   H2020
   788352

   A European AI On Demand Platform and Ecosystem
   AI4EU
   H2020
   825619
Files in this item:
prod_424498-doc_151385.pdf

not available

Description: Black box explanation by learning image exemplars in the latent feature space
Type: Publisher's version (PDF)
Size: 3.15 MB
Format: Adobe PDF
prod_424498-doc_158544.pdf

open access

Description: preprint
Type: Publisher's version (PDF)
Size: 2.78 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14243/405149
Citations
  • PMC: not available
  • Scopus: 49
  • Web of Science: 37