
Black box explanation by learning image exemplars in the latent feature space

Guidotti R;
2020

Abstract

We present an approach to explain the decisions of black box models for image classification. While using the black box to label images, our explanation method exploits the latent feature space learned through an adversarial autoencoder. The proposed method first generates exemplar images in the latent feature space and learns a decision tree classifier. Then, it selects and decodes exemplars respecting local decision rules. Finally, it visualizes them in a manner that shows to the user how the exemplars can be modified to either stay within their class, or to become counter-factuals by "morphing" into another class. Since we focus on black box decision systems for image classification, the explanation obtained from the exemplars also provides a saliency map highlighting the areas of the image that contribute to its classification, and areas of the image that push it into another class. We present the results of an experimental evaluation on three datasets and two black box models. Besides providing the most useful and interpretable explanations, we show that the proposed method outperforms existing explainers in terms of fidelity, relevance, coherence, and stability.
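The pipeline described in the abstract — sample exemplars around the instance in the autoencoder's latent space, label them with the black box, fit an interpretable surrogate, and split the exemplars into same-class and counterfactual sets — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `encode`/`decode` stubs stand in for the trained adversarial autoencoder, `black_box` is a toy classifier, and a depth-1 decision stump replaces the decision tree.

```python
import random

# Hypothetical stand-ins: in the real method these would be the trained
# adversarial autoencoder and the opaque image classifier being explained.
def encode(image):      # image -> latent vector (identity stub here)
    return list(image)

def decode(z):          # latent vector -> image (identity stub here)
    return list(z)

def black_box(image):   # toy classifier: class depends on the first feature
    return 1 if image[0] > 0.5 else 0

# 1. Generate exemplars in the latent neighbourhood of the instance,
#    labelling each decoded exemplar with the black box.
random.seed(0)
z0 = encode([0.8, 0.3])
neighbours = [[v + random.gauss(0, 0.3) for v in z0] for _ in range(200)]
labels = [black_box(decode(z)) for z in neighbours]

# 2. Fit an interpretable local surrogate. A depth-1 decision stump
#    stands in for the decision tree learned in the paper.
def best_stump(points, labels):
    best = None
    for dim in range(len(points[0])):
        for thr in sorted({p[dim] for p in points}):
            acc = sum((p[dim] > thr) == bool(y)
                      for p, y in zip(points, labels)) / len(points)
            if best is None or acc > best[2]:
                best = (dim, thr, acc)
    return best

dim, thr, acc = best_stump(neighbours, labels)

# 3. Exemplars satisfy the local rule for the instance's class;
#    counterfactuals are decoded neighbours that cross into another class.
target = black_box(decode(z0))
exemplars = [decode(z) for z, y in zip(neighbours, labels) if y == target]
counterfactuals = [decode(z) for z, y in zip(neighbours, labels) if y != target]
```

Contrasting the decoded `exemplars` against the `counterfactuals` is what lets the method visualize how an image can be "morphed" into another class, and hence derive a saliency map.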
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
English
Brefeld U., Fromont E., Hotho A., Knobbe A., Maathuis M., Robardet C.
Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2019
189-205
978-3-030-46149-2
https://www.springerprofessional.de/en/black-box-explanation-by-learning-image-exemplars-in-the-latent-/17945630
Yes, but type not specified
16-20 September, 2019
Wurzburg, Germany
Explainable AI
Adversarial autoencoder
Image exemplars
4
partially_open
Guidotti, R; Monreale, A; Matwin, S; Pedreschi, D
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings
   SoBigData Research Infrastructure
   SoBigData
   H2020
   654024

   Big Data for Mobility Tracking Knowledge Extraction in Urban Areas
   Track and Know
   H2020
   780754

   PROmoting integrity in the use of RESearch results
   PRO-RES
   H2020
   788352

   A European AI On Demand Platform and Ecosystem
   AI4EU
   H2020
   825619
Files in this item:
prod_424498-doc_151385.pdf

not available

Description: Black box explanation by learning image exemplars in the latent feature space
Type: Publisher's version (PDF)
Size: 3.15 MB
Format: Adobe PDF
prod_424498-doc_158544.pdf

open access

Description: preprint
Type: Publisher's version (PDF)
Size: 2.78 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14243/405149
Citations
  • PMC: not available
  • Scopus: 49
  • Web of Science: 37