CNR Institutional Research Information System

Detecting adversarial example attacks to deep neural networks

Carrara F.; Falchi F.; Caldelli R.; Amato G.; Fumarola R.; Becarelli R.

2017

Abstract

Deep learning has recently become the state of the art in many computer vision applications, and in image classification in particular. However, recent work has shown that it is quite easy to create adversarial examples, i.e., images intentionally created or modified to cause a deep neural network to make a mistake. They are like optical illusions for machines: they contain changes that are unnoticeable to the human eye. This represents a serious threat to machine learning methods. In this paper, we investigate the robustness of the representations learned by the fooled neural network by analyzing the activations of its hidden layers. Specifically, we tested scoring approaches used for kNN classification in order to distinguish between correctly classified authentic images and adversarial examples. The results show that hidden-layer activations can be used to detect incorrect classifications caused by adversarial attacks.
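The kNN scoring idea mentioned in the abstract can be illustrated with a minimal sketch. This record does not give the paper's actual scoring functions, the layer used, the distance measure, or any threshold, so everything below is an assumption made for illustration: the names `knn_score` and `looks_adversarial` are hypothetical, Euclidean distance and a 0.5 threshold are arbitrary choices, and random toy vectors stand in for real hidden-layer activations.

```python
import numpy as np

def knn_score(train_feats, train_labels, query_feat, predicted_label, k=5):
    """Fraction of the k nearest training features that share the predicted label.

    Hypothetical scoring function: a low score means the network's prediction
    disagrees with the labels of the query's neighbours in feature space.
    """
    # Euclidean distance from the query to every training feature (assumed metric)
    dists = np.linalg.norm(train_feats - query_feat, axis=1)
    nearest = np.argsort(dists)[:k]
    return float(np.mean(train_labels[nearest] == predicted_label))

def looks_adversarial(score, threshold=0.5):
    """Flag a prediction as suspicious when its kNN score is below the threshold."""
    return score < threshold

# Toy stand-in for hidden-layer activations: two well-separated clusters
rng = np.random.default_rng(0)
class0 = rng.normal(loc=0.0, scale=0.1, size=(20, 8))
class1 = rng.normal(loc=5.0, scale=0.1, size=(20, 8))
train_feats = np.vstack([class0, class1])
train_labels = np.array([0] * 20 + [1] * 20)

query = np.zeros(8)  # activation vector lying inside the class-0 cluster

# Prediction agrees with the neighbourhood: high score, not flagged
consistent = knn_score(train_feats, train_labels, query, predicted_label=0)
# Prediction contradicts the neighbourhood: low score, flagged
suspicious = knn_score(train_feats, train_labels, query, predicted_label=1)

print(consistent, looks_adversarial(consistent))
print(suspicious, looks_adversarial(suspicious))
```

In a real setting the feature vectors would be extracted from a hidden layer of the classifier, and the threshold would be tuned on held-out authentic and adversarial images.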

Year: 2017
Organizational units: Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Language(s): English
Conference title: CBMI '17 - 15th International Workshop on Content-Based Multimedia Indexing
Number of pages: 7
ISBN: 978-1-4503-5333-5
DOI: https://dx.doi.org/10.1145/3095713.3095753
URL: https://dl.acm.org/citation.cfm?id=3095713.3095753
Publisher: ACM Press
Publisher city: New York
Publisher country: United States of America
Refereed: Yes, but type not specified
Conference dates: 19-21 June 2017
Conference location: Firenze, Italy
Keywords: Adversarial Images Detection; Deep Convolutional Neural Network; Machine Learning Security
Scopus ID: 2-s2.0-85030762303
Web of Science ID: WOS:000426964400038
Number of authors: 3
Full text: partially_open
All authors: Carrara F.; Falchi F.; Caldelli R.; Amato G.; Fumarola R.; Becarelli R.
MIUR login type: 273
Type: info:eu-repo/semantics/conferenceObject
Type: 04 Conference contribution::04.01 Contribution in conference proceedings
Appears in types: 04.01 Contribution in conference proceedings

Files in this record:

File	Access	Description	Type	Size	Format
prod_384736-doc_131706.pdf	Authorized users only	Detecting adversarial example attacks to deep neural networks	Publisher's version (PDF)	2.38 MB	Adobe PDF
prod_384736-doc_159999.pdf	Open access	Detecting adversarial example attacks to deep neural networks	Publisher's version (PDF)	2.39 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/346784
