CNR Institutional Research Information System

Leveraging inter-rater agreement for classification in the presence of noisy labels

Bucarelli MS; Cassano L; Siciliano F; Mantrach A; Silvestri F

2023

Abstract

In practical settings, classification datasets are obtained through a labelling process that is usually done by humans. Labels can be noisy because they are obtained by aggregating the individual labels assigned to the same sample by multiple, possibly disagreeing, annotators. The inter-rater agreement on these datasets can be measured, while the underlying noise distribution to which the labels are subject is assumed to be unknown. In this work, we: (i) show how to leverage the inter-annotator statistics to estimate the noise distribution to which labels are subject; (ii) introduce methods that use the estimate of the noise distribution to learn from the noisy dataset; and (iii) establish generalization bounds in the empirical risk minimization framework that depend on the estimated quantities. We conclude the paper by providing experiments that illustrate our findings.

Year: 2023
Organizational units: Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Language(s): English
Volume title: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition - CVPR 2023
Series: IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION
Conference title: CVPR - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition
First page: 3439
Last page: 3448
ISBN: 979-8-3503-0129-8
DOI: https://dx.doi.org/10.1109/CVPR52729.2023.00335
URL: https://ieeexplore.ieee.org/document/10203489
Conference dates: 17-24 June 2023
Conference location: Vancouver, Canada
Keywords: Machine learning
Scopus ID: 2-s2.0-85167949498
Web of Science ID: WOS:001058542603070
Number of authors: 0
Fulltext: partially_open
All authors: Bucarelli M.S.; Cassano L.; Siciliano F.; Mantrach A.; Silvestri F.
MIUR login type: 273
Type: info:eu-repo/semantics/conferenceObject
Type: 04 Conference contribution::04.01 Contribution in conference proceedings
Project title: SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics
Project acronym: SoBigData-PlusPlus
Funding programme: H2020
Contract no.: 871042
Appears in types: 04.01 Contribution in conference proceedings

Files in this record:

File	Access	Description	Type	Size	Format
prod_488368-doc_203151.pdf	Authorized users only	Leveraging inter-rater agreement for classification in the presence of noisy labels	Publisher's version (PDF)	762.64 kB	Adobe PDF
prod_488368-doc_203152.pdf	Open access	Postprint - Leveraging inter-rater agreement for classification in the presence of noisy labels	Publisher's version (PDF)	824.42 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/429939
