
Bias discovery within human raters: a case study of the Jigsaw dataset

Guidotti R.; Ruggieri S.
2022

Abstract

Understanding and quantifying the bias introduced by human annotation of data is a crucial problem for trustworthy supervised learning. Recently, a perspectivist trend has emerged in the NLP community, highlighting the inadequacy of previous aggregation schemes, which assume the existence of a single ground truth. This assumption is particularly problematic for sensitive tasks involving subjective human judgments, such as toxicity detection. To address these issues, we propose a preliminary approach for bias discovery within human raters by exploring individual ratings for specific sensitive topics annotated in the texts. Our analysis focuses on the Jigsaw dataset, a collection of comments from a challenge on identifying online toxicity.
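As a rough illustration of the kind of exploration the abstract describes, the sketch below shows one way to compare individual raters' toxicity judgments against the pooled rate on comments annotated with a given sensitive topic. It is not the paper's method; the data layout and column names (comment_id, rater_id, toxic, topic_*) are assumptions for a Jigsaw-style dataset with per-rater labels.

```python
import pandas as pd

# Hypothetical Jigsaw-style individual annotations: one row per (comment, rater) pair,
# with a binary toxicity judgment and binary flags for sensitive topics in the text.
annotations = pd.DataFrame({
    "comment_id":     [1, 1, 2, 2, 3, 3, 4, 4],
    "rater_id":       ["a", "b", "a", "b", "a", "b", "a", "b"],
    "toxic":          [1, 0, 0, 0, 1, 1, 1, 0],
    "topic_gender":   [1, 1, 0, 0, 1, 1, 0, 0],
    "topic_religion": [0, 0, 1, 1, 0, 0, 1, 1],
})

topic_cols = ["topic_gender", "topic_religion"]

for topic in topic_cols:
    subset = annotations[annotations[topic] == 1]
    pooled_rate = subset["toxic"].mean()                     # aggregate toxicity rate on this topic
    per_rater = subset.groupby("rater_id")["toxic"].mean()   # each rater's toxicity rate on this topic
    deviation = per_rater - pooled_rate                      # signed deviation hints at rater-specific bias
    print(f"{topic}: pooled rate {pooled_rate:.2f}")
    print(deviation.round(2).to_string(), end="\n\n")
```

Large positive or negative deviations for a rater on one topic but not others would flag that rater-topic pair for closer inspection, which is the spirit of the perspectivist analysis the paper advocates.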
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
9791095546986
Bias
Fairness
Human Raters
Individual Annotations
NLP Perspectivism
Toxicity Detection
Files in this record:

File: Guidotti_LREC 2022.pdf (Adobe PDF, 319.52 kB)

Open access

Description: Bias Discovery within Human Raters: A Case Study of the Jigsaw Dataset
Type: Published version (PDF)
License: Creative Commons
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/457338
Citations
  • PMC: not available
  • Scopus: 5
  • Web of Science: not available