Machine learning and automatic text classification: what's next?

Sebastiani F.
2013

Abstract

Research in text classification (a.k.a. verbatim coding) mostly focuses on the design of software systems for classifying large amounts of uncoded data. Some of these systems involve a training phase, whereby a text classifier "learns" to code verbatims from manually coded examples. Scant attention has been given to designing software that supports what often comes next: further human editing, and even correction, of the data in order to reduce classification errors. In this presentation I will describe recent research aimed at minimizing the amount of human inspection effort needed to bring the classification error down to a desired level. For many applications, false positives and false negatives weigh differently on what one perceives "error" to be, and this calls for an approach to the task based on utility theory.
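To make the utility-based idea concrete, here is a minimal sketch of one way such an inspection order could be computed. It is an illustration under stated assumptions, not the method presented in the talk: it assumes the classifier outputs a probability for the positive class, that a human inspector always corrects a wrong label, and that the costs `cost_fp` and `cost_fn` are supplied by the application. All names (`Prediction`, `expected_gain`, `rank_for_inspection`, `select_for_inspection`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    doc_id: str
    predicted_positive: bool  # label assigned by the classifier
    prob_positive: float      # assumed: classifier's estimate of P(true label is positive)


def expected_gain(pred, cost_fp, cost_fn):
    """Expected error cost removed if a human checks this prediction."""
    if pred.predicted_positive:
        # The prediction is wrong only if the item is truly negative (a false positive).
        return (1.0 - pred.prob_positive) * cost_fp
    # The prediction is wrong only if the item is truly positive (a false negative).
    return pred.prob_positive * cost_fn


def rank_for_inspection(preds, cost_fp, cost_fn):
    """Order predictions so the most valuable ones to inspect come first."""
    return sorted(preds, key=lambda p: expected_gain(p, cost_fp, cost_fn), reverse=True)


def select_for_inspection(preds, target_cost, cost_fp, cost_fn):
    """Pick items to inspect until the residual expected cost is at most target_cost."""
    residual = sum(expected_gain(p, cost_fp, cost_fn) for p in preds)
    chosen = []
    for p in rank_for_inspection(preds, cost_fp, cost_fn):
        if residual <= target_cost:
            break
        chosen.append(p.doc_id)
        residual -= expected_gain(p, cost_fp, cost_fn)
    return chosen


# Illustrative data only: false negatives are assumed five times as costly as false positives.
preds = [
    Prediction("a", True, 0.9),
    Prediction("b", False, 0.4),
    Prediction("c", True, 0.6),
]
ranking = [p.doc_id for p in rank_for_inspection(preds, cost_fp=1.0, cost_fn=5.0)]
to_inspect = select_for_inspection(preds, target_cost=0.6, cost_fp=1.0, cost_fn=5.0)
```

With asymmetric costs, the uncertain negative prediction `b` is ranked ahead of the equally uncertain positive prediction `c`, because a missed positive is assumed to hurt more. With equal costs the two would tie, and the ranking would reduce to ordering by the classifier's uncertainty alone.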
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Survey coding
Text classification
Utility theory


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/249654