CNR Institutional Research Information System

Machines that learn how to code open-ended survey data: underlying principles, experimental data, and methodological issues

Sebastiani F

2008

Abstract

In the last six years I have led projects aimed at developing software that learns how to code open-ended survey data from data manually coded by humans. These projects have produced software that is now in operation at the Customer Satisfaction division of a large international banking group and has been integrated into a major software platform for the management of open-ended survey data. This software, which can code data at a rate of tens of thousands of responses per hour, draws on contributions from several fields of computer science, including Information Retrieval, Machine Learning, Computational Linguistics, and Opinion Mining. In this talk I will discuss the basic philosophy underlying this software and present the results of experiments on several datasets of respondent data, in which we compare the accuracy of the software against that of human coders. I will also argue for a notion of "accuracy" defined in terms of inter-coder agreement rates. Finally, I will discuss the characteristics that make a survey more or less amenable to automated coding by means of our system.

	Year
				2008
	Organizational units
				Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
	Keywords
				Survey coding
				Customer satisfaction
				Text classification
				Sentiment analysis
				Machine learning
				Market research
	Appears in types:
				04.04 Unpublished presentation/communication (conference, event, webinar...)

Files in this item:

File	Description	Type	Access	Size	Format
prod_120647-doc_129628.pdf	Machines that learn how to code open-ended survey data: underlying principles, experimental data, and methodological issues	Publisher's version (PDF)	Open access	2.78 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/85971
