CNR Institutional Research Information System

Cracking the code: what customers say, in their own words

Macer T; Pearson M; Sebastiani F

2007

Abstract

The paper describes how Egg, a large UK internet bank, in partnership with meaning ltd, a research technology consultancy, commissioned the Institute for the Science and Technology of Information of the Italian National Council of Research (ISTI-CNR) to build VCS, a novel software solution for the automatic analysis of the many thousands of verbatim comments Egg collects through event and customer experience surveys conducted online. The sheer volume of responses received had made any systematic analysis of them impossible, a problem experienced not only by Egg but by many who conduct research online. These difficulties mean that researchers often restrict research designs to reduce or eliminate open questions from online surveys, despite the richness of the insights that such questions can provide. Unlike other automated or computer-assisted software, the system developed for Egg is novel in that it applies machine learning to the problem of analysing and classifying verbatim texts from open-ended questions. Because there were few precedents for applying this method to verbatim textual data, which is characterised not only by great diversity of content but also by frequent linguistic mistakes, a cautious experimental approach was taken in order to validate the software's accuracy and reliability in performing the task. The results of these experiments, presented in this paper, show that the software performs well against human coders who have subject matter expertise: across a range of statistical measures, the overall accuracy of the software and of the human coders was broadly similar. Furthermore, it was shown that accuracy could be improved by increasing the number of training examples, demonstrating that the system is inherently trainable and that its performance can be monitored and improved with respect to each survey or question being analysed.
Practical experience since the software was adopted at Egg, described here, demonstrates dramatic savings in cost and time, making comprehensive and highly systematic analysis of many thousands of verbatim responses cost-effective. The paper concludes by considering the implications of this technology for all those engaged in research or in evaluating customer feedback, and how its application could profoundly change the way research is undertaken by effectively removing restrictions on placing open-ended questions in online surveys.
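
To illustrate the general idea of learning to code verbatims from hand-coded examples, here is a minimal sketch. This record does not say which learning algorithm VCS uses, so the multinomial naive Bayes classifier below is only a stand-in, and the class name `NaiveBayesCoder`, the codes `website` and `service`, and the training sentences are all hypothetical rather than taken from the paper.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a verbatim and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesCoder:
    """Multinomial naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, codes):
        # Count how often each code occurs and which words occur under it.
        self.code_counts = Counter(codes)
        self.word_counts = defaultdict(Counter)
        for text, code in zip(texts, codes):
            self.word_counts[code].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.code_counts.values())
        best_code, best_score = None, -math.inf
        for code, n in self.code_counts.items():
            counts = self.word_counts[code]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior of the code
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are ignored
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_code, best_score = code, score
        return best_code


# Hypothetical hand-coded training verbatims (not data from the paper).
training = [
    ("the website was slow and kept crashing", "website"),
    ("could not log in to the site", "website"),
    ("the web page froze during payment", "website"),
    ("staff on the phone were very helpful", "service"),
    ("the advisor was rude when i called", "service"),
    ("helpful and polite call centre staff", "service"),
]
coder = NaiveBayesCoder().fit(*zip(*training))

print(coder.predict("the site keeps crashing"))
print(coder.predict("the staff were polite and helpful"))
```

In a sketch like this, adding more hand-coded examples to `training` and refitting is the mechanism by which accuracy on a given question could be monitored and improved, which is the kind of trainability the abstract refers to.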

Year: 2007

Organizational units: Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI

Keywords: Survey coding; Text classification; Opinion mining; Verbatim coding

Appears in types: 04.01 Conference proceedings contribution

Files in this product:

File	Description	Type	Access	Size	Format
prod_91754-doc_131267.pdf	Cracking the code: what customers say, in their own words	Publisher's version (PDF)	Authorized users only	1.91 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/57647
