An unsupervised data-driven cross-lingual method for building high precision sentiment lexicons
Augello Agnese;Pilato Giovanni
2013
Abstract
In this paper we present a completely unsupervised approach for creating a sentiment lexicon. The approach is realized as a pipeline that covers the automatic extraction of user reviews, the pre-processing of text, a scoring measure that combines entropy, term frequency, and inverse document frequency, and finally a cross-lingual intersection. We have validated the approach through the analysis of app reviews in the Google Play market. The results show the effectiveness of the approach, given by satisfactory precision values for the obtained lexicon.
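The abstract names the ingredients of the scoring measure but not its formula, so the following is only a minimal sketch of how such a pipeline stage might look, not the authors' method. It assumes that each review carries a coarse polarity label derived from its star rating, that the three quantities are combined as a plain product `(1 - entropy) * tf * idf`, and that a bilingual dictionary is available for the intersection step. The names `score_terms`, `build_lexicon`, `cross_lingual_intersection`, and the `threshold` parameter are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def score_terms(reviews):
    """Score each term from (tokens, label) pairs, label in {"pos", "neg"}.

    Assumption: score = (1 - entropy) * tf * idf. The source does not
    state the actual combination used.
    """
    n_docs = len(reviews)
    tf = Counter()                       # total occurrences of each term
    df = Counter()                       # number of reviews containing the term
    class_counts = defaultdict(Counter)  # term -> label -> document count
    for tokens, label in reviews:
        tf.update(tokens)
        for term in set(tokens):
            df[term] += 1
            class_counts[term][label] += 1

    scores = {}
    for term, freq in tf.items():
        counts = class_counts[term]
        total = sum(counts.values())
        probs = [c / total for c in counts.values()]
        # Entropy in bits over the two polarity classes: 0 when the term
        # appears in one class only, 1 when it is split evenly.
        entropy = -sum(p * math.log2(p) for p in probs)
        idf = math.log(n_docs / df[term])
        polarity = max(counts, key=counts.get)
        scores[term] = (polarity, (1.0 - entropy) * freq * idf)
    return scores


def build_lexicon(scores, threshold):
    """Keep the polarity of every term whose score exceeds the threshold."""
    return {t: pol for t, (pol, s) in scores.items() if s > threshold}


def cross_lingual_intersection(lex_a, lex_b, a_to_b):
    """Keep terms of lex_a whose translation has the same polarity in lex_b.

    a_to_b is a hypothetical bilingual dictionary (term -> translation).
    """
    kept = {}
    for term, polarity in lex_a.items():
        translated = a_to_b.get(term)
        if translated is not None and lex_b.get(translated) == polarity:
            kept[term] = polarity
    return kept
```

In this sketch, a term that occurs equally often in positive and negative reviews receives a score of zero and drops out of the lexicon, while the intersection step discards any term whose translation is missing from, or disagrees with, the lexicon built for the other language. Trading coverage for agreement in this way is one plausible route to the high precision the abstract reports.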


