Lexicon-Based vs. Bert-Based Sentiment Analysis: A Comparative Study in Italian

Catelli R; Pelosi S; Esposito M
2022

Abstract

Recent developments in the e-commerce market have led consumers to attach increasing importance to third-party product reviews before making a purchase. To improve its offering and detect consumer dissatisfaction, the industry has paid increasing attention to systems that can identify the sentiment, positive or negative, expressed by buyers. On the technological side, recent literature has developed two types of methodologies: those based on lexicons and those based on machine learning and deep learning techniques. This study compares these approaches in the Italian market, one of the largest in the world, using an ad hoc dataset. The evidence generally shows the superiority of language models built on deep neural networks, such as BERT, but it also raises several considerations about the effectiveness and improvement of these solutions relative to lexicon-based ones when datasets are small, like the one under study, a common condition for languages other than English or Chinese.
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
Keywords: sentiment analysis; review; Italian; lexicon; NooJ; deep learning; BERT
Files in this item:
prod_471311-doc_191366.pdf (authorized users only)

Description: Lexicon-Based vs. Bert-Based Sentiment Analysis: A Comparative Study in Italian
Type: Publisher's version (PDF)
License: No license declared (not assignable to items after 2023)
Size: 344.82 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/417086
Citations
  • Scopus 97