A Machine Learning approach for Sentiment Analysis for Italian Reviews in Healthcare
Cimino A;Dell'Orletta F
2020
Abstract
In this paper, we present our approach to binary sentiment classification of Italian reviews in the healthcare domain. We first collected a new dataset for this domain. We then compared the results obtained by two different systems, one based on a Support Vector Machine (SVM) and one based on BERT. For the first, we linguistically pre-processed the dataset to extract hand-crafted features that are exploited by the classifier. For the second, we oversampled the dataset to achieve better results. Our results show that the SVM-based system, which does not require oversampling, outperforms the BERT-based one, achieving an F1-score of 91.21%.

| DC field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | - |
| dc.authority.people | Bacco L | it |
| dc.authority.people | Cimino A | it |
| dc.authority.people | Paulon L | it |
| dc.authority.people | Merone M | it |
| dc.authority.people | Dell'Orletta F | it |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contribution in conference proceedings | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/02/21 05:12:42 | - |
| dc.date.available | 2024/02/21 05:12:42 | - |
| dc.date.issued | 2020 | - |
| dc.description.abstracteng | In this paper, we present our approach to binary sentiment classification of Italian reviews in the healthcare domain. We first collected a new dataset for this domain. We then compared the results obtained by two different systems, one based on a Support Vector Machine (SVM) and one based on BERT. For the first, we linguistically pre-processed the dataset to extract hand-crafted features that are exploited by the classifier. For the second, we oversampled the dataset to achieve better results. Our results show that the SVM-based system, which does not require oversampling, outperforms the BERT-based one, achieving an F1-score of 91.21%. | - |
| dc.description.affiliations | Università Campus Bio-Medico (UCBM); Istituto di Linguistica Computazionale "Antonio Zampolli" (ILC-CNR); Webmonks s.r.l.; Università Campus Bio-Medico (UCBM); Istituto di Linguistica Computazionale "Antonio Zampolli" (ILC-CNR); | - |
| dc.description.allpeople | Bacco, L; Cimino, A; Paulon, L; Merone, M; Dell'Orletta, F | - |
| dc.description.allpeopleoriginal | Bacco L., Cimino A., Paulon L., Merone M., Dell'Orletta F. | - |
| dc.description.fulltext | none | en |
| dc.description.numberofauthors | 5 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/401373 | - |
| dc.language.iso | eng | - |
| dc.relation.conferencedate | 01-03/03/2021 | - |
| dc.relation.conferencename | Seventh Italian Conference on Computational Linguistics (CLiC-it 2020) | - |
| dc.relation.conferenceplace | online | - |
| dc.subject.keywords | natural language processing | - |
| dc.subject.keywords | sentiment analysis | - |
| dc.subject.singlekeyword | natural language processing | * |
| dc.subject.singlekeyword | sentiment analysis | * |
| dc.title | A Machine Learning approach for Sentiment Analysis for Italian Reviews in Healthcare | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Conference contribution::04.01 Contribution in conference proceedings | it |
| dc.type.miur | 273 | - |
| dc.type.referee | Yes, but type not specified | - |
| dc.ugov.descaux1 | 450786 | - |
| iris.orcid.lastModifiedDate | 2024/04/04 13:55:17 | * |
| iris.orcid.lastModifiedMillisecond | 1712231717535 | * |
| iris.scopus.extIssued | 2020 | - |
| iris.scopus.extTitle | A machine learning approach for sentiment analysis for Italian reviews in healthcare | - |
| iris.sitodocente.maxattempts | 2 | - |
| Appears in types: | 04.01 Contribution in conference proceedings | |


