Lost in Labels: An Ongoing Quest to Optimize Text-to-Text Label Selection for Classification
Alessio Miaschi; Michele Papucci; Felice Dell'Orletta
2023
Abstract
In this paper, we evaluate how label selection influences the performance of a Sequence-to-Sequence Transformer model on a classification task. Our study investigates whether the choice of words used to represent classification categories affects the model’s performance, and whether there is a relationship between that performance and the selected words. To this end, we fine-tuned an Italian T5 model on topic classification using various labels. Our results indicate that different label choices can significantly impact the model’s performance. However, we did not find a clear answer on how these choices affect performance, which highlights the need for further research on optimizing label selection.

| DC field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | en |
| dc.authority.people | Miaschi Alessio | en |
| dc.authority.people | Michele Papucci | en |
| dc.authority.people | Felice Dell'Orletta | en |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contributo in Atti di convegno | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/12/20 12:14:16 | - |
| dc.date.available | 2024/12/20 12:14:16 | - |
| dc.date.firstsubmission | 2024/12/20 10:01:31 | * |
| dc.date.issued | 2023 | - |
| dc.date.submission | 2024/12/20 10:01:31 | * |
| dc.description.allpeople | Miaschi, Alessio; Papucci, Michele; Dell'Orletta, Felice | - |
| dc.description.allpeopleoriginal | Miaschi Alessio, Michele Papucci, Felice Dell'Orletta | en |
| dc.description.fulltext | open | en |
| dc.description.numberofauthors | 3 | - |
| dc.identifier.source | bibtex | * |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/520527 | - |
| dc.language.iso | eng | en |
| dc.relation.ispartofbook | Proceedings of the 9th Italian Conference on Computational Linguistics CLiC-it 2023: Venice, Italy, November 30-December 2, 2023 | en |
| dc.relation.issue | 394 | en |
| dc.relation.volume | 516 | en |
| dc.subject.keywordseng | encoder-decoder, label selection, topic classification | - |
| dc.subject.singlekeyword | encoder-decoder | * |
| dc.subject.singlekeyword | label selection | * |
| dc.subject.singlekeyword | topic classification | * |
| dc.title | Lost in Labels: An Ongoing Quest to Optimize Text-to-Text Label Selection for Classification | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Contributo in convegno::04.01 Contributo in Atti di convegno | it |
| dc.type.miur | 273 | - |
| Appears in types: | 04.01 Contributo in Atti di convegno (contribution in conference proceedings) | |

| File | Size | Format |
|---|---|---|
| paper39.pdf (open access; license: Creative Commons) | 1.62 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


