D3.8 Lexical-semantic analytics for NLP
Francesca Frontini; Valeria Quochi
2022
Abstract
The present document illustrates the work carried out in task 3.3 (work package 3), focused on lexical-semantic analytics for Natural Language Processing (NLP). This task aims at computing analytics for lexical-semantic information such as words, senses and domains in the available resources, investigating their role in NLP applications. Specifically, this task concentrates on three research directions, namely i) sense clustering, in which grouping senses based on their semantic similarity improves the performance of NLP tasks such as Word Sense Disambiguation (WSD), ii) domain labeling of text, in which the lexicographic resources made available by the ELEXIS project for research purposes allow better performance to be achieved, and finally iii) analysing the diachronic distribution of senses, for which a software package is made available. In this deliverable, we illustrate the research activities aimed at achieving the aforementioned goals and put forward suggestions for future work. Importantly, we stress the crucial role played by high-quality lexical-semantic resources when investigating such linguistic aspects and their impact on NLP applications. To this end, as an additional contribution, we address the paucity of manually annotated data in the lexical-semantic research field and introduce the ELEXIS parallel sense-annotated dataset, a novel, entirely manually annotated dataset available in 10 European languages and featuring 5 annotation layers.

| DC field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | - |
| dc.authority.people | Federico Martelli | it |
| dc.authority.people | Marco Maru | it |
| dc.authority.people | Cesare Campagnano | it |
| dc.authority.people | Roberto Navigli | it |
| dc.authority.people | Paola Velardi | it |
| dc.authority.people | Rafael-J Ureña-Ruiz | it |
| dc.authority.people | Francesca Frontini | it |
| dc.authority.people | Valeria Quochi | it |
| dc.authority.people | Jelena Kallas | it |
| dc.authority.people | Kristina Koppel | it |
| dc.authority.people | Margit Langemets | it |
| dc.authority.people | Jesse de Does | it |
| dc.authority.people | Rob Tempelaars | it |
| dc.authority.people | Carole Tiberius | it |
| dc.authority.people | Rute Costa | it |
| dc.authority.people | Ana Salgado | it |
| dc.authority.people | Simon Krek | it |
| dc.authority.people | Jaka Čibej | it |
| dc.authority.people | Kaja Dobrovoljc | it |
| dc.authority.people | Polona Gantar | it |
| dc.authority.people | Tina Munda | it |
| dc.authority.project | European Lexicographic Infrastructure | - |
| dc.collection.id.s | 0a8868f5-ed00-4649-854b-d9a6fd0a38b2 | * |
| dc.collection.name | 08.01 Rapporto di progetto | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/02/20 16:09:14 | - |
| dc.date.available | 2024/02/20 16:09:14 | - |
| dc.date.issued | 2022 | - |
| dc.description.abstracteng | The present document illustrates the work carried out in task 3.3 (work package 3), focused on lexical-semantic analytics for Natural Language Processing (NLP). This task aims at computing analytics for lexical-semantic information such as words, senses and domains in the available resources, investigating their role in NLP applications. Specifically, this task concentrates on three research directions, namely i) sense clustering, in which grouping senses based on their semantic similarity improves the performance of NLP tasks such as Word Sense Disambiguation (WSD), ii) domain labeling of text, in which the lexicographic resources made available by the ELEXIS project for research purposes allow better performance to be achieved, and finally iii) analysing the diachronic distribution of senses, for which a software package is made available. In this deliverable, we illustrate the research activities aimed at achieving the aforementioned goals and put forward suggestions for future work. Importantly, we stress the crucial role played by high-quality lexical-semantic resources when investigating such linguistic aspects and their impact on NLP applications. To this end, as an additional contribution, we address the paucity of manually annotated data in the lexical-semantic research field and introduce the ELEXIS parallel sense-annotated dataset, a novel, entirely manually annotated dataset available in 10 European languages and featuring 5 annotation layers. | - |
| dc.description.affiliations | Università La Sapienza, Roma; CNR-ILC; EKI; INT; NOVA CLUNL; JSI | - |
| dc.description.allpeople | Martelli, Federico; Maru, Marco; Campagnano, Cesare; Navigli, Roberto; Velardi, Paola; Ureña-Ruiz, Rafael-J; Frontini, Francesca; Quochi, Valeria; Kallas, Jelena; Koppel, Kristina; Langemets, Margit; de Does, Jesse; Tempelaars, Rob; Tiberius, Carole; Costa, Rute; Salgado, Ana; Krek, Simon; Čibej, Jaka; Dobrovoljc, Kaja; Gantar, Polona; Munda, Tina | - |
| dc.description.allpeopleoriginal | Federico Martelli, Marco Maru, Cesare Campagnano, Roberto Navigli, Paola Velardi, Rafael-J Ureña-Ruiz, Francesca Frontini, Valeria Quochi, Jelena Kallas, Kristina Koppel, Margit Langemets, Jesse de Does, Rob Tempelaars, Carole Tiberius, Rute Costa, Ana Salgado, Simon Krek, Jaka Čibej, Kaja Dobrovoljc, Polona Gantar, Tina Munda | - |
| dc.description.fulltext | none | en |
| dc.description.numberofauthors | 21 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/412365 | - |
| dc.identifier.url | https://elex.is/wp-content/uploads/ELEXIS_D3_8_Lexical-Semantic_Analytics_for_NLP_final_report.pdf | - |
| dc.language.iso | eng | - |
| dc.relation.numberofpages | 67 | - |
| dc.relation.projectAcronym | ELEXIS | - |
| dc.relation.projectAwardNumber | 731015 | - |
| dc.relation.projectAwardTitle | European Lexicographic Infrastructure | - |
| dc.relation.projectFunderName | - | en |
| dc.relation.projectFundingStream | H2020 | - |
| dc.subject.keywords | research infrastructures | - |
| dc.subject.keywords | lexicography | - |
| dc.subject.keywords | lexical resources | - |
| dc.subject.keywords | word-sense disambiguation | - |
| dc.subject.keywords | WSD | - |
| dc.subject.keywords | sense-annotated language data | - |
| dc.subject.keywords | multilinguality | - |
| dc.subject.singlekeyword | research infrastructures | * |
| dc.subject.singlekeyword | lexicography | * |
| dc.subject.singlekeyword | lexical resources | * |
| dc.subject.singlekeyword | word-sense disambiguation | * |
| dc.subject.singlekeyword | WSD | * |
| dc.subject.singlekeyword | sense-annotated language data | * |
| dc.subject.singlekeyword | multilinguality | * |
| dc.title | D3.8 Lexical-semantic analytics for NLP | en |
| dc.type.driver | info:eu-repo/semantics/other | - |
| dc.type.full | 08 Report e Working Paper::08.01 Rapporto di progetto | it |
| dc.type.miur | -2.0 | - |
| dc.ugov.classaux1 | Rapporto finale di progetto | - |
| dc.ugov.descaux1 | 472421 | - |
| iris.orcid.lastModifiedDate | 2024/04/04 14:09:18 | * |
| iris.orcid.lastModifiedMillisecond | 1712232558539 | * |
| iris.sitodocente.maxattempts | 1 | - |
| Appears in types: | 08.01 Rapporto di progetto | |

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


