A multilingual evaluation dataset for monolingual word sense alignment
Monachini Monica; Bellandi Andrea
2020
Abstract
Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually aligning monolingual dictionaries. The alignment is carried out at sense-level for various resources in 15 languages. Moreover, senses are annotated with possible semantic relationships such as broadness, narrowness, relatedness, and equivalence. In comparison to previous datasets for this task, this dataset covers a wide range of languages and resources and focuses on the more challenging task of linking general-purpose language. We believe that our data will pave the way for further advances in alignment and evaluation of word senses by creating new solutions, particularly those notoriously requiring data such as neural networks. Our resources are publicly available at https://github.com/elexis-eu/MWSA.

| DC Field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | en |
| dc.authority.people | Ahmadi Sina | en |
| dc.authority.people | McCrae John P | en |
| dc.authority.people | Nimb Sanni | en |
| dc.authority.people | Khan Fahad | en |
| dc.authority.people | Monachini Monica | en |
| dc.authority.people | Pedersen Bolette S | en |
| dc.authority.people | Declerck Thierry | en |
| dc.authority.people | Wissik Tanja | en |
| dc.authority.people | Bellandi Andrea | en |
| dc.authority.people | Pisani Irene | en |
| dc.authority.people | Troelsgård Thomas | en |
| dc.authority.people | Olsen Sussi | en |
| dc.authority.people | Krek Simon | en |
| dc.authority.people | Lipp Veronika | en |
| dc.authority.people | Váradi Tamás | en |
| dc.authority.people | Simon László | en |
| dc.authority.people | Gyorffy Andras | en |
| dc.authority.people | Tiberius Carole | en |
| dc.authority.people | Schoonheim Tanneke | en |
| dc.authority.people | Moshe Yifat Ben | en |
| dc.authority.people | Rudich Maya | en |
| dc.authority.people | Ahmad Raya Abu | en |
| dc.authority.people | Lonke Dorielle | en |
| dc.authority.people | Kovalenko Kira | en |
| dc.authority.people | Langemets Margit | en |
| dc.authority.people | Kallas Jelena | en |
| dc.authority.people | Oksana Dereza | en |
| dc.authority.people | Fransen Theodorus | en |
| dc.authority.people | Cillessen David | en |
| dc.authority.people | Lindemann David | en |
| dc.authority.people | Alonso Mikel | en |
| dc.authority.people | Salgado Ana | en |
| dc.authority.people | Sancho Jose Luis | en |
| dc.authority.people | Urena-Ruiz Rafael J | en |
| dc.authority.people | Zamorano Jordi Porta | en |
| dc.authority.people | Simov Kiril | en |
| dc.authority.people | Osenova Petya | en |
| dc.authority.people | Kancheva Zara | en |
| dc.authority.people | Radev Ivaylo | en |
| dc.authority.people | Stankovic Ranka | en |
| dc.authority.people | Perdih Andrej | en |
| dc.authority.people | Gabrovsek Dejan | en |
| dc.authority.project | European Lexicographic Infrastructure | en |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contribution in conference proceedings | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.contributor.area | Non assegn | * |
| dc.contributor.area | Non assegn | * |
| dc.date.accessioned | 2024/02/19 10:13:21 | - |
| dc.date.available | 2024/02/19 10:13:21 | - |
| dc.date.firstsubmission | 2025/02/25 16:55:53 | * |
| dc.date.issued | 2020 | - |
| dc.date.submission | 2025/02/25 16:55:53 | * |
| dc.description.abstracteng | Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually aligning monolingual dictionaries. The alignment is carried out at sense-level for various resources in 15 languages. Moreover, senses are annotated with possible semantic relationships such as broadness, narrowness, relatedness, and equivalence. In comparison to previous datasets for this task, this dataset covers a wide range of languages and resources and focuses on the more challenging task of linking general-purpose language. We believe that our data will pave the way for further advances in alignment and evaluation of word senses by creating new solutions, particularly those notoriously requiring data such as neural networks. Our resources are publicly available at https://github.com/elexis-eu/MWSA. | - |
| dc.description.affiliations | Insight Centre for Data Analytics, National University of Ireland, Galway; Society for Danish Language and Literature (DSL), Copenhagen, Denmark; Austrian Centre for Digital Humanities and Cultural Heritage, Austrian Academy of Sciences, Vienna, Austria; Istituto di Linguistica Computazionale "A. Zampolli" - CNR, Pisa, Italy; Universita di Pisa, Italy; Jozef Stefan Institute, Ljubljana, Slovenia; Research Institute for Linguistics, Budapest, Hungary; Insight Centre for Data Analytics, National University of Ireland, Galway; Centre for Language Technology, University of Copenhagen, Denmark; Dutch Language Institute, Leiden, the Netherlands; K Dictionaries, Tel Aviv, Israel; Institute for Linguistic Studies of the Russian Academy of Sciences, St. Petersburg, Russia; DFKI GmbH, Multilinguality and Language Technology, Germany; Institute of the Estonian Language, Estonia; Euskal Herriko Unibertsitatea, Universidad del País Vasco | - |
| dc.description.allpeople | Ahmadi, Sina; McCrae, John P.; Nimb, Sanni; Khan, Fahad; Monachini, Monica; Pedersen, Bolette S.; Declerck, Thierry; Wissik, Tanja; Bellandi, Andrea; Pisani, Irene; Troelsgård, Thomas; Olsen, Sussi; Krek, Simon; Lipp, Veronika; Váradi, Tamás; Simon, László; Gyorffy, Andras; Tiberius, Carole; Schoonheim, Tanneke; Moshe, Yifat Ben; Rudich, Maya; Ahmad, Raya Abu; Lonke, Dorielle; Kovalenko, Kira; Langemets, Margit; Kallas, Jelena; Oksana, Dereza; Fransen, Theodorus; Cillessen, David; Lindemann, David; Alonso, Mikel; Salgado, Ana; Sancho, Jose Luis; Urena-Ruiz, Rafael J.; Zamorano, Jordi Porta; Simov, Kiril; Osenova, Petya; Kancheva, Zara; Radev, Ivaylo; Stankovic, Ranka; Perdih, Andrej; Gabrovsek, Dejan | - |
| dc.description.allpeopleoriginal | Ahmadi, Sina; McCrae, John P.; Nimb, Sanni; Khan, Fahad; Monachini, Monica; Pedersen, Bolette S.; Declerck, Thierry; Wissik, Tanja; Bellandi, Andrea; Pisani, Irene; Troelsgård, Thomas; Olsen, Sussi; Krek, Simon; Lipp, Veronika; Váradi, Tamás; Simon, László; Gyorffy, Andras; Tiberius, Carole; Schoonheim, Tanneke; Moshe, Yifat Ben; Rudich, Maya; Ahmad, Raya Abu; Lonke, Dorielle; Kovalenko, Kira; Langemets, Margit; Kallas, Jelena; Oksana, Dereza; Fransen, Theodorus; Cillessen, David; Lindemann, David; Alonso, Mikel; Salgado, Ana; Sancho, Jose Luis; Urena-Ruiz, RafaelJ.; Zamorano, Jordi Porta; Simov, Kiril; Osenova, Petya; Kancheva, Zara; Radev, Ivaylo; Stankovic, Ranka; Perdih, Andrej; Gabrovsek, Dejan | en |
| dc.description.fulltext | open | en |
| dc.description.numberofauthors | 42 | - |
| dc.identifier.isbn | 979-10-95546-34-4 | en |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/404924 | - |
| dc.language.iso | eng | en |
| dc.relation.conferencedate | 11-16/05/2020 | en |
| dc.relation.conferencename | Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020) | en |
| dc.relation.ispartofbook | Proceedings of the 12th Language Resources and Evaluation Conference - LREC 2020 | en |
| dc.relation.projectAcronym | ELEXIS | en |
| dc.relation.projectAwardNumber | 731015 | en |
| dc.relation.projectAwardTitle | European Lexicographic Infrastructure | en |
| dc.relation.projectFunderName | - | en |
| dc.relation.projectFundingStream | H2020 | en |
| dc.subject.keywords | lexical semantic resources | - |
| dc.subject.keywords | sense alignment | - |
| dc.subject.keywords | lexicography | - |
| dc.subject.keywords | language resource | - |
| dc.subject.singlekeyword | lexical semantic resources | * |
| dc.subject.singlekeyword | sense alignment | * |
| dc.subject.singlekeyword | lexicography | * |
| dc.subject.singlekeyword | language resource | * |
| dc.title | A multilingual evaluation dataset for monolingual word sense alignment | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Contributo in convegno::04.01 Contributo in Atti di convegno | it |
| dc.type.miur | 273 | - |
| dc.type.referee | Yes, but type not specified | en |
| dc.ugov.descaux1 | 429354 | - |
| iris.mediafilter.data | 2025/04/06 03:08:30 | * |
| iris.orcid.lastModifiedDate | 2025/02/25 16:56:57 | * |
| iris.orcid.lastModifiedMillisecond | 1740499017194 | * |
| iris.scopus.extIssued | 2020 | - |
| iris.scopus.extTitle | A multilingual evaluation dataset for monolingual word sense alignment | - |
| iris.sitodocente.maxattempts | 2 | - |
| Appears in types: | 04.01 Contribution in conference proceedings | |

| File | Size | Format |
|---|---|---|
| prod_429354-doc_156902.pdf (open access); Description: LREC2020_WSalignment; Type: Publisher's version (PDF); License: Creative Commons | 685 kB | Adobe PDF |


