MultiLexBATS: Multilingual Dataset of Lexical Semantic Relations
Francesca Frontini; Fahad Khan
2024
Abstract
Understanding the relation between the meanings of words is an important part of comprehending natural language. Prior work has, with some exceptions, focused either on analysing lexical semantic relations in word embeddings or on probing pretrained language models (PLMs). Given the rarity of highly multilingual benchmarks, it is unclear to what extent PLMs capture relational knowledge and are able to transfer it across languages. To start addressing this question, we propose MultiLexBATS, a multilingual parallel dataset of lexical semantic relations adapted from BATS in 15 languages, including low-resource languages such as Bambara, Lithuanian, and Albanian. As an experiment on cross-lingual transfer of relational knowledge, we test the PLMs' ability to (1) capture analogies across languages, and (2) predict translation targets. We find considerable differences across relation types and languages, with a clear preference for hypernymy and antonymy as well as Romance languages.

| DC Field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | en |
| dc.authority.people | Dagmar Gromann | en |
| dc.authority.people | Hugo Goncalo Oliveira | en |
| dc.authority.people | Lucia Pitarch | en |
| dc.authority.people | Elena-Simona Apostol | en |
| dc.authority.people | Jordi Bernad | en |
| dc.authority.people | Eliot Bytyçi | en |
| dc.authority.people | Chiara Cantone | en |
| dc.authority.people | Sara Carvalho | en |
| dc.authority.people | Francesca Frontini | en |
| dc.authority.people | Radovan Garabik | en |
| dc.authority.people | Jorge Gracia | en |
| dc.authority.people | Letizia Granata | en |
| dc.authority.people | Fahad Khan | en |
| dc.authority.people | Timotej Knez | en |
| dc.authority.people | Penny Labropoulou | en |
| dc.authority.people | Chaya Liebeskind | en |
| dc.authority.people | Maria Pia Di Buono | en |
| dc.authority.people | Ana Ostroški Anić | en |
| dc.authority.people | Sigita Rackevičienė | en |
| dc.authority.people | Ricardo Rodrigues | en |
| dc.authority.people | Gilles Sérasset | en |
| dc.authority.people | Linas Selmistraitis | en |
| dc.authority.people | Mahammadou Sidibé | en |
| dc.authority.people | Purificação Silvano | en |
| dc.authority.people | Blerina Spahiu | en |
| dc.authority.people | Enriketa Sogutlu | en |
| dc.authority.people | Ranka Stanković | en |
| dc.authority.people | Ciprian-Octavian Truică | en |
| dc.authority.people | Giedre Valunaite Oleskeviciene | en |
| dc.authority.people | Slavko Zitnik | en |
| dc.authority.people | Katerina Zdravkova | en |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contributo in Atti di convegno | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/08/05 16:46:44 | - |
| dc.date.available | 2024/08/05 16:46:44 | - |
| dc.date.firstsubmission | 2024/06/16 11:59:44 | * |
| dc.date.issued | 2024 | - |
| dc.date.submission | 2024/06/16 11:59:44 | * |
| dc.description.abstracteng | Understanding the relation between the meanings of words is an important part of comprehending natural language. Prior work has, with some exceptions, focused either on analysing lexical semantic relations in word embeddings or on probing pretrained language models (PLMs). Given the rarity of highly multilingual benchmarks, it is unclear to what extent PLMs capture relational knowledge and are able to transfer it across languages. To start addressing this question, we propose MultiLexBATS, a multilingual parallel dataset of lexical semantic relations adapted from BATS in 15 languages, including low-resource languages such as Bambara, Lithuanian, and Albanian. As an experiment on cross-lingual transfer of relational knowledge, we test the PLMs' ability to (1) capture analogies across languages, and (2) predict translation targets. We find considerable differences across relation types and languages, with a clear preference for hypernymy and antonymy as well as Romance languages. | - |
| dc.description.allpeople | Gromann, Dagmar; Goncalo Oliveira, Hugo; Pitarch, Lucia; Apostol, Elena-Simona; Bernad, Jordi; Bytyçi, Eliot; Cantone, Chiara; Carvalho, Sara; Frontini, Francesca; Garabik, Radovan; Gracia, Jorge; Granata, Letizia; Khan, Fahad; Knez, Timotej; Labropoulou, Penny; Liebeskind, Chaya; Pia Di Buono, Maria; Ostroški Anić, Ana; Rackevičienė, Sigita; Rodrigues, Ricardo; Sérasset, Gilles; Selmistraitis, Linas; Sidibé, Mahammadou; Silvano, Purificação; Spahiu, Blerina; Sogutlu, Enriketa; Stanković, Ranka; Truică, Ciprian-Octavian; Valunaite Oleskeviciene, Giedre; Zitnik, Slavko; Zdravkova, Katerina | - |
| dc.description.allpeopleoriginal | Dagmar Gromann, Hugo Goncalo Oliveira, Lucia Pitarch, Elena-Simona Apostol, Jordi Bernad, Eliot Bytyçi, Chiara Cantone, Sara Carvalho, Francesca Frontini, Radovan Garabik, Jorge Gracia, Letizia Granata, Fahad Khan, Timotej Knez, Penny Labropoulou, Chaya Liebeskind, Maria Pia Di Buono, Ana Ostroški Anić, Sigita Rackevičienė, Ricardo Rodrigues, Gilles Sérasset, Linas Selmistraitis, Mahammadou Sidibé, Purificação Silvano, Blerina Spahiu, Enriketa Sogutlu, Ranka Stanković, Ciprian-Octavian Truică, Giedre Valunaite Oleskeviciene, Slavko Zitnik, Katerina Zdravkova | en |
| dc.description.fulltext | open | en |
| dc.description.numberofauthors | 31 | - |
| dc.identifier.source | bibtex | * |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/475921 | - |
| dc.identifier.url | https://aclanthology.org/2024.lrec-main.1029 | en |
| dc.language.iso | eng | en |
| dc.publisher.name | ELRA and ICCL | en |
| dc.relation.allauthors | Calzolari, Nicoletta and Kan, Min-Yen and Hoste, Veronique and Lenci, Alessandro and Sakti, Sakriani and Xue, Nianwen | en |
| dc.relation.conferencedate | 20-24/05/2024 | en |
| dc.relation.conferenceplace | Torino | en |
| dc.relation.firstpage | 11783 | en |
| dc.relation.ispartofbook | Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) | en |
| dc.relation.lastpage | 11793 | en |
| dc.relation.numberofpages | 11 | en |
| dc.subject.keywordseng | Lexical Semantic Relations | - |
| dc.subject.keywordseng | Multilingual Benchmark | - |
| dc.subject.keywordseng | BATS | - |
| dc.subject.singlekeyword | Lexical Semantic Relations | * |
| dc.subject.singlekeyword | Multilingual Benchmark | * |
| dc.subject.singlekeyword | BATS | * |
| dc.title | MultiLexBATS: Multilingual Dataset of Lexical Semantic Relations | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Contributo in convegno::04.01 Contributo in Atti di convegno | it |
| dc.type.miur | 273 | - |
| iris.mediafilter.data | 2025/04/16 03:57:44 | * |
| iris.orcid.lastModifiedDate | 2024/12/06 18:43:06 | * |
| iris.orcid.lastModifiedMillisecond | 1733506986723 | * |
| iris.sitodocente.maxattempts | 1 | - |
| Appears in types: | 04.01 Contributo in Atti di convegno | |

| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| 2024.lrec-main.1029.pdf | Open access | Publisher's version (PDF) | Creative Commons | 316.97 kB | Adobe PDF |

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


