MultiLexBATS: Multilingual Dataset of Lexical Semantic Relations

Francesca Frontini; Fahad Khan
2024

Abstract

Understanding the relation between the meanings of words is an important part of comprehending natural language. Prior work has focused either on analysing lexical semantic relations in word embeddings or on probing pretrained language models (PLMs), with some exceptions. Given the rarity of highly multilingual benchmarks, it is unclear to what extent PLMs capture relational knowledge and are able to transfer it across languages. To start addressing this question, we propose MultiLexBATS, a multilingual parallel dataset of lexical semantic relations adapted from BATS in 15 languages, including low-resource languages such as Bambara, Lithuanian, and Albanian. As an experiment on cross-lingual transfer of relational knowledge, we test the PLMs' ability to (1) capture analogies across languages, and (2) predict translation targets. We find considerable differences across relation types and languages, with a clear preference for hypernymy and antonymy as well as Romance languages.
DC Field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.people Dagmar Gromann en
dc.authority.people Hugo Goncalo Oliveira en
dc.authority.people Lucia Pitarch en
dc.authority.people Elena-Simona Apostol en
dc.authority.people Jordi Bernad en
dc.authority.people Eliot Bytyçi en
dc.authority.people Chiara Cantone en
dc.authority.people Sara Carvalho en
dc.authority.people Francesca Frontini en
dc.authority.people Radovan Garabik en
dc.authority.people Jorge Gracia en
dc.authority.people Letizia Granata en
dc.authority.people Fahad Khan en
dc.authority.people Timotej Knez en
dc.authority.people Penny Labropoulou en
dc.authority.people Chaya Liebeskind en
dc.authority.people Maria Pia Di Buono en
dc.authority.people Ana Ostroški Anić en
dc.authority.people Sigita Rackevičienė en
dc.authority.people Ricardo Rodrigues en
dc.authority.people Gilles Sérasset en
dc.authority.people Linas Selmistraitis en
dc.authority.people Mahammadou Sidibé en
dc.authority.people Purificação Silvano en
dc.authority.people Blerina Spahiu en
dc.authority.people Enriketa Sogutlu en
dc.authority.people Ranka Stanković en
dc.authority.people Ciprian-Octavian Truică en
dc.authority.people Giedre Valunaite Oleskeviciene en
dc.authority.people Slavko Zitnik en
dc.authority.people Katerina Zdravkova en
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contributo in Atti di convegno *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.date.accessioned 2024/08/05 16:46:44 -
dc.date.available 2024/08/05 16:46:44 -
dc.date.firstsubmission 2024/06/16 11:59:44 *
dc.date.issued 2024 -
dc.date.submission 2024/06/16 11:59:44 *
dc.description.abstracteng Understanding the relation between the meanings of words is an important part of comprehending natural language. Prior work has focused either on analysing lexical semantic relations in word embeddings or on probing pretrained language models (PLMs), with some exceptions. Given the rarity of highly multilingual benchmarks, it is unclear to what extent PLMs capture relational knowledge and are able to transfer it across languages. To start addressing this question, we propose MultiLexBATS, a multilingual parallel dataset of lexical semantic relations adapted from BATS in 15 languages, including low-resource languages such as Bambara, Lithuanian, and Albanian. As an experiment on cross-lingual transfer of relational knowledge, we test the PLMs' ability to (1) capture analogies across languages, and (2) predict translation targets. We find considerable differences across relation types and languages, with a clear preference for hypernymy and antonymy as well as Romance languages. -
dc.description.allpeople Gromann, Dagmar; Goncalo Oliveira, Hugo; Pitarch, Lucia; Apostol, Elena-Simona; Bernad, Jordi; Bytyçi, Eliot; Cantone, Chiara; Carvalho, Sara; Frontini, Francesca; Garabik, Radovan; Gracia, Jorge; Granata, Letizia; Khan, Fahad; Knez, Timotej; Labropoulou, Penny; Liebeskind, Chaya; Di Buono, Maria Pia; Ostroški Anić, Ana; Rackevičienė, Sigita; Rodrigues, Ricardo; Sérasset, Gilles; Selmistraitis, Linas; Sidibé, Mahammadou; Silvano, Purificação; Spahiu, Blerina; Sogutlu, Enriketa; Stanković, Ranka; Truică, Ciprian-Octavian; Valunaite Oleskeviciene, Giedre; Zitnik, Slavko; Zdravkova, Katerina -
dc.description.allpeopleoriginal Dagmar Gromann, Hugo Goncalo Oliveira, Lucia Pitarch, Elena-Simona Apostol, Jordi Bernad, Eliot Bytyçi, Chiara Cantone, Sara Carvalho, Francesca Frontini, Radovan Garabik, Jorge Gracia, Letizia Granata, Fahad Khan, Timotej Knez, Penny Labropoulou, Chaya Liebeskind, Maria Pia Di Buono, Ana Ostroški Anić, Sigita Rackevičienė, Ricardo Rodrigues, Gilles Sérasset, Linas Selmistraitis, Mahammadou Sidibé, Purificação Silvano, Blerina Spahiu, Enriketa Sogutlu, Ranka Stanković, Ciprian-Octavian Truică, Giedre Valunaite Oleskeviciene, Slavko Zitnik, Katerina Zdravkova en
dc.description.fulltext open en
dc.description.numberofauthors 31 -
dc.identifier.source bibtex *
dc.identifier.uri https://hdl.handle.net/20.500.14243/475921 -
dc.identifier.url https://aclanthology.org/2024.lrec-main.1029 en
dc.language.iso eng en
dc.publisher.name ELRA and ICCL en
dc.relation.allauthors Calzolari, Nicoletta and Kan, Min-Yen and Hoste, Veronique and Lenci, Alessandro and Sakti, Sakriani and Xue, Nianwen en
dc.relation.conferencedate 20-24/05/2024 en
dc.relation.conferenceplace Torino en
dc.relation.firstpage 11783 en
dc.relation.ispartofbook Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) en
dc.relation.lastpage 11793 en
dc.relation.numberofpages 11 en
dc.subject.keywordseng Lexical Semantic Relations -
dc.subject.keywordseng Multilingual Benchmark -
dc.subject.keywordseng BATS -
dc.subject.singlekeyword Lexical Semantic Relations *
dc.subject.singlekeyword Multilingual Benchmark *
dc.subject.singlekeyword BATS *
dc.title MultiLexBATS: Multilingual Dataset of Lexical Semantic Relations en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Contributo in convegno::04.01 Contributo in Atti di convegno it
dc.type.miur 273 -
iris.mediafilter.data 2025/04/16 03:57:44 *
iris.orcid.lastModifiedDate 2024/12/06 18:43:06 *
iris.orcid.lastModifiedMillisecond 1733506986723 *
iris.sitodocente.maxattempts 1 -
Appears in collections: 04.01 Contributo in Atti di convegno
Files in this item:
File: 2024.lrec-main.1029.pdf
Access: open access
Type: Publisher's version (PDF)
License: Creative Commons
Size: 316.97 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/475921