Modelling Collocations in OntoLex-FrAC

2022

Abstract

Following presentations of frequency and attestations, and of embeddings and distributional similarity, this paper introduces the third cornerstone of the emerging OntoLex module for Frequency, Attestation and Corpus-based Information, OntoLex-FrAC. We provide an RDF vocabulary for collocations, established as a consensus over contributions from five different institutions and numerous data sets. Our goal is to elicit feedback from reviewers, the workshop audience and the scientific community in preparation for the final consolidation of the OntoLex-FrAC module, whose publication as a W3C community report is foreseen for the end of this year. We describe the novel collocation component of OntoLex-FrAC in application to a lexicographic resource and to corpus-based collocation scores available from the web. Finally, we demonstrate the capability and genericity of the model by showing how to retrieve and aggregate collocation information by means of SPARQL and how to export it to a tabular format, so that it can be easily processed in downstream applications.
DC Field Value Language
dc.authority.people Christian Chiarcos it
dc.authority.people Katerina Gkirtzou it
dc.authority.people Maxim Ionov it
dc.authority.people Besim Kabashi it
dc.authority.people Fahad Khan it
dc.authority.people Ciprian-Octavian Truică it
dc.collection.name 04.01 Contribution in conference proceedings *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.date.accessioned 2024/02/20 20:49:53 -
dc.date.available 2024/02/20 20:49:53 -
dc.date.issued 2022 -
dc.description.affiliations Applied Computational Linguistics, Goethe University Frankfurt, Frankfurt am Main, Germany; Institute for Digital Humanities, University of Cologne, Germany; Institute of Language and Speech Processing, Athena Research Center, Athens, Greece; Computational and Corpus Linguistics, Friedrich-Alexander University of Erlangen-Nuremberg, Germany; Istituto di Linguistica Computazionale A. Zampolli, Consiglio Nazionale delle Ricerche, Italy; Department of Information Technology, Uppsala University, Sweden -
dc.description.allpeople Christian Chiarcos; Katerina Gkirtzou; Maxim Ionov; Besim Kabashi; Fahad Khan; Ciprian-Octavian Truică -
dc.description.allpeopleoriginal Christian Chiarcos, Katerina Gkirtzou, Maxim Ionov, Besim Kabashi, Fahad Khan, Ciprian-Octavian Truică -
dc.description.fulltext none en
dc.description.numberofauthors 1 -
dc.identifier.isbn 979-10-95546-92-4 -
dc.identifier.uri https://hdl.handle.net/20.500.14243/444084 -
dc.identifier.url https://aclanthology.org/2022.gwll-1.3.pdf -
dc.language.iso eng -
dc.relation.conferencedate 20/06/2022 -
dc.relation.conferencename Proceedings of the Globalex Workshop on Linked Lexicography @LREC2022 -
dc.subject.keywords lexical resources -
dc.subject.keywords standards -
dc.subject.keywords OntoLex -
dc.subject.keywords collocation analysis -
dc.title Modelling Collocations in OntoLex-FrAC en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Conference contribution::04.01 Contribution in conference proceedings it
Appears in types: 04.01 Contribution in conference proceedings

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/444084