Modelling Collocations in OntoLex-FrAC
2022
Abstract
Following presentations of frequency and attestations, and embeddings and distributional similarity, this paper introduces the third cornerstone of the emerging OntoLex module for Frequency, Attestation and Corpus-based Information, OntoLex-FrAC. We provide an RDF vocabulary for collocations, established as a consensus over contributions from five different institutions and numerous data sets, with the goal of eliciting feedback from reviewers, workshop audience and the scientific community in preparation of the final consolidation of the OntoLex-FrAC module, whose publication as a W3C community report is foreseen for the end of this year. The novel collocation component of OntoLex-FrAC is described in application to a lexicographic resource and corpus-based collocation scores available from the web, and finally, we demonstrate the capability and genericity of the model by showing how to retrieve and aggregate collocation information by means of SPARQL, and its export to a tabular format, so that it can be easily processed in downstream applications.

| DC field | Value | Language |
|---|---|---|
| dc.authority.people | Christian Chiarcos | it |
| dc.authority.people | Katerina Gkirtzou | it |
| dc.authority.people | Maxim Ionov | it |
| dc.authority.people | Besim Kabashi | it |
| dc.authority.people | Fahad Khan | it |
| dc.authority.people | Ciprian-Octavian Truică | it |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contribution in conference proceedings | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/02/20 20:49:53 | - |
| dc.date.available | 2024/02/20 20:49:53 | - |
| dc.date.issued | 2022 | - |
| dc.description.abstracteng | Following presentations of frequency and attestations, and embeddings and distributional similarity, this paper introduces the third cornerstone of the emerging OntoLex module for Frequency, Attestation and Corpus-based Information, OntoLex-FrAC. We provide an RDF vocabulary for collocations, established as a consensus over contributions from five different institutions and numerous data sets, with the goal of eliciting feedback from reviewers, workshop audience and the scientific community in preparation of the final consolidation of the OntoLex-FrAC module, whose publication as a W3C community report is foreseen for the end of this year. The novel collocation component of OntoLex-FrAC is described in application to a lexicographic resource and corpus-based collocation scores available from the web, and finally, we demonstrate the capability and genericity of the model by showing how to retrieve and aggregate collocation information by means of SPARQL, and its export to a tabular format, so that it can be easily processed in downstream applications. | - |
| dc.description.affiliations | Applied Computational Linguistics, Goethe University Frankfurt, Frankfurt am Main, Germany, Institute for Digital Humanities, University of Cologne, Germany, Institute of Language and Speech Processing, Athena Research Center, Athens, Greece, Computational and Corpus Linguistics, Friedrich-Alexander University of Erlangen-Nuremberg, Germany, Istituto di Linguistica Computazionale A. Zampolli, Consiglio Nazionale delle Ricerche, Italy, Department of Information Technology, Uppsala University, Sweden | - |
| dc.description.allpeople | Christian Chiarcos; Katerina Gkirtzou; Maxim Ionov; Besim Kabashi; Fahad Khan; Ciprian-Octavian Truică | - |
| dc.description.allpeopleoriginal | Christian Chiarcos, Katerina Gkirtzou, Maxim Ionov, Besim Kabashi, Fahad Khan, Ciprian-Octavian Truică | - |
| dc.description.fulltext | none | en |
| dc.description.numberofauthors | 1 | - |
| dc.identifier.isbn | 979-10-95546-92-4 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/444084 | - |
| dc.identifier.url | https://aclanthology.org/2022.gwll-1.3.pdf | - |
| dc.language.iso | eng | - |
| dc.relation.conferencedate | 20/06/2022 | - |
| dc.relation.conferencename | Proceedings of the Globalex Workshop on Linked Lexicography @LREC2022 | - |
| dc.subject.keywords | lexical resources | - |
| dc.subject.keywords | standards | - |
| dc.subject.keywords | OntoLex | - |
| dc.subject.keywords | collocation analysis | - |
| dc.subject.singlekeyword | lexical resources | * |
| dc.subject.singlekeyword | standards | * |
| dc.subject.singlekeyword | OntoLex | * |
| dc.subject.singlekeyword | collocation analysis | * |
| dc.title | Modelling Collocations in OntoLex-FrAC | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Conference contribution::04.01 Contribution in conference proceedings | it |
| dc.type.miur | 273 | - |
| dc.ugov.descaux1 | 472136 | - |
| iris.orcid.lastModifiedDate | 2024/03/02 05:20:28 | * |
| iris.orcid.lastModifiedMillisecond | 1709353228563 | * |
| iris.sitodocente.maxattempts | 1 | - |
| Appears in types: | 04.01 Contribution in conference proceedings | |


