L-LEME: an Automatic Lexical Merger based on the LMF Standard

Riccardo Del Gratta; Francesca Frontini; Monica Monachini; Valeria Quochi; Francesco Rubino; Matteo Abrate; Angelica Lo Duca
2012

Abstract

This paper describes the LMF LExical MErger (L-LEME), an architecture for combining two lexicons to obtain one or more new resources. L-LEME relies on standards, exploiting the ISO Lexical Markup Framework (LMF) to ensure interoperability. It is designed to be dynamic and highly adaptable: users can configure it to meet their specific needs. The L-LEME architecture comprises two main modules, the Mapper and the Builder. The Mapper takes as input two lexicons, A and B, together with a set of user-defined rules and instructions that guide the mapping process (Directives D), and outputs all matching entries; it also calculates a cosine similarity score. The Builder takes as input the Mapper's results and a set of Directives D1, and produces a new LMF lexicon C. The Directives allow users to define their own building rules and different merging scenarios. L-LEME is applied to a concrete task within the PANACEA project, namely the merging of two Italian SubCategorization Frame (SCF) lexicons. The experiment is interesting in that A and B rest on different philosophies: A was built by human introspection, whereas B was automatically extracted. Ultimately, L-LEME has repercussions for many language technology applications.
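
The two-module flow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: lexicons are plain dictionaries rather than LMF XML, frames are tuples of invented feature strings, the Directives are reduced to a similarity `threshold` and a `scenario` name, and the function names `map_lexicons` and `build_lexicon` are hypothetical. The abstract does not say which features the cosine score is computed over, so the sketch simply treats each frame as a bag of features.

```python
import math
from collections import Counter


def cosine_similarity(features_a, features_b):
    """Cosine of the angle between two bags of feature strings."""
    va, vb = Counter(features_a), Counter(features_b)
    dot = sum(va[f] * vb[f] for f in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def map_lexicons(lex_a, lex_b, threshold=0.5):
    """Hypothetical Mapper: pair frames of lemmas present in both lexicons.

    `threshold` stands in for the Directives D (an assumption).
    Returns (lemma, frame_a, frame_b, score) tuples.
    """
    matches = []
    for lemma in sorted(lex_a.keys() & lex_b.keys()):
        for frame_a in lex_a[lemma]:
            for frame_b in lex_b[lemma]:
                score = cosine_similarity(frame_a, frame_b)
                if score >= threshold:
                    matches.append((lemma, frame_a, frame_b, score))
    return matches


def build_lexicon(lex_a, lex_b, matches, scenario="union"):
    """Hypothetical Builder: `scenario` stands in for the Directives D1."""
    if scenario == "intersection":
        # Keep only the frames of A that were matched to a frame of B.
        merged = {}
        for lemma, frame_a, _, _ in matches:
            frames = merged.setdefault(lemma, [])
            if frame_a not in frames:
                frames.append(frame_a)
        return merged
    if scenario != "union":
        raise ValueError(f"unknown scenario: {scenario}")
    # Union: every frame of A, plus the frames of B that matched nothing.
    matched_b = {(lemma, frame_b) for lemma, _, frame_b, _ in matches}
    merged = {lemma: list(frames) for lemma, frames in lex_a.items()}
    for lemma, frames in lex_b.items():
        for frame_b in frames:
            if (lemma, frame_b) not in matched_b:
                merged.setdefault(lemma, []).append(frame_b)
    return merged


# Invented toy entries, for illustration only.
lexicon_a = {"mangiare": [("subj:NP", "obj:NP")], "dormire": [("subj:NP",)]}
lexicon_b = {
    "mangiare": [("subj:NP", "obj:NP"), ("subj:NP", "comp:PP")],
    "correre": [("subj:NP",)],
}
matches = map_lexicons(lexicon_a, lexicon_b, threshold=0.9)
lexicon_c = build_lexicon(lexicon_a, lexicon_b, matches, scenario="union")
```

Separating the mapping step from the building step, as the abstract describes, means the same set of matches can be reused under different merging scenarios (here `"union"` and `"intersection"`) without recomputing the similarity scores.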
DC Field Value Language
dc.authority.orgunit Istituto di informatica e telematica - IIT -
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC -
dc.authority.people Riccardo Del Gratta it
dc.authority.people Francesca Frontini it
dc.authority.people Monica Monachini it
dc.authority.people Valeria Quochi it
dc.authority.people Francesco Rubino it
dc.authority.people Matteo Abrate it
dc.authority.people Angelica Lo Duca it
dc.authority.project Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language Technologies -
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contributo in Atti di convegno *
dc.contributor.appartenenza Istituto di informatica e telematica - IIT *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 912 *
dc.contributor.appartenenza.mi 918 *
dc.date.accessioned 2024/02/15 19:19:15 -
dc.date.available 2024/02/15 19:19:15 -
dc.date.issued 2012 -
dc.description.affiliations CNR-ILC, Pisa, Italy; CNR-ILC, Pisa, Italy; CNR-ILC, Pisa, Italy; CNR-ILC, Pisa, Italy; CNR-ILC, Pisa, Italy; CNR-IIT, Pisa, Italy; CNR-IIT, Pisa, Italy -
dc.description.allpeople DEL GRATTA, Riccardo; Frontini, Francesca; Monachini, Monica; Quochi, Valeria; Rubino, Francesco; Abrate, Matteo; LO DUCA, Angelica -
dc.description.allpeopleoriginal Riccardo Del Gratta, Francesca Frontini, Monica Monachini, Valeria Quochi, Francesco Rubino, Matteo Abrate, Angelica Lo Duca -
dc.description.fulltext none en
dc.description.note ID_PUMA; /cnr.iit/2012-A2-035 cnr.iit/2012-A2-020 -
dc.description.numberofauthors 7 -
dc.identifier.isbn 978-2-9517408-7-7 -
dc.identifier.uri https://hdl.handle.net/20.500.14243/117790 -
dc.language.iso eng -
dc.miur.last.status.update 2024-10-02T13:14:55Z *
dc.relation.alleditors Bel N., Gavrilidou M., Monachini M., Quochi V., Rimell L. -
dc.relation.conferencedate 2012 -
dc.relation.conferencename The Eighth International Conference on Language Resources and Evaluation (LREC) 2012 -
dc.relation.conferenceplace Istanbul, Turkey -
dc.relation.firstpage 31 -
dc.relation.ispartofbook Proceedings of the LREC 2012 Workshop on Language Resource Merging -
dc.relation.lastpage 40 -
dc.relation.numberofpages 10 -
dc.relation.projectAcronym PANACEA -
dc.relation.projectAwardNumber 248064 -
dc.relation.projectAwardTitle Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language Technologies -
dc.relation.projectFunderName - en
dc.relation.projectFundingStream FP7 -
dc.subject.keywords LMF -
dc.subject.keywords Lexicon mapping -
dc.subject.keywords similarity score -
dc.subject.singlekeyword LMF *
dc.subject.singlekeyword Lexicon mapping *
dc.subject.singlekeyword similarity score *
dc.title L-LEME: an Automatic Lexical Merger based on the LMF Standard en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Contributo in convegno::04.01 Contributo in Atti di convegno it
dc.type.miur 273 -
dc.type.referee Yes, but type not specified -
dc.ugov.descaux1 223098 -
iris.orcid.lastModifiedDate 2024/04/04 14:24:38 *
iris.orcid.lastModifiedMillisecond 1712233478051 *
iris.sitodocente.maxattempts 1 -
Appears in types: 04.01 Contributo in Atti di convegno (contribution in conference proceedings)

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/117790