L-LEME: an Automatic Lexical Merger based on the LMF Standard

Riccardo Del Gratta;Francesca Frontini;Monica Monachini;Valeria Quochi;Matteo Abrate;Angelica Lo Duca
2012

Abstract

This paper describes the LMF LExical MErger (L-LEME), an architecture for combining two lexicons to obtain one or more new resources. L-LEME relies on standards, exploiting the ISO Lexical Markup Framework (LMF) to ensure interoperability. It is meant to be dynamic and highly adaptable: users can configure it to meet their specific needs. The L-LEME architecture is composed of two main modules, the Mapper and the Builder. The Mapper takes as input two lexicons, A and B, together with a set of user-defined rules and instructions that guide the mapping process (Directives D), and outputs all matching entries; the algorithm also calculates a cosine similarity score. The Builder takes as input the Mapper's results and a set of Directives D1, and produces a new LMF lexicon C. The Directives allow users to define their own building rules and different merging scenarios. L-LEME is applied to a concrete task within the PANACEA project, namely the merging of two Italian SubCategorization Frame (SCF) lexicons. The experiment is interesting in that A and B rest on different philosophies: A was built by human introspection, whereas B was automatically extracted. Ultimately, L-LEME has interesting repercussions for many language technology applications.
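The two-module flow summarized in the abstract can be illustrated with a minimal Python sketch. This is not the L-LEME implementation: the lexicon layout (a dict from lemma to a list of frames), the function names `mapper` and `builder`, the directive keys `features`, `threshold` and `scenario`, and the rule that A's values win on conflict are all assumptions made for illustration. Only the overall shape (two lexicons plus directives in, scored matches out, then a merged lexicon C) follows the abstract.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def vectorize(frame, features):
    """Binary sparse vector over the (feature, value) pairs named in the directives."""
    return {(f, frame[f]): 1.0 for f in features if f in frame}


def mapper(lex_a, lex_b, directives):
    """Hypothetical Mapper: score frame pairs of shared lemmas, keep those above threshold.

    Returns tuples (lemma, index_in_a, index_in_b, score).
    """
    feats, threshold = directives["features"], directives["threshold"]
    matches = []
    for lemma in sorted(lex_a.keys() & lex_b.keys()):
        for i, frame_a in enumerate(lex_a[lemma]):
            for j, frame_b in enumerate(lex_b[lemma]):
                score = cosine(vectorize(frame_a, feats), vectorize(frame_b, feats))
                if score >= threshold:
                    matches.append((lemma, i, j, score))
    return matches


def builder(lex_a, lex_b, matches, directives):
    """Hypothetical Builder: merge matched frames into a new lexicon C.

    Scenario "intersection" keeps only matched frames; "union" also copies
    the unmatched frames of both input lexicons.
    """
    lex_c, used_a, used_b = {}, set(), set()
    for lemma, i, j, _score in matches:
        merged = {**lex_b[lemma][j], **lex_a[lemma][i]}  # assumption: A wins on conflict
        lex_c.setdefault(lemma, []).append(merged)
        used_a.add((lemma, i))
        used_b.add((lemma, j))
    if directives["scenario"] == "union":
        for lex, used in ((lex_a, used_a), (lex_b, used_b)):
            for lemma, frames in lex.items():
                for k, frame in enumerate(frames):
                    if (lemma, k) not in used:
                        lex_c.setdefault(lemma, []).append(dict(frame))
    return lex_c


# Toy data, invented for this example only.
lex_a = {"dare": [{"subj": "NP", "obj": "NP", "iobj": "PP_a"}]}
lex_b = {
    "dare": [
        {"subj": "NP", "obj": "NP", "iobj": "PP_a", "freq": 0.4},
        {"subj": "NP", "obj": "NP"},
    ],
    "andare": [{"subj": "NP"}],
}
d = {"features": ["subj", "obj", "iobj"], "threshold": 0.9}
matches = mapper(lex_a, lex_b, d)
lex_c = builder(lex_a, lex_b, matches, {"scenario": "union"})
```

In this sketch the mapping directives decide which features are compared and how strict the match is, while the building directives select the merging scenario, which mirrors the separation between Directives D and D1 in the abstract.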
2012
Istituto di informatica e telematica - IIT
Istituto di linguistica computazionale "Antonio Zampolli" - ILC
English
Bel N., Gavrilidou M., Monachini M., Quochi V., Rimell L.
Proceedings of the LREC 2012 Workshop on Language Resource Merging
The Eighth International Conference on Language Resources and Evaluation (LREC) 2012
pp. 31-40 (10 pages)
ISBN 978-2-9517408-7-7
Yes, but type not specified
2012
Istanbul, Turkey
LMF
Lexicon mapping
similarity score
ID_PUMA; /cnr.iit/2012-A2-035 cnr.iit/2012-A2-020
7
none
DEL GRATTA, Riccardo; Frontini, Francesca; Monachini, Monica; Quochi, Valeria; Rubino, Francesco; Abrate, Matteo; LO DUCA, Angelica
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings
   Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language Technologies
   PANACEA
   FP7
   248064

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/117790