CNR Institutional Research Information System

L-LEME: an Automatic Lexical Merger based on the LMF Standard

Riccardo Del Gratta;Francesca Frontini;Monica Monachini;Valeria Quochi;Francesco Rubino;Matteo Abrate;Angelica Lo Duca

2012

Abstract

This paper describes the LMF LExical MErger (L-LEME), an architecture for combining two lexicons to obtain one or more new resources. L-LEME relies on standards, exploiting the ISO Lexical Markup Framework (LMF) to ensure interoperability. It is designed to be dynamic and highly adaptable: users can configure it to meet their specific needs. The architecture consists of two main modules, the Mapper and the Builder. The Mapper takes as input two lexicons, A and B, together with a set of user-defined rules and instructions that guide the mapping process (Directives D), and outputs all matching entries; the algorithm also calculates a cosine similarity score. The Builder takes as input the Mapper's results and a set of Directives D1, and produces a new LMF lexicon C. The Directives allow users to define their own building rules and different merging scenarios. L-LEME is applied to a concrete task within the PANACEA project: merging two Italian SubCategorization Frame (SCF) lexicons. The experiment is interesting because A and B are built on different philosophies: A was built by human introspection, whereas B was automatically extracted. Ultimately, L-LEME has interesting repercussions for many language technology applications.
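The two-module flow described in the abstract can be illustrated with a minimal Python sketch. It is not the L-LEME implementation: the abstract states only that the Mapper outputs matching entries with a cosine similarity score and that the Builder produces lexicon C under Directives D1. Everything else here is an assumption made for illustration, including the dictionary-based entries (real LMF lexicons are XML), the directive keys `compare`, `threshold` and `scenario`, the attribute names, and the rule that A's values win on conflict.

```python
import math
from collections import Counter


def features(entry, directives):
    """Bag of 'attribute=value' features, restricted to the attributes
    listed under the (hypothetical) 'compare' directive."""
    return Counter(f"{k}={v}" for k, v in entry.items()
                   if k in directives["compare"])


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mapper(lex_a, lex_b, directives):
    """Return (id_a, id_b, score) for each pair at or above the threshold."""
    matches = []
    for id_a, entry_a in lex_a.items():
        feats_a = features(entry_a, directives)
        for id_b, entry_b in lex_b.items():
            score = cosine(feats_a, features(entry_b, directives))
            if score >= directives["threshold"]:
                matches.append((id_a, id_b, score))
    return matches


def builder(lex_a, lex_b, matches, directives):
    """Build lexicon C from the matches. Assumed scenarios:
    'intersection' keeps merged pairs only; 'union' also keeps
    the unmatched entries of A and B."""
    lex_c, used_a, used_b = {}, set(), set()
    for id_a, id_b, _score in matches:
        # Assumed merge policy: A's values override B's on conflict.
        lex_c[f"{id_a}+{id_b}"] = {**lex_b[id_b], **lex_a[id_a]}
        used_a.add(id_a)
        used_b.add(id_b)
    if directives["scenario"] == "union":
        lex_c.update({k: v for k, v in lex_a.items() if k not in used_a})
        lex_c.update({k: v for k, v in lex_b.items() if k not in used_b})
    return lex_c


# Toy SCF-like entries; identifiers and attribute values are invented.
lex_a = {"A1": {"subj": "NP", "obj": "NP", "aux": "avere"},
         "A2": {"subj": "NP", "comp": "PP-a"}}
lex_b = {"B1": {"subj": "NP", "obj": "NP"},
         "B2": {"subj": "NP", "comp": "PP-di"}}

directives_d = {"compare": {"subj", "obj", "comp"}, "threshold": 0.9}
directives_d1 = {"scenario": "union"}

matches = mapper(lex_a, lex_b, directives_d)
lex_c = builder(lex_a, lex_b, matches, directives_d1)
```

With these toy entries only A1 and B1 share all compared features, so they form the single match and are merged into one entry of C, while A2 and B2 are carried over unchanged under the assumed `union` scenario. Switching `scenario` to `intersection` would keep the merged entry alone, which is one way to read the "different merging scenarios" the abstract mentions.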

Full record (DC)

DC field	Value	Language
dc.authority.orgunit	Istituto di informatica e telematica - IIT	-
dc.authority.orgunit	Istituto di linguistica computazionale "Antonio Zampolli" - ILC	-
dc.authority.people	Riccardo Del Gratta	it
dc.authority.people	Francesca Frontini	it
dc.authority.people	Monica Monachini	it
dc.authority.people	Valeria Quochi	it
dc.authority.people	Francesco Rubino	it
dc.authority.people	Matteo Abrate	it
dc.authority.people	Angelica Lo Duca	it
dc.authority.project	Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language Technologies	-
dc.collection.id.s	71c7200a-7c5f-4e83-8d57-d3d2ba88f40d	*
dc.collection.name	04.01 Contributo in Atti di convegno	*
dc.contributor.appartenenza	Istituto di informatica e telematica - IIT	*
dc.contributor.appartenenza	Istituto di linguistica computazionale "Antonio Zampolli" - ILC	*
dc.contributor.appartenenza.mi	912	*
dc.contributor.appartenenza.mi	918	*
dc.date.accessioned	2024/02/15 19:19:15	-
dc.date.available	2024/02/15 19:19:15	-
dc.date.issued	2012	-
dc.description.affiliations	CNR-ILC, Pisa, Italy; CNR-ILC, Pisa, Italy; CNR-ILC, Pisa, Italy; CNR-ILC, Pisa, Italy; CNR-ILC, Pisa, Italy; CNR-IIT, Pisa, Italy; CNR-IIT, Pisa, Italy	-
dc.description.allpeople	DEL GRATTA, Riccardo; Frontini, Francesca; Monachini, Monica; Quochi, Valeria; Rubino, Francesco; Abrate, Matteo; LO DUCA, Angelica	-
dc.description.allpeopleoriginal	Riccardo Del Gratta, Francesca Frontini, Monica Monachini, Valeria Quochi, Francesco Rubino, Matteo Abrate, Angelica Lo Duca	-
dc.description.fulltext	none	en
dc.description.note	ID_PUMA; /cnr.iit/2012-A2-035 cnr.iit/2012-A2-020	-
dc.description.numberofauthors	7	-
dc.identifier.isbn	978-2-9517408-7-7	-
dc.identifier.uri	https://hdl.handle.net/20.500.14243/117790	-
dc.language.iso	eng	-
dc.miur.last.status.update	2024-10-02T13:14:55Z	*
dc.relation.alleditors	Bel N., Gavrilidou M., Monachini M., Quochi V., Rimell L.	-
dc.relation.conferencedate	2012	-
dc.relation.conferencename	The Eighth International Conference on Language Resources and Evaluation (LREC) 2012	-
dc.relation.conferenceplace	Istanbul, Turkey	-
dc.relation.firstpage	31	-
dc.relation.ispartofbook	Proceedings of the LREC 2012 Workshop on Language Resource Merging	-
dc.relation.lastpage	40	-
dc.relation.numberofpages	10	-
dc.relation.projectAcronym	PANACEA	-
dc.relation.projectAwardNumber	248064	-
dc.relation.projectAwardTitle	Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language Technologies	-
dc.relation.projectFunderName	-	en
dc.relation.projectFundingStream	FP7	-
dc.subject.keywords	LMF	-
dc.subject.keywords	Lexicon mapping	-
dc.subject.keywords	similarity score	-
dc.subject.singlekeyword	LMF	*
dc.subject.singlekeyword	Lexicon mapping	*
dc.subject.singlekeyword	similarity score	*
dc.title	L-LEME: an Automatic Lexical Merger based on the LMF Standard	en
dc.type.driver	info:eu-repo/semantics/conferenceObject	-
dc.type.full	04 Contributo in convegno::04.01 Contributo in Atti di convegno	it
dc.type.miur	273	-
dc.type.referee	Yes, but type not specified	-
dc.ugov.descaux1	223098	-
iris.orcid.lastModifiedDate	2024/04/04 14:24:38	*
iris.orcid.lastModifiedMillisecond	1712233478051	*
iris.sitodocente.maxattempts	1	-
Appears in types:	04.01 Contributo in Atti di convegno

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/117790
