A MWE Acquisition and Lexicon Builder Web Service
Quochi Valeria; Frontini Francesca
2012
Abstract
This paper describes the development of a web-service tool for the automatic extraction of multi-word expression (MWE) lexicons, which has been integrated into a distributed platform for the automatic creation of linguistic resources. The main purpose of the work is to provide a computationally "light" tool that produces a full lexical resource: multi-word terms/items with relevant and useful attached information that can be used for more complex processing tasks and applications (e.g. parsing, MT, IE, query expansion). The output of the tool is a multi-word lexicon formatted and encoded in XML according to the Lexical Markup Framework (LMF). The tool is already functional and available as a service. Evaluation experiments show that the tool's precision is about 80%.

| DC field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | - |
| dc.authority.people | Quochi Valeria | it |
| dc.authority.people | Frontini Francesca | it |
| dc.authority.people | Rubino Francesco | it |
| dc.authority.project | Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language Technologies | - |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Conference proceedings contribution | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/02/16 15:54:00 | - |
| dc.date.available | 2024/02/16 15:54:00 | - |
| dc.date.issued | 2012 | - |
| dc.description.affiliations | CNR-ILC, Pisa | - |
| dc.description.allpeople | Quochi, Valeria; Frontini, Francesca; Rubino, Francesco | - |
| dc.description.allpeopleoriginal | Quochi, Valeria; Frontini, Francesca; Rubino, Francesco | - |
| dc.description.fulltext | none | en |
| dc.description.note | ID_PUMA: /cnr.ilc/2012-A3-007. Proceedings volume made available by The COLING 2012 Organizing Committee, Indian Institute of Technology Bombay, at https://aclanthology.org/volumes/C12-1/ (Creative Commons Attribution 4.0 International License) | - |
| dc.description.numberofauthors | 2 | - |
| dc.identifier.isbn | 9781627483896 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/128266 | - |
| dc.identifier.url | http://aclweb.org/anthology/C/C12/C12-1140.pdf | - |
| dc.language.iso | eng | - |
| dc.publisher.country | USA | - |
| dc.publisher.name | Curran Associates | - |
| dc.publisher.place | Red Hook, NY 12571 | - |
| dc.relation.alleditors | Martin Kay and Christian Boitet | - |
| dc.relation.conferencedate | December 2012 | - |
| dc.relation.conferencename | International Conference on Computational Linguistics (COLING) | - |
| dc.relation.conferenceplace | Mumbai, India | - |
| dc.relation.firstpage | 2291 | - |
| dc.relation.ispartofbook | Proceedings of COLING 2012: Technical Papers | - |
| dc.relation.lastpage | 2306 | - |
| dc.relation.numberofpages | 16 | - |
| dc.relation.projectAcronym | PANACEA | - |
| dc.relation.projectAwardNumber | 248064 | - |
| dc.relation.projectAwardTitle | Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language Technologies | - |
| dc.relation.projectFunderName | - | en |
| dc.relation.projectFundingStream | FP7 | - |
| dc.subject.keywords | Multiword extraction | - |
| dc.subject.keywords | lexical resources | - |
| dc.subject.keywords | LMF | - |
| dc.subject.keywords | web services | - |
| dc.subject.singlekeyword | Multiword extraction | * |
| dc.subject.singlekeyword | lexical resources | * |
| dc.subject.singlekeyword | LMF | * |
| dc.subject.singlekeyword | web services | * |
| dc.title | A MWE Acquisition and Lexicon Builder Web Service | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Conference contribution::04.01 Conference proceedings contribution | it |
| dc.type.miur | 273 | - |
| dc.type.referee | Yes, but type not specified | - |
| dc.ugov.descaux1 | 220778 | - |
| dc.ugov.descaux2 | Creative Commons Attribution 4.0 International License | - |
| iris.orcid.lastModifiedDate | 2024/02/22 21:17:27 | * |
| iris.orcid.lastModifiedMillisecond | 1708633047732 | * |
| iris.scopus.extIssued | 2012 | - |
| iris.scopus.extTitle | A MWE acquisition and lexicon builder web service | - |
| iris.sitodocente.maxattempts | 1 | - |
| Appears in types: | 04.01 Conference proceedings contribution | |


