Using LMF to Shape a Lexicon for the Biomedical Domain

Monachini M; Quochi V; Del Gratta R; Calzolari N
2008

Abstract

This paper describes the design, implementation and population of the BioLexicon within BootStrep, an FP6 project. The BioLexicon (BL) is a lexical resource designed for text mining in the bio-domain. It has been conceived to meet both domain requirements and upcoming ISO standards for lexical representation. The data model and data categories are compliant with the ISO Lexical Markup Framework and the Data Category Registry. The BioLexicon integrates features of lexicons and terminologies: term entries (and variants) derived from existing resources are enriched with linguistic features extracted from texts, including sub-categorization and predicate-argument information. It is therefore an extendable resource. Furthermore, the lexical entries will be aligned to concepts in the BioOntology, the ontological resource of the project. The BL is implemented as an extensible relational database with automatic population procedures. Population relies on a dedicated input data structure that allows terms and their linguistic properties to be uploaded and then "pulled and pushed" into the database. The BioLexicon shows that the state of the art is mature enough to aim at setting up a standard in this domain. Because it conforms to lexical standards, the BioLexicon is interoperable and portable to other areas.
DC Field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.people Monachini M en
dc.authority.people Quochi V en
dc.authority.people Del Gratta R en
dc.authority.people Calzolari N en
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Conference proceedings contribution *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.date.accessioned 2024/02/19 19:48:39 -
dc.date.available 2024/02/19 19:48:39 -
dc.date.firstsubmission 2024/10/02 16:01:08 *
dc.date.issued 2008 -
dc.date.submission 2024/10/02 16:01:08 *
dc.description.affiliations ILC-CNR -
dc.description.allpeople Monachini, M; Quochi, V; Del Gratta, R; Calzolari, N -
dc.description.allpeopleoriginal Monachini M.; Quochi V.; Del Gratta R.; Calzolari N. en
dc.description.fulltext none en
dc.description.numberofauthors 4 -
dc.identifier.uri https://hdl.handle.net/20.500.14243/65105 -
dc.language.iso eng en
dc.miur.last.status.update 2024-10-02T13:57:38Z *
dc.relation.alleditors C. Delogu; M. Falcone (eds.) en
dc.relation.conferencedate 28-29 February 2008 en
dc.relation.conferencename LangTech 2008 - Tecnologia applicata alla linguistica en
dc.relation.conferenceplace Rome en
dc.relation.firstpage 153 en
dc.relation.ispartofbook LangTech 2008 - Tecnologia applicata alla linguistica en
dc.relation.lastpage 157 en
dc.relation.numberofpages 5 en
dc.subject.keywords Domain terminologies -
dc.subject.keywords Computational lexicons -
dc.subject.keywords Lexical standards -
dc.subject.keywords Lexical architectures -
dc.subject.singlekeyword Domain terminologies *
dc.subject.singlekeyword Computational lexicons *
dc.subject.singlekeyword Lexical standards *
dc.subject.singlekeyword Lexical architectures *
dc.title Using LMF to Shape a Lexicon for the Biomedical Domain en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Conference contribution::04.01 Conference proceedings contribution it
dc.type.miur 273 -
dc.type.referee Yes, but type not specified en
dc.ugov.descaux1 84731 -
Appears in types: 04.01 Conference proceedings contribution

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/65105