Bootstrapping a Verb Lexicon for Biomedical Information Extraction

Venturi G.; Montemagni S.; Marchi S.; Sasaki Y.; Thompson P.; McNaught J.; Ananiadou S.
2009

Abstract

The extraction of information from texts requires resources that contain both syntactic and semantic properties of lexical units. As the use of language in specialized domains, such as biology, can be very different from that in the general domain, there is a need for domain-specific resources to ensure that the information extracted is as accurate as possible. We are building a large-scale lexical resource for the biology domain, providing information about predicate-argument structure that has been bootstrapped from a biomedical corpus on the subject of E. coli. The lexicon is currently focussed on verbs, and includes both automatically extracted syntactic subcategorization frames and semantic event frames that are based on annotation by domain experts. In addition, the lexicon contains manually added explicit links between semantic and syntactic slots in corresponding frames. To our knowledge, this lexicon currently represents a unique resource within the biomedical domain.
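
As a rough illustration of the three kinds of information the abstract mentions (syntactic subcategorization frames, semantic event frames, and explicit links between their slots), the following Python sketch shows one way such an entry could be modelled. It is not the format of the lexicon itself: all class names, field names, slot and role labels, and the example verb are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SyntacticFrame:
    """A subcategorization frame: the grammatical slots a verb takes."""
    slots: List[str]


@dataclass
class EventFrame:
    """A semantic event frame: the roles that take part in the event."""
    roles: List[str]


@dataclass
class VerbEntry:
    """One verb with a syntactic frame, a semantic frame, and slot-to-role links."""
    lemma: str
    syntactic: SyntacticFrame
    semantic: EventFrame
    links: Dict[str, str] = field(default_factory=dict)

    def link(self, slot: str, role: str) -> None:
        """Record an explicit link, rejecting slots or roles the frames do not define."""
        if slot not in self.syntactic.slots:
            raise ValueError(f"unknown syntactic slot: {slot}")
        if role not in self.semantic.roles:
            raise ValueError(f"unknown semantic role: {role}")
        self.links[slot] = role

    def role_for(self, slot: str) -> Optional[str]:
        """Return the semantic role linked to a syntactic slot, or None if unlinked."""
        return self.links.get(slot)


# Invented example values (assumption): not taken from the actual lexicon.
entry = VerbEntry(
    lemma="activate",
    syntactic=SyntacticFrame(slots=["subject", "object"]),
    semantic=EventFrame(roles=["AGENT", "THEME"]),
)
entry.link("subject", "AGENT")
entry.link("object", "THEME")
```

With an entry like this, `entry.role_for("object")` looks up the role linked to the object slot. Keeping the links in a separate mapping, rather than inside either frame, mirrors the abstract's description of links being added manually on top of independently derived syntactic and semantic frames.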
DC field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC -
dc.authority.people Venturi G it
dc.authority.people Montemagni S it
dc.authority.people Marchi S it
dc.authority.people Sasaki Y it
dc.authority.people Thompson P it
dc.authority.people McNaught J it
dc.authority.people Ananiadou S it
dc.authority.project Bootstrapping Of Ontologies and Terminologies STrategic REsearch Project -
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contributo in Atti di convegno *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.date.accessioned 2024/02/19 19:49:59 -
dc.date.available 2024/02/19 19:49:59 -
dc.date.issued 2009 -
dc.description.affiliations CNR-ILC, Pisa, National Centre for Text Mining, University of Manchester, UK -
dc.description.allpeople Venturi G.; Montemagni S.; Marchi S.; Sasaki Y.; Thompson P.; McNaught J.; Ananiadou S. -
dc.description.fulltext none en
dc.description.numberofauthors 3 -
dc.identifier.doi 10.1007/978-3-642-00382-0_11 -
dc.identifier.isbn 9783642003813 -
dc.identifier.uri https://hdl.handle.net/20.500.14243/65110 -
dc.language.iso eng -
dc.publisher.country DEU -
dc.publisher.name Springer-Verlag -
dc.publisher.place Berlin Heidelberg -
dc.relation.conferencedate 1-7/03/2009 -
dc.relation.conferencename 10th International Conference on Intelligent Text Processing and Computational Linguistics -
dc.relation.conferenceplace Mexico City, Mexico -
dc.relation.firstpage 137 -
dc.relation.lastpage 148 -
dc.relation.numberofpages 12 -
dc.relation.projectAcronym BOOTSTREP -
dc.relation.projectAwardNumber 028099 -
dc.relation.projectAwardTitle Bootstrapping Of Ontologies and Terminologies STrategic REsearch Project -
dc.relation.projectFunderName - en
dc.relation.projectFundingStream FP6 -
dc.subject.keywords domain-specific lexical resources -
dc.subject.keywords Biological Language Processing -
dc.subject.keywords syntax-semantic linking -
dc.title Bootstrapping a Verb Lexicon for Biomedical Information Extraction en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Contributo in convegno::04.01 Contributo in Atti di convegno it
dc.type.miur 273 -
dc.type.referee Yes, but type not specified -
dc.ugov.descaux1 84736 -
Appears in types: 04.01 Contributo in Atti di convegno (contribution in conference proceedings)

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/65110