A lexicon for biology and bioinformatics: the BOOTStrep experience
Quochi V; Monachini M; Del Gratta R; Calzolari N
2008
Abstract
This paper describes the design, implementation and population of a lexical resource for biology and bioinformatics (the BioLexicon) developed within an ongoing European project. The aim of this project is text-based knowledge harvesting for support to information extraction and text mining in the biomedical domain. The BioLexicon is a large-scale lexical-terminological resource encoding different information types in one single integrated resource. In the design of the resource we follow the ISO/DIS 24613 "Lexical Mark-up Framework" standard, which ensures reusability of the information encoded and easy exchange of both data and architecture. The design of the resource also takes into account the needs of our text mining partners who automatically extract syntactic and semantic information from texts and feed it to the lexicon. The present contribution first describes in detail the model of the BioLexicon along its three main layers: morphology, syntax and semantics; then, it briefly describes the database implementation of the model and the population strategy followed within the project, together with an example. The BioLexicon database in fact comes equipped with automatic uploading procedures based on a common exchange XML format, which guarantees that the lexicon can be properly populated with data coming from different sources.

| DC field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | en |
| dc.authority.people | Quochi V | en |
| dc.authority.people | Monachini M | en |
| dc.authority.people | Del Gratta R | en |
| dc.authority.people | Calzolari N | en |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contributo in Atti di convegno | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/02/19 19:38:11 | - |
| dc.date.available | 2024/02/19 19:38:11 | - |
| dc.date.firstsubmission | 2024/10/02 15:55:20 | * |
| dc.date.issued | 2008 | - |
| dc.date.submission | 2024/12/06 16:43:49 | * |
| dc.description.affiliations | Istituto di Linguistica Computazionale "A. Zampolli" | - |
| dc.description.allpeople | Quochi, V; Monachini, M; Del Gratta, R; Calzolari, N | - |
| dc.description.allpeopleoriginal | Quochi V.; Monachini M.; Del Gratta R.; Calzolari N. | en |
| dc.description.fulltext | open | en |
| dc.description.numberofauthors | 4 | - |
| dc.identifier.isbn | 2-9517408-4-0 | en |
| dc.identifier.isi | WOS:000324028902062 | en |
| dc.identifier.scopus | 2-s2.0-84874250555 | en |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/65076 | - |
| dc.identifier.url | http://www.lrec-conf.org/proceedings/lrec2008/pdf/576_paper.pdf | en |
| dc.language.iso | eng | en |
| dc.miur.last.status.update | 2024-10-02T13:51:20Z | * |
| dc.publisher.country | FRA | en |
| dc.publisher.name | European Language Resources Association ELRA | en |
| dc.publisher.place | Paris | en |
| dc.relation.conferencedate | 26 May to 1 June 2008 | en |
| dc.relation.conferencename | LREC 2008, Sixth International Conference on Language Resources and Evaluation | en |
| dc.relation.conferenceplace | Marrakech, Morocco | en |
| dc.relation.firstpage | 2285 | en |
| dc.relation.ispartofbook | LREC 2008, Sixth International Conference on Language Resources and Evaluation | en |
| dc.relation.lastpage | 2292 | en |
| dc.relation.numberofpages | 8 | en |
| dc.subject.keywordseng | Lexicon | - |
| dc.subject.keywordseng | Ontologies | - |
| dc.subject.keywordseng | Lexical database | - |
| dc.subject.singlekeyword | Lexicon | * |
| dc.subject.singlekeyword | Ontologies | * |
| dc.subject.singlekeyword | Lexical database | * |
| dc.title | A lexicon for biology and bioinformatics: the BOOTStrep experience | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Contributo in convegno::04.01 Contributo in Atti di convegno | it |
| dc.type.miur | 273 | - |
| dc.type.referee | Yes, but type not specified | en |
| dc.ugov.descaux1 | 84700 | - |
| iris.isi.metadataErrorDescription | 0 | - |
| iris.isi.metadataErrorType | ERROR_NO_MATCH | - |
| iris.isi.metadataStatus | ERROR | - |
| iris.mediafilter.data | 2025/04/02 00:20:29 | * |
| iris.orcid.lastModifiedDate | 2024/12/16 17:20:51 | * |
| iris.orcid.lastModifiedMillisecond | 1734366051218 | * |
| iris.scopus.extIssued | 2008 | - |
| iris.scopus.extTitle | A Lexicon for biology and bioinformatics: The BOOTStrep experience | - |
| iris.scopus.ideLinkStatusDate | 2024/04/10 09:22:16 | * |
| iris.scopus.ideLinkStatusMillisecond | 1712733736255 | * |
| iris.sitodocente.maxattempts | 1 | - |
| scopus.category | 1203 | * |
| scopus.category | 3304 | * |
| scopus.category | 3310 | * |
| scopus.category | 3309 | * |
| scopus.contributor.affiliation | CNR | - |
| scopus.contributor.affiliation | CNR | - |
| scopus.contributor.affiliation | CNR | - |
| scopus.contributor.affiliation | CNR | - |
| scopus.contributor.afid | 60008941 | - |
| scopus.contributor.afid | 60008941 | - |
| scopus.contributor.afid | 60008941 | - |
| scopus.contributor.afid | 60008941 | - |
| scopus.contributor.auid | 34977412400 | - |
| scopus.contributor.auid | 23397766600 | - |
| scopus.contributor.auid | 34976432900 | - |
| scopus.contributor.auid | 8845912500 | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.dptid | | - |
| scopus.contributor.dptid | | - |
| scopus.contributor.dptid | | - |
| scopus.contributor.dptid | | - |
| scopus.contributor.name | Valeria | - |
| scopus.contributor.name | Monica | - |
| scopus.contributor.name | Riccardo | - |
| scopus.contributor.name | Nicoletta | - |
| scopus.contributor.subaffiliation | Istituto di Linguistica Computazionale; | - |
| scopus.contributor.subaffiliation | Istituto di Linguistica Computazionale; | - |
| scopus.contributor.subaffiliation | Istituto di Linguistica Computazionale; | - |
| scopus.contributor.subaffiliation | Istituto di Linguistica Computazionale; | - |
| scopus.contributor.surname | Quochi | - |
| scopus.contributor.surname | Monachini | - |
| scopus.contributor.surname | Del Gratta | - |
| scopus.contributor.surname | Calzolari | - |
| scopus.date.issued | 2008 | * |
| scopus.description.allpeopleoriginal | Quochi V.; Monachini M.; Del Gratta R.; Calzolari N. | * |
| scopus.differences | scopus.relation.conferencename | * |
| scopus.differences | scopus.publisher.name | * |
| scopus.differences | scopus.relation.conferencedate | * |
| scopus.differences | scopus.identifier.isbn | * |
| scopus.differences | scopus.relation.conferenceplace | * |
| scopus.document.type | cp | * |
| scopus.document.types | cp | * |
| scopus.funding.funders | 501100000780 - European Commission; | * |
| scopus.funding.ids | FP6-028099; | * |
| scopus.identifier.isbn | 9782951740846 | * |
| scopus.identifier.pui | 619617295 | * |
| scopus.identifier.scopus | 2-s2.0-84874250555 | * |
| scopus.journal.sourceid | 21100842264 | * |
| scopus.language.iso | eng | * |
| scopus.publisher.name | European Language Resources Association (ELRA) | * |
| scopus.relation.conferencedate | 2008 | * |
| scopus.relation.conferencename | 6th International Conference on Language Resources and Evaluation, LREC 2008 | * |
| scopus.relation.conferenceplace | Palais des Congres Mansour Eddahbi, mar | * |
| scopus.relation.firstpage | 2285 | * |
| scopus.relation.lastpage | 2292 | * |
| scopus.title | A Lexicon for biology and bioinformatics: The BOOTStrep experience | * |
| scopus.titleeng | A Lexicon for biology and bioinformatics: The BOOTStrep experience | * |
| Appears in types: | 04.01 Contributo in Atti di convegno (contribution in conference proceedings) | |

| File | Size | Format |
|---|---|---|
| prod_84700-doc_85050.pdf (open access). Description: A lexicon for biology and bioinformatics: the BOOTStrep experience. License: Creative Commons | 485.13 kB | Adobe PDF |


