Retrieving taxa names from large biodiversity data collections using a flexible matching workflow

Pagano P
2015

Abstract

In biological classification, several taxon name matching services can search for a species' scientific name in a large collection of taxonomic names. Many of these services are available online, and many others run on the computers of individual scientists. Although these systems may work very well, most share a limitation: the list of names used as a reference and the criteria for deciding on a match are hard-coded in the engine that performs the name matching. In this paper we present BiOnym, a taxon name matching system that separates the reference name lists, the search criteria and the matching engine. The user is offered a choice of several taxonomic reference lists, including the option to upload their own list to the system. Furthermore, BiOnym is a flexible workflow that embeds and combines techniques based on lexical matching algorithms as well as expert knowledge. It is also an open platform that allows developers to contribute new techniques. We demonstrate the benefits of this approach in terms of the efficiency and effectiveness of the information retrieval process compared with other solutions.
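The separation described in the abstract (reference list, matching criteria and engine kept apart) can be illustrated with a minimal sketch. This is not BiOnym's implementation: the function names `exact_matcher`, `fuzzy_matcher` and `run_chain`, the 0.8 threshold, the sample reference list and the use of Python's `difflib` similarity ratio as the lexical criterion are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical sketch: each matcher is a function (query, names) -> [(name, score)].
# The reference list is passed in as data, so it can be swapped or user-supplied.

def exact_matcher(query, names):
    """Case-insensitive exact match; every hit gets score 1.0."""
    q = query.strip().lower()
    return [(n, 1.0) for n in names if n.lower() == q]

def fuzzy_matcher(threshold):
    """Build a lexical matcher that keeps names with similarity >= threshold."""
    def match(query, names):
        q = query.strip().lower()
        scored = [(n, SequenceMatcher(None, q, n.lower()).ratio()) for n in names]
        hits = [(n, s) for n, s in scored if s >= threshold]
        return sorted(hits, key=lambda pair: pair[1], reverse=True)
    return match

def run_chain(query, names, matchers):
    """Engine: try each matcher in order and return the first non-empty result."""
    for matcher in matchers:
        hits = matcher(query, names)
        if hits:
            return hits
    return []

# Assumed sample data: a tiny reference list and a two-step chain.
reference = ["Gadus morhua", "Thunnus thynnus", "Salmo salar"]
chain = [exact_matcher, fuzzy_matcher(0.8)]

print(run_chain("gadus morhua", reference, chain))   # resolved by the exact step
print(run_chain("Gadus morhau", reference, chain))   # misspelling, falls through to the fuzzy step
```

Because the engine only iterates over whatever matchers and names it is given, a new technique or a different reference list can be plugged in without touching `run_chain`, which is the design property the abstract emphasizes.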
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Taxon names matching
Taxonomic Authority File
Taxon Name Parsing
Name Matcher Chain
Taxonomy
Taxonomic nomenclature
Digital Libraries
Files in this record:
prod_331989-doc_102655.pdf

open access

Description: Preprint - Retrieving taxa names from large biodiversity data collections using a flexible matching workflow
Type: Publisher's version (PDF)
Size: 689.21 kB
Format: Adobe PDF
prod_331989-doc_107782.pdf

authorized users only

Description: Retrieving taxa names from large biodiversity data collections using a flexible matching workflow
Type: Publisher's version (PDF)
Size: 695.99 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/298071