Research in information science and scholarly communication strongly relies on the availability of openly accessible datasets of scholarly entities metadata and, where possible, their relative payloads. Since such metadata information is scattered across diverse, freely accessible, online resources (e.g. Crossref, ORCID), researchers in this domain are doomed to struggle with (meta)data integration problems, in order to produce custom datasets of often undocumented and rather obscure provenance. This practice leads to waste of time, duplication of efforts, and typically infringes open science best practices of transparency and reproducibility of science. In this article, we describe how to generate DOIBoost, a metadata collection that enriches Crossref with inputs from Microsoft Academic Graph, ORCID, and Unpaywall for the purpose of supporting high-quality and robust research experiments, saving times to researchers and enabling their comparison. To this end, we describe the dataset value and its schema, analyse its actual content, and share the software Toolkit and experimental workflow required to reproduce it. The DOIBoost dataset and Software Toolkit are made openly available via Zenodo.org. DOIBoost will become an input source to the OpenAIRE information graph.

OpenAIRE's DOIBoost - Boosting Crossref for Research

La Bruzzo S;Manghi P;Mannocci A
2019

Abstract

Research in information science and scholarly communication strongly relies on the availability of openly accessible datasets of scholarly entities metadata and, where possible, their relative payloads. Since such metadata information is scattered across diverse, freely accessible, online resources (e.g. Crossref, ORCID), researchers in this domain are doomed to struggle with (meta)data integration problems, in order to produce custom datasets of often undocumented and rather obscure provenance. This practice leads to waste of time, duplication of efforts, and typically infringes open science best practices of transparency and reproducibility of science. In this article, we describe how to generate DOIBoost, a metadata collection that enriches Crossref with inputs from Microsoft Academic Graph, ORCID, and Unpaywall for the purpose of supporting high-quality and robust research experiments, saving times to researchers and enabling their comparison. To this end, we describe the dataset value and its schema, analyse its actual content, and share the software Toolkit and experimental workflow required to reproduce it. The DOIBoost dataset and Software Toolkit are made openly available via Zenodo.org. DOIBoost will become an input source to the OpenAIRE information graph.
2019
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
978-3-030-11225-7
Crossref
Data integration
Data science
Microsoft academic graph
Open science
ORCID
Scholarly communication
Unpaywall
File in questo prodotto:
File Dimensione Formato  
prod_402418-doc_139942.pdf

solo utenti autorizzati

Descrizione: OpenAIRE's DOIBoost - Boosting Crossref for Research
Tipologia: Versione Editoriale (PDF)
Dimensione 697.12 kB
Formato Adobe PDF
697.12 kB Adobe PDF   Visualizza/Apri   Richiedi una copia
prod_402418-doc_139943.pdf

accesso aperto

Descrizione: OpenAIRE's DOIBoost - Boosting Crossref for Research
Tipologia: Versione Editoriale (PDF)
Dimensione 425.76 kB
Formato Adobe PDF
425.76 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/392495
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 13
  • ???jsp.display-item.citation.isi??? ND
social impact