OpenAIRE's DOIBoost - Boosting Crossref for Research

La Bruzzo S;Manghi P;Mannocci A
2019

Abstract

Research in information science and scholarly communication relies heavily on openly accessible datasets of metadata about scholarly entities and, where possible, their associated payloads. Because this metadata is scattered across diverse, freely accessible online resources (e.g. Crossref, ORCID), researchers in this domain must struggle with (meta)data integration problems in order to produce custom datasets, often of undocumented and rather obscure provenance. This practice wastes time, duplicates effort, and typically violates the open science best practices of transparency and reproducibility. In this article, we describe how to generate DOIBoost, a metadata collection that enriches Crossref with inputs from Microsoft Academic Graph, ORCID, and Unpaywall, with the purpose of supporting high-quality and robust research experiments, saving researchers time, and enabling comparison of their experiments. To this end, we describe the dataset's value and its schema, analyse its actual content, and share the software Toolkit and experimental workflow required to reproduce it. The DOIBoost dataset and Software Toolkit are made openly available via Zenodo.org. DOIBoost will become an input source to the OpenAIRE information graph.
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
English
Manghi P., Candela L., Silvello G. (eds)
Digital Libraries: Supporting Open Science
IRCDL 2019 - Italian Research Conference on Digital Libraries
133
143
978-3-030-11225-7
https://link.springer.com/chapter/10.1007%2F978-3-030-11226-4_11#citeas
Springer
Cham, Heidelberg, New York, Dordrecht, London
Switzerland
Yes, but type not specified
31/01/2019, 01/02/2019
Pisa, Italy
Crossref
Data integration
Data science
Microsoft Academic Graph
Open science
ORCID
Scholarly communication
Unpaywall
First Online 15 January 2019
3
partially_open
La Bruzzo S.; Manghi P.; Mannocci A.
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings
   OpenAIRE Advancing Open Scholarship
   OpenAIRE-Advance
   H2020
   777541
Files in this item:
File Size Format
prod_402418-doc_139942.pdf

authorized users only

Description: OpenAIRE's DOIBoost - Boosting Crossref for Research
Type: Publisher's version (PDF)
Size: 697.12 kB
Format: Adobe PDF
prod_402418-doc_139943.pdf

open access

Description: OpenAIRE's DOIBoost - Boosting Crossref for Research
Type: Publisher's version (PDF)
Size: 425.76 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/392495
Citations
  • Scopus 13