De-duplicating the OpenAIRE scholarly communication big graph

Atzori C; Manghi P; Bardi A
2018

Abstract

The OpenAIRE infrastructure populates a scholarly communication big graph interlinking metadata objects of publications, datasets, software, organizations, funders, and projects. To de-duplicate this graph, OpenAIRE has developed GDup, an integrated, scalable, general-purpose system for entity deduplication over big information graphs. GDup offers the functionality needed to realize a fully fledged entity deduplication workflow over a generic input graph, including Ground Truth support, end-user feedback, and strategies for identifying and merging duplicates to obtain a disambiguated output graph.
2018
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
English
e-science 2018 - 14th IEEE International Conference on e-Science (e-Science)
372
373
2
978-1-5386-9156-4
https://ieeexplore.ieee.org/document/8588723
Yes, but type not specified
29 October - 01 November 2018
Amsterdam, the Netherlands
Deduplication
Graph
Big data
Scholarly communication
OpenAIRE
Article number 8588723
3
partially_open
Atzori, C; Manghi, P; Bardi, A
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings
   Open Access Infrastructure for Research in Europe 2020
   OpenAIRE2020
   H2020
   643410

   OpenAIRE Advancing Open Scholarship
   OpenAIRE-Advance
   H2020
   777541
Files in this product:
File Size Format
prod_402402-doc_139936.pdf

authorized users only

Description: De-duplicating the OpenAIRE Scholarly Communication Big Graph
Type: Publisher's version (PDF)
Size 113.65 kB
Format Adobe PDF
prod_402402-doc_139937.pdf

open access

Description: De-duplicating the OpenAIRE Scholarly Communication Big Graph (poster presentation)
Type: Publisher's version (PDF)
Size 449.16 kB
Format Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/392479