De-duplicating the OpenAIRE scholarly communication big graph

Atzori C; Manghi P; Bardi A
2018

Abstract

The OpenAIRE infrastructure populates a scholarly communication big graph interlinking metadata objects of publications, datasets, software, organizations, funders, and projects. To de-duplicate this graph, OpenAIRE has developed GDup, an integrated, scalable, general-purpose system for entity deduplication over big information graphs. GDup offers the functionality needed to realize a fully fledged entity deduplication workflow over a generic input graph, including Ground Truth support, end-user feedback, and strategies for identifying and merging duplicates to produce a disambiguated output graph.
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
978-1-5386-9156-4
Deduplication
Graph
Big data
Scholarly communication
OpenAIRE
Files in this record:

prod_402402-doc_139936.pdf
Description: De-duplicating the OpenAIRE Scholarly Communication Big Graph
Type: Publisher's version (PDF)
Access: authorized users only
Size: 113.65 kB
Format: Adobe PDF

prod_402402-doc_139937.pdf
Description: De-duplicating the OpenAIRE Scholarly Communication Big Graph (poster presentation)
Type: Publisher's version (PDF)
Access: open access
Size: 449.16 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/392479