A Multi-Infrastructure OAI-PMH Endpoint for Research Infrastructure Orchestration in the H2IOSC Project

Pietro Sichera (first author); Cristina Marras (co-last author); Enrico Pasini (co-last author)
2026

Abstract

This paper presents the design and implementation of a multi-infrastructure OAI-PMH 2.0 endpoint developed within the H2IOSC project, an Italian initiative aimed at strengthening the digital infrastructure for Social Sciences and Humanities (SSH). The system enables multiple European research infrastructures – including OPERAS, E-RIHS, and CLARIN – to expose their resource metadata through a unified harvesting interface, supporting both Dublin Core and a custom native metadata format. We describe the complete system architecture, which comprises a web-based data entry application, a secured backend API, a Git-based data repository, and a fully compliant OAI-PMH server. A distinctive contribution of this work is the detailed description of the data lifecycle: how a single metadata field travels from the user interface, through the JSON data layer, to its final representation in the OAI-PMH XML response. The system has been validated against the official OpenArchives OAI-PMH validator and is currently in production use. The architecture is designed for replicability and can be adopted by other projects requiring multi-tenant metadata harvesting capabilities.
Istituto per il Lessico Intellettuale Europeo e Storia delle Idee - ILIESI
OAI-PMH, H2IOSC, OPERAS, OPERAS-IT, research infrastructure, metadata harvesting, infrastructure orchestration, SSH, Social Sciences and Humanities, Dublin Core, interoperability, Flask, Python
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/579556