From Data Model to Interoperable Metadata: Implementing the H2IOSC Resource Description Schema across a Multi-Infrastructure Pipeline
Pietro Sichera (first author); Cristina Marras (co-last author); Enrico Pasini (co-last author)
2026
Abstract
This paper describes the implementation of the H2IOSC resource description schema across a complete metadata pipeline, from user input to OAI-PMH protocol output. The H2IOSC project federates four European research infrastructures (OPERAS, E-RIHS, CLARIN, DARIAH) and requires a unified data model that can describe heterogeneous scholarly resources (tools, datasets, publications, projects, services, terminology resources, learning resources, and workflows) while preserving infrastructure-specific semantics. We present the data model design, its concrete realization across four system layers (HTML form, JSON storage, native OAI-PMH XML, Dublin Core XML), and the mapping strategies employed at each transformation step. The model supports multilingual metadata, controlled vocabularies, structured identity management with role-based actor associations, and a property system that enforces type-specific constraints. We document the design decisions, the trade-offs between normalization and protocol compliance, and the validation results. The implementation is operational and serves metadata from multiple infrastructures through a validated OAI-PMH 2.0 endpoint.
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


