An Extract, Transform, Load foundation for Biobank Data Interoperability
Antonella Cruoglio;Federica Rossi;Davide Fragnito;Ramona Palombo;Alice Massacci;Massimiliano Borsani;Claudia Miele;Matteo Pallocca
2025
Abstract
Federated Search makes data accessible while complying with privacy-preserving regulations. One of the strategic objectives of the BBMRI-ERIC biobanking infrastructure is to make high-quality samples findable via Federated Search. To join the Federated Network, a biobank must convert its local database into a Common Data Model and set up a server node in which the database is loaded and made accessible to external queries. Data conversion is often the most critical step for institutions, as many of them lack the technical expertise needed to improve the FAIRness of their data. The conversion is achieved through a local Extraction, Transformation and Loading (ETL) process, with data usually extracted from a Biobank Information Management System. We present a framework for converting minimal information datasets into HL7-FHIR transaction bundles, providing basic biobank interoperability and enabling biobanks to connect to the BBMRI-ERIC European Federated Platform. The toolkit consists of several Python modules that create JSON files ready to be uploaded to an internal FHIR server connected to the federated network, enabling data sharing and query execution. The tool has been successfully integrated in three BBMRI.it biobanks, allowing them to share their data correctly. More generally, the tool will promote data harmonization and standardization among research infrastructures by integrating the current pipeline into local information systems. The framework is available at https://github.com/bbdataeng/a-small-fire.
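To make the transformation step concrete, the following is a minimal sketch of how flat records might be turned into a FHIR transaction bundle serialized as JSON. It is not the code of the framework itself: the column names (`donor_id`, `sex`, `sample_id`, `material`), the function names and the sample data are assumptions made for illustration, and the resources carry only a few generic FHIR fields rather than any BBMRI-ERIC profile.

```python
import csv
import io
import json


def row_to_entries(row):
    """Map one flat record to a Patient entry and a Specimen entry.

    The column names used here are hypothetical, not those of the framework.
    """
    patient = {
        "resourceType": "Patient",
        "id": row["donor_id"],
        "gender": row["sex"],
    }
    specimen = {
        "resourceType": "Specimen",
        "id": row["sample_id"],
        # Link the sample back to its donor.
        "subject": {"reference": f"Patient/{row['donor_id']}"},
        "type": {"text": row["material"]},
    }
    # PUT with an explicit id makes re-uploading the same bundle an update
    # rather than a duplicate creation.
    return [
        {
            "resource": res,
            "request": {"method": "PUT", "url": f"{res['resourceType']}/{res['id']}"},
        }
        for res in (patient, specimen)
    ]


def build_transaction_bundle(rows):
    """Collect all entries into a single Bundle of type 'transaction'."""
    entries = []
    seen = set()
    for row in rows:
        for entry in row_to_entries(row):
            url = entry["request"]["url"]
            # A donor with several samples should yield only one Patient entry.
            if url not in seen:
                seen.add(url)
                entries.append(entry)
    return {"resourceType": "Bundle", "type": "transaction", "entry": entries}


# Illustrative input standing in for an export from a local information system.
SAMPLE_CSV = """donor_id,sex,sample_id,material
D1,female,S1,whole blood
D1,female,S2,plasma
D2,male,S3,serum
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
bundle = build_transaction_bundle(rows)
bundle_json = json.dumps(bundle, indent=2)
```

In FHIR, a transaction bundle like this is submitted with an HTTP POST to the base URL of the server, which processes all entries as a single unit. A real conversion would also need to map local vocabularies to the coded values expected by the target profiles, which this sketch leaves out.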


