Deliverable D4.1 - Workflow for retrieval and harmonisation of legacy data

Alessandro Oggioni; Martina Zilioli; Paolo Tagliolato; Carmela Marangi; Antonello Provenzale; Alice Baronetti; Angelica Parisi
2022

Abstract

Progress in understanding, managing, and securing current and future ecosystem functions and services is challenged by fragmented and dispersed ecosystem research. Because the topic is often approached from narrow disciplinary perspectives, a holistic understanding of complex ecological and socio-ecological systems is hampered. The emerging European Long-Term Ecosystem, critical zone and socio-ecological systems Research Infrastructure (eLTER) aims to overcome this challenge in the ecosystem and biodiversity domain, thereby closing a gap in the European Research Infrastructure (RI) landscape. With its concept of 'Information Clusters', eLTER aims to provide a framework that lowers the barrier to information access and exchange. The main idea behind the concept is to simplify the harvesting and user uptake of data from multiple information sources and to facilitate their integration with eLTER data by making use of existing services, such as Copernicus or statistical information. The selection of sources and of the content of relevant data layers is the result of an internal discussion in which the Research Challenges (RC) play the main role by identifying the current requirements of environmental research and the ensuing demand for external data. The overarching framework of the eLTER Standard Observations (SOs) informs this process. To implement the 'Information Clusters', three types of data source have been identified to complement eLTER observations and analysis: (a) in-situ legacy and third-party data, (b) data from official statistics, and (c) remote sensing data and products. The activities described in the report focus on the collection and exemplary retrieval of relevant in-situ legacy data, which we identified as complementary data sources that could play an important role in the planned eLTER data analysis workflows.
Such retrieval is relevant for (a) obtaining additional data for analysis or visualisation, (b) retrieving data from eLTER sites that are provided through national-level catalogues, and (c) retrieving data from eLTER sites that are provided to other relevant RIs or monitoring networks. The aim of task 4.1 was to develop and test workflows for access to, and basic harmonisation of, relevant in-situ data sources at global, continental and national scale. We focused on the data requirements defined by the RC addressed in the eLTER PLUS project and on the needs for supporting the implementation of the data flows defined by the eLTER SOs. We identified 176 legacy and third-party data sources, each of which could be assigned to an eLTER SO and which together sufficiently cover every component of the Ecological Integrity concept. Based on a generic workflow described in the report, we used demonstrators to test exemplary data extraction workflows that are relevant in the project context. These demonstrators focused on: (a) retrieving biodiversity occurrence data through API access, (b) retrieving harmonised site gas flux observation data through downloads, (c) retrieving E-OBS historic data (Copernicus Climate Change Service, 2020) to calculate climate diagrams for sites, (d) retrieving gridded and modelled data (e.g. E-OBS) based on the site extent, and (e) retrieving earth observation data products based on the site extent. The selected workflows proved to be operational, at least at prototype level, and useful for eLTER PLUS users. We applied a co-design process that regularly involved the respective RC leads and Science Case (SC) contributors in the design and implementation phases. However, eLTER still needs to decide whether the Information Clusters should focus on on-demand services for extracting data from information sources or on pre-calculated datasets.
The results of task 4.1 provide input to the design and architecture of the extended eLTER Information System, led by WP11, and to the further definition of workflows towards the eLTER Standard Data Products, led by WP10. The report summarises the work done to define and prototype workflows for the retrieval and harmonisation of legacy data. It focuses specifically on the priority variables defined by the eLTER SOs and aims to support Research Challenge related Science Cases at both site and network scale. The first section describes the context of the work, including its relation to the 'Information Clusters' concept, which aims to enhance the findability and accessibility of relevant data sources in the eLTER context. The second section lists the relevant data sources identified, and the third provides demonstrators for data retrieval and harmonisation. Finally, we discuss and provide recommendations for the eLTER Information Clusters that focus on thematic prioritisation and on structural and legal interoperability, and we outline next steps for the implementation. The annexes provide the detailed information that the report shows only in aggregated form.
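
The extraction of gridded data based on the site extent, named among the demonstrators above, can be illustrated with a minimal sketch. It is not the workflow implemented in the report: the names `BoundingBox`, `cells_in_extent` and `site_mean_series`, the in-memory grid layout, and the choice to select cells by their centre point are all assumptions made for illustration. Real E-OBS data are distributed as NetCDF files and would need a dedicated reader.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical site extent in decimal degrees (assumed, not from the report)."""
    west: float
    south: float
    east: float
    north: float

    def contains(self, lon: float, lat: float) -> bool:
        # A cell is selected when its centre lies inside the extent (edges included).
        return self.west <= lon <= self.east and self.south <= lat <= self.north


def cells_in_extent(grid, bbox):
    """Keep only the grid cells whose centre falls inside the site extent.

    `grid` maps a (lon, lat) cell centre to a list of values, one per time step.
    """
    return {centre: values for centre, values in grid.items()
            if bbox.contains(*centre)}


def site_mean_series(grid, bbox):
    """Average the selected cells per time step to obtain one series for the site."""
    selected = cells_in_extent(grid, bbox)
    if not selected:
        raise ValueError("no grid cell centre falls inside the site extent")
    # zip(*...) groups the values of all selected cells by time step.
    return [mean(step) for step in zip(*selected.values())]


# Toy data: two cells inside the extent, one far outside it.
toy_grid = {
    (10.00, 45.00): [1.0, 2.0],
    (10.25, 45.00): [3.0, 4.0],
    (20.00, 50.00): [100.0, 100.0],
}
toy_site = BoundingBox(west=9.9, south=44.9, east=10.3, north=45.1)
print(site_mean_series(toy_grid, toy_site))
```

Selecting cells by centre point keeps the sketch short; an area-weighted overlap between cell and site polygon would be a more careful choice for small sites that cover only part of a grid cell.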
Istituto Applicazioni del Calcolo ''Mauro Picone''
Istituto di Geoscienze e Georisorse - IGG - Sede Pisa
Istituto per il Rilevamento Elettromagnetico dell'Ambiente - IREA
Interim project report
eLTER-Plus

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/449292