AGINFRA PLUS - Open Science Data Analytics Technologies D3.1

Candela L; Cirillo R; Coro G; Lelii L; Pagano P; Panichi G; Scarponi P; Sinibaldi F
2017

Abstract

Deliverable D3.1 "Open Science Data Analytics Technologies" is a deliverable of type Demonstrator, meaning that it takes the form of artefacts (software releases) rather than a report. Specifically, the deliverable is the software realising the Data Analytics & Processing Layer of AGINFRA+. This software is part of a large software system named gCube (www.gcube-system.org). The gCube system offers a large array of services supporting the entire lifecycle of a research activity (data management and collation, analytics, collaboration, sharing), together with the possibility of combining these services in Virtual Research Environments. In the context of AGINFRA PLUS, the following gCube components have been primarily exploited, consolidated and enhanced to serve the analytics needs arising from the project use cases: DataMiner, a service enabling its users to perform data analytics tasks by relying on an array of analytics methods and a distributed and heterogeneous computing infrastructure, available through a web-based GUI as well as through a web-based API based on the OGC WPS standard; and SAI (Statistical Algorithm Importer), a service enabling its users to make their own analytics methods available via the DataMiner service. In addition, the entire analytics solution made available for the AGINFRA PLUS cases relies on (i) a shared workspace realising a cloud-based file manager for managing content of interest and sharing it with co-workers, (ii) a social networking area enabling users to post messages and hold discussions, and (iii) a flexible catalogue for publishing and discovering items of interest, including "research objects" resulting from an analytics task. This technology is deployed in its latest version in every Virtual Research Environment supporting the AGINFRA PLUS cases.
The major enhancements to the technology pertaining to AGINFRA PLUS have been included in three major gCube releases: 4.7 (October 2017), 4.8 (November 2017), and 4.9 (under production). In particular, with these releases a new "black-box" oriented approach (https://wiki.gcube-system.org/gcube/Statistical_Algorithms_Importer:_Java_Project#Black_Box_Integration) has been designed and implemented to enable owners and developers of analytics methods to easily integrate their solutions into the DataMiner service. Among the supported black-box types is the one for KNIME workflows, i.e. analytics methods implemented as a KNIME workflow. KNIME is among the key technologies supporting the Food Safety Risk Assessment cases. To enable the execution of KNIME-based black boxes, the distributed computing part of the data analytics platform has been extended to integrate the KNIME execution engine. Other cases rely on the same mechanism to integrate entire applications (WOFOST) as well as Python-based methods.
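
Because DataMiner is described as exposing an API based on the OGC WPS standard, a client can in principle reach it with plain HTTP GET requests in the WPS 1.0.0 key-value-pair encoding. The following is a minimal sketch of how such request URLs might be assembled, not the project's own client. The endpoint URL, the process identifier, and the input names are hypothetical placeholders, and any authentication the real service requires is omitted.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: replace with the address of a real WPS service.
WPS_ENDPOINT = "https://dataminer.example.org/wps/WebProcessingService"


def wps_url(request, **extra):
    """Build a WPS key-value-pair (KVP) GET URL for the given operation."""
    params = {"service": "WPS", "request": request}
    params.update(extra)
    # urlencode percent-encodes reserved characters in the values.
    return f"{WPS_ENDPOINT}?{urlencode(params)}"


def capabilities_url():
    """URL listing the processes (analytics methods) the service offers."""
    return wps_url("GetCapabilities")


def describe_url(process_id):
    """URL describing the inputs and outputs of one process."""
    return wps_url("DescribeProcess", version="1.0.0", Identifier=process_id)


def execute_url(process_id, inputs):
    """URL that runs a process; inputs are encoded as name=value;name=value."""
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    return wps_url(
        "Execute",
        version="1.0.0",
        Identifier=process_id,
        DataInputs=data_inputs,
    )


if __name__ == "__main__":
    # "demo.process" and its inputs are illustrative names only.
    print(capabilities_url())
    print(describe_url("demo.process"))
    print(execute_url("demo.process", {"a": "1", "b": "2"}))
```

A method integrated through SAI would appear as one more process identifier in the GetCapabilities response, so the same three calls would cover discovering, inspecting, and running it, assuming the service follows the standard encoding.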
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Interim project report
Data Analytics & Processing Layer
gCube components
Files in this record:
prod_384157-doc_131134.pdf
Description: D3.1 Open Science Data Analytics Technologies
Access: open access
Size: 708.27 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/344021