Porting bioinformatics applications from grid to cloud: a macromolecular surface analysis application case study

I. Merelli; P. Cozzi; D. D'Agostino
2017

Abstract

In this paper we describe our experience in exploiting different cloud-based environments for a real use case from the bioinformatics domain: molecular surface analysis, which identifies similarities and possible complementarities between protein surfaces. The analysis of macromolecular surfaces is important because protein surface conformations drive many biological reactions. We developed a workflow that performs this analysis and yields scientifically interesting results. The workflow is highly compute-intensive: for meaningful use cases it cannot be run on a single-CPU system, and a parallel infrastructure is required to obtain reasonable execution times. For a decade, grid infrastructures have been a suitable way to obtain cost-effective computational power for bioinformatics applications. However, because of the rigid organisation of their storage and computational sites, they do not allow adequate customisation of the computational environment (e.g. installing databases or configuring virtual networks). Running applications on customised machines instantiated from user-defined images simplifies the computing model, decreases failure rates and therefore reduces waiting times for production analyses compared with canonical grid computations. For these reasons a cloud-based approach is more suitable than a pure grid paradigm. We experimented with two cloud-based approaches, one based on the Worker Node On Demand Service and one on OpenStack, to run the molecular surface analysis use case, and we compared the results with grid computing in terms of performance, efficiency and the effort required to build the computing model.
Istituto di Matematica Applicata e Tecnologie Informatiche - IMATI
Istituto di Tecnologie Biomediche - ITB
Cloud computing
Protein surface matching
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/304207