Integration and performances of Spark on a PBS-based HPC environment
Francesco Gargiulo;Stefano Silvestri;Gennaro Oliva;Mario Ciampi
2018
Abstract
In this technical report we analyse a methodology for integrating the Apache Spark framework into a TORQUE HPC batch environment. We show the pros and cons of integrating Spark within TORQUE, underlining the compatibility issues and showing how to overcome the most common problems. We also give the main technical details on the use of this architecture. In the experimental assessment we also take into account the advantages of the BLAS libraries, and we describe how to install and configure them so that they run correctly within the architecture. Finally, we show empirical results on a very challenging use case, namely the realization of a document search engine over the whole PubMed literature, with the aim of evaluating the performance of this kind of architecture.