CNR Institutional Research Information System

An MPI-CUDA library for image processing on HPC architectures

A. Galizia; D. D'Agostino; A. Clematis

2015

Abstract

Scientific image processing is of interest to a broad scientific community, since it is a means of gaining understanding and insight into the data for a growing number of applications. Furthermore, technological evolution permits large-scale data acquisition with sophisticated instruments and the processing of such data through complex multidisciplinary applications, resulting in datasets that grow at an extremely rapid pace. Processing them requires huge computational power, so it is necessary to move towards High Performance Computing (HPC) and to develop proper parallel implementations of image processing algorithms and operations. Modern HPC resources are typically highly heterogeneous systems, composed of multiple CPUs and accelerators such as Graphics Processing Units (GPUs) and Field-Programmable Gate Arrays (FPGAs). The real barrier posed by heterogeneous HPC resources is the development and/or performance-efficient porting of software to such complex architectures. In this context, the aim of this work is to enable image processing on clusters of GPUs through the use of PIMA(GE)2 Lib, the Parallel IMAGE processing GEnoa Library. The library exploits traditional clusters through MPI and single GPU devices through CUDA, and a first experiment explores the use of GPU clusters. Library operations are provided to users through a sequential interface designed to hide the parallelism of the computation. The parallel computation, at each level, is managed by specific policies designed to coordinate the parallel processes and threads involved in the processing, and their use is tightly coupled with the PIMA(GE)2 Lib interface.
In this paper, we present the incremental approach adopted in the development of the library and the performance gains of each implementation: nearly linear speedup is achieved on a cluster architecture, an improvement of about 30% in execution time is obtained on a single GPU, and the first results on a cluster of GPUs are promising. © 2014 Elsevier B.V. All rights reserved.

Year: 2015
Organizational units: Istituto di Matematica Applicata e Tecnologie Informatiche (IMATI)
Keywords: Image processing; Parallel computing; GPU
Publication type: 01.01 Journal article

Files in this record:

File	Size	Format
prod_330147-doc_162090.pdf (authorized users only; description: An MPI-CUDA library for image processing on HPC architectures; type: publisher's version, PDF)	1.59 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/290640
