Improving Efficiency in Parallel Computing Leveraging Local Synchronization

Franco Cicirelli; Andrea Giordano; Carlo Mastroianni
2019

Abstract

In parallel computing, a complex task is typically split among many computing resources, each of which performs a portion of the task in parallel. Except for a very limited class of applications, the computing resources need to coordinate with each other to carry out the parallel execution consistently. As a consequence, a synchronization overhead arises, which can significantly impair overall execution performance. Typically, synchronization is achieved through a centralized barrier involving all the computing resources. In many application domains, though, this kind of global synchronization can be relaxed in favor of a leaner scheme, namely local synchronization, in which each computing resource synchronizes only with a subset of the other computing resources. In this work, we evaluate the performance of local synchronization compared with global synchronization. The key performance indicator is the efficiency index, that is, the speedup normalized by the number of computing nodes. The efficiency trend is evaluated both analytically and through numerical simulation. In particular, the analytical study relies on extreme value theory for global synchronization and on max-plus algebra for local synchronization.
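The contrast between the two schemes can be illustrated with a minimal simulation sketch. This is not the authors' model: the ring topology (each node waits for itself and its two adjacent neighbors), the exponentially distributed step durations, and the names `simulate`, `num_nodes`, `num_steps`, and `local` are all assumptions made here for illustration. Efficiency is computed as defined in the abstract, that is, speedup divided by the number of nodes.

```python
import random


def simulate(num_nodes, num_steps, local, seed=0):
    """Return the efficiency (speedup / num_nodes) of a synthetic step-based run.

    Assumptions (not taken from the paper): nodes sit on a ring, and each
    step's duration on each node is an independent Exp(1) sample.
    """
    rng = random.Random(seed)
    finish = [0.0] * num_nodes  # time at which each node completed its last step
    total_work = 0.0            # time a single node would need for all the work

    for _ in range(num_steps):
        work = [rng.expovariate(1.0) for _ in range(num_nodes)]
        total_work += sum(work)

        if local:
            # Local synchronization: a node starts its next step once it and
            # its two ring neighbors have finished the previous one. This is
            # a max-plus style recurrence (max, then add the step duration).
            start = [
                max(finish[i - 1], finish[i], finish[(i + 1) % num_nodes])
                for i in range(num_nodes)
            ]
        else:
            # Global synchronization: everyone waits for the slowest node.
            barrier = max(finish)
            start = [barrier] * num_nodes

        finish = [s + w for s, w in zip(start, work)]

    makespan = max(finish)
    speedup = total_work / makespan
    return speedup / num_nodes


if __name__ == "__main__":
    for n in (4, 16, 64):
        e_global = simulate(n, 200, local=False)
        e_local = simulate(n, 200, local=True)
        print(f"nodes={n:3d}  global={e_global:.3f}  local={e_local:.3f}")
```

With the same seed, both runs draw identical step durations, so under these assumptions the local scheme never finishes later than the global one; how large the gap is depends on the assumed distribution and topology.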
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
Parallel Computing
Speedup
Synchronization
Max-Plus Algebra
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/360760