We efficiently map a priority queue on the hypercube architecture in a load balanced manner, with no additional communication overhead. Two implementations for insert and deletemin operations are proposed on the single-port hypercube model. In a b-bandwidth, n-item priority queue in which every node contains b items in sorted order, the first implementation achieves optimal speed-up of O[min{log n, b(log n)/(log b+log log n)}] for inserting b pre-sorted items or deleting b smallest items, where b=O(n/sup 1/c/) with c<1. In particular, single insertion and deletion operations are cost-optimal and require O(log n/p+log p) time using O(log n/log log n) processors. The second implementation is more scalable since it uses a larger number of processors, and attains a 'nearly' optimal speed-up on the single-port hypercube. The insertion of log n pre-sorted items or the deletion of log n smallest items requires O(log log n)/sup 2/ time and O(log/sup 2/ n/log log n) processors. However, on the slightly more powerful pipelined hypercube model, we are able to reduce the time complexity to O(log log n) thus attaining optimal speed-up. To the best of our knowledge, our algorithms provide the first implementations of b-bandwidth distributed priority queues, which are load balanced and yet guarantee optimal speed-up.
Distributed priority queues on hypercube architectures
1996
Abstract
We efficiently map a priority queue on the hypercube architecture in a load balanced manner, with no additional communication overhead. Two implementations for insert and deletemin operations are proposed on the single-port hypercube model. In a b-bandwidth, n-item priority queue in which every node contains b items in sorted order, the first implementation achieves optimal speed-up of O[min{log n, b(log n)/(log b+log log n)}] for inserting b pre-sorted items or deleting b smallest items, where b=O(n/sup 1/c/) with c<1. In particular, single insertion and deletion operations are cost-optimal and require O(log n/p+log p) time using O(log n/log log n) processors. The second implementation is more scalable since it uses a larger number of processors, and attains a 'nearly' optimal speed-up on the single-port hypercube. The insertion of log n pre-sorted items or the deletion of log n smallest items requires O(log log n)/sup 2/ time and O(log/sup 2/ n/log log n) processors. However, on the slightly more powerful pipelined hypercube model, we are able to reduce the time complexity to O(log log n) thus attaining optimal speed-up. To the best of our knowledge, our algorithms provide the first implementations of b-bandwidth distributed priority queues, which are load balanced and yet guarantee optimal speed-up.File | Dimensione | Formato | |
---|---|---|---|
prod_410161-doc_144320.pdf
solo utenti autorizzati
Descrizione: Distributed priority queues on hypercube architectures
Tipologia:
Versione Editoriale (PDF)
Dimensione
855.54 kB
Formato
Adobe PDF
|
855.54 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.