We present a hardware-algorithm for sorting N elements using either a p-sorter or a sorting network of fixed I/O size p while strictly enforcing conflict-free memory accesses. To the best of our knowledge, this is the first realistic design that achieves optimal time performance, running in Theta (N log N/p log p) time for all ranges of N. Our result completely resolves the problem of designing an implementable, time-optimal algorithm for sorting N elements using a p-sorter. More importantly, however, our result shows that, in order to achieve optimal time performance, ail that is needed is a sorting network of depth O(log(2) p) such as, for example, Batcher's classic bitonic sorting network.
An optimal hardware-algorithm for sorting using a fixed-size parallel sorting device
2000
Abstract
We present a hardware-algorithm for sorting N elements using either a p-sorter or a sorting network of fixed I/O size p while strictly enforcing conflict-free memory accesses. To the best of our knowledge, this is the first realistic design that achieves optimal time performance, running in Theta (N log N/p log p) time for all ranges of N. Our result completely resolves the problem of designing an implementable, time-optimal algorithm for sorting N elements using a p-sorter. More importantly, however, our result shows that, in order to achieve optimal time performance, ail that is needed is a sorting network of depth O(log(2) p) such as, for example, Batcher's classic bitonic sorting network.File | Dimensione | Formato | |
---|---|---|---|
prod_406160-doc_142011.pdf
solo utenti autorizzati
Descrizione: An optimal hardware-algorithm for sorting using a fixed-size parallel sorting device
Tipologia:
Versione Editoriale (PDF)
Dimensione
275.25 kB
Formato
Adobe PDF
|
275.25 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.