Convergent Scheduling

Puppin D.
2004

Abstract

Convergent scheduling is a general instruction scheduling framework that simplifies the application of the many arbitrary constraints and scheduling heuristics needed to schedule instructions for modern, complex processors. A convergent scheduler is composed of independent passes, each implementing a heuristic that addresses a particular problem or constraint. The passes share a simple, common interface through which the spatial and temporal preferences associated with each instruction can be queried and modified. As each heuristic applies its scheduling constraint independently and in succession, the final result is a well-formed instruction schedule that satisfies most of the constraints. We have implemented a set of passes that address scheduling constraints such as partitioning, load balancing, communication bandwidth, and register pressure. By applying a hand-selected, fixed ordering of the passes, we obtain an average increase in speedup on a reference 4-cluster VLIW architecture of 28% compared with Desoli's PCC algorithm and 14% compared with UAS, and a speedup of 21% over the existing space-time scheduler of the Raw processor. We then applied machine-learning techniques to search automatically for good pass orderings when moving to different VLIW architectures. The architecture-specific pass orderings yield speedups ranging from 12% to 95% over the baseline order. Our cross-validation studies show that the automatically generated orderings perform well beyond the benchmarks on which they were 'trained': benchmarks that were not in the training set are within 6% of the performance they would obtain had they been in the training set.
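The pass structure described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the `PreferenceMap` class, its per-instruction weight table indexed by cluster and time slot, and the three toy passes (`dependence_pass`, `placement_pass`, `communication_pass`) are all hypothetical names and simplifications chosen for illustration. The only idea taken from the abstract is that independent passes run in a fixed order and each one queries and adjusts shared spatial (cluster) and temporal (slot) preferences.

```python
from typing import Callable, Dict, List, Tuple


class PreferenceMap:
    """Hypothetical shared interface: w[instr][cluster][slot] is a preference weight."""

    def __init__(self, instructions: List[str], n_clusters: int, n_slots: int):
        self.instructions = list(instructions)
        self.n_clusters = n_clusters
        self.n_slots = n_slots
        uniform = 1.0 / (n_clusters * n_slots)  # start with no preference at all
        self.w: Dict[str, List[List[float]]] = {
            i: [[uniform] * n_slots for _ in range(n_clusters)]
            for i in self.instructions
        }

    def scale(self, instr: str, cluster: int, slot: int, factor: float) -> None:
        self.w[instr][cluster][slot] *= factor

    def cluster_weight(self, instr: str, cluster: int) -> float:
        return sum(self.w[instr][cluster])

    def normalize(self, instr: str) -> None:
        # Keep each instruction's weights summing to 1 (skipped if all are zero).
        total = sum(sum(row) for row in self.w[instr])
        if total > 0:
            self.w[instr] = [[x / total for x in row] for row in self.w[instr]]

    def preferred(self, instr: str) -> Tuple[int, int]:
        # Highest-weight (cluster, slot); ties go to the first cell in scan order.
        cells = [(c, t) for c in range(self.n_clusters) for t in range(self.n_slots)]
        return max(cells, key=lambda ct: self.w[instr][ct[0]][ct[1]])


Pass = Callable[[PreferenceMap], None]


def dependence_pass(deps: Dict[str, List[str]]) -> Pass:
    """Temporal heuristic: rule out slots earlier than an instruction's chain depth."""
    def depth(i: str) -> int:
        return 1 + max((depth(p) for p in deps.get(i, [])), default=-1)

    def run(pm: PreferenceMap) -> None:
        for i in pm.instructions:
            for c in range(pm.n_clusters):
                for t in range(min(depth(i), pm.n_slots)):
                    pm.scale(i, c, t, 0.0)
            pm.normalize(i)
    return run


def placement_pass(home: Dict[str, int], boost: float = 4.0) -> Pass:
    """Spatial heuristic: favor an assumed 'home' cluster for some instructions."""
    def run(pm: PreferenceMap) -> None:
        for i, c in home.items():
            for t in range(pm.n_slots):
                pm.scale(i, c, t, boost)
            pm.normalize(i)
    return run


def communication_pass(deps: Dict[str, List[str]]) -> Pass:
    """Spatial heuristic: pull an instruction toward its predecessors' clusters."""
    def run(pm: PreferenceMap) -> None:
        for i in pm.instructions:
            for c in range(pm.n_clusters):
                pull = 1.0 + sum(pm.cluster_weight(p, c) for p in deps.get(i, []))
                for t in range(pm.n_slots):
                    pm.scale(i, c, t, pull)
            pm.normalize(i)
    return run


# Toy dependence chain a -> b -> c on 2 clusters and 3 time slots.
deps = {"b": ["a"], "c": ["b"]}
pm = PreferenceMap(["a", "b", "c"], n_clusters=2, n_slots=3)
for scheduling_pass in [dependence_pass(deps), placement_pass({"a": 1}), communication_pass(deps)]:
    scheduling_pass(pm)  # a fixed, hand-selected pass ordering
schedule = {i: pm.preferred(i) for i in pm.instructions}
print(schedule)
```

In this toy setup, no pass makes a final decision on its own: each one only reweights the shared table, and the schedule is read off at the end. Reordering the list of passes changes the result, which is the degree of freedom the abstract's pass-ordering search explores.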
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Compiler
VLIW
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/152183