We present and make available novel implementations of the two-dimensional Ising model, which we use as a benchmark to demonstrate the computational capabilities of modern Graphics Processing Units (GPUs). The rich programming environment and flexible hardware capabilities now available on GPUs allowed us to experiment quickly with several implementation ideas: a simple stencil-based algorithm, a variant that recasts the stencil operations as matrix multiplications to take advantage of the Tensor Cores available on NVIDIA GPUs, and a highly optimized multi-spin coding approach. The managed memory API available in CUDA allows simple and efficient distribution of these implementations across a multi-GPU NVIDIA DGX-2 server. We show that even a basic GPU implementation can outperform current results published on TPUs (Yang et al., 2019) and that the optimized multi-GPU implementation can simulate very large lattices faster than custom FPGA solutions (Ortega-Zamorano et al., 2016). Program summary: Program title: cuIsing (optimized). CPC Library link to program files: http://dx.doi.org/10.17632/xrb9xtkbcp.1 Licensing provisions: MIT license. Programming languages: CUDA C, Python. Nature of problem: Two-dimensional Ising model for spin systems. Solution method: Checkerboard Metropolis algorithm.
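The checkerboard Metropolis method named in the program summary can be illustrated with a minimal NumPy sketch. This is not the cuIsing code. It assumes coupling J = 1, zero external field, periodic boundaries, and even lattice dimensions, and the function name `checkerboard_sweep` and its parameters are hypothetical. Sites of one colour have neighbours only of the other colour, so every site of a given colour can be updated at the same time, which is what makes the scheme suitable for GPUs.

```python
import numpy as np

def checkerboard_sweep(spins, beta, rng):
    """One Metropolis sweep over a 2D lattice of +1/-1 spins.

    Illustrative assumptions: J = 1, no external field, periodic
    boundaries, even lattice dimensions. Updates `spins` in place.
    """
    rows, cols = np.indices(spins.shape)
    for color in (0, 1):
        # Select the sites of the current checkerboard colour.
        mask = (rows + cols) % 2 == color
        # Sum of the four nearest neighbours (a 5-point stencil without the centre).
        neighbours = (np.roll(spins, 1, axis=0) + np.roll(spins, -1, axis=0)
                      + np.roll(spins, 1, axis=1) + np.roll(spins, -1, axis=1))
        # Energy change if a spin is flipped, for H = -sum over neighbour pairs of s_i * s_j.
        delta_e = 2 * spins * neighbours
        # Metropolis rule: accept with probability min(1, exp(-beta * delta_e)).
        accept = rng.random(spins.shape) < np.exp(-beta * delta_e)
        spins[mask & accept] *= -1
    return spins

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    lattice = rng.choice(np.array([-1, 1], dtype=np.int8), size=(64, 64))
    for _ in range(100):
        checkerboard_sweep(lattice, beta=0.6, rng=rng)
    print("magnetization per spin:", lattice.mean())
```

The neighbour sum is recomputed for each colour so that the second half-sweep sees the spins updated in the first. A GPU version would typically assign one thread per site of the active colour instead of using whole-array operations.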

High performance implementations of the 2D Ising model on GPUs

Bernaschi M
2020

Istituto Applicazioni del Calcolo ''Mauro Picone''
6
5 software including parallel algorithms; 23 statistical physics and thermodynamics; Ising model; GPU programming

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/384975