Massive-scale simulations of 2D Ising and Blume-Capel models on rack-scale multi-GPU systems

Bisson, Mauro; Bernaschi, Massimo; Fatica, Massimiliano; Gonzalez-Adalid Pemartin, Isidoro
2025

Abstract

We present high-performance implementations of the two-dimensional Ising and Blume-Capel models for large-scale, multi-GPU simulations. Our approach takes full advantage of the NVIDIA GB200 NVL72 system, which features up to 72 GPUs interconnected via high-bandwidth NVLink, enabling direct GPU-to-GPU memory access across multiple nodes. By utilizing Fabric Memory and an optimized Monte Carlo kernel for the Ising model, our implementation supports simulations of systems with linear sizes up to L = 2^23, corresponding to 2^46, or approximately 70 trillion, spins. It reaches a peak processing rate of nearly 1.15 × 10^5 lattice updates per nanosecond, setting a new performance benchmark for Ising model simulations. Additionally, we introduce a custom protocol for computing correlation functions that balances computational efficiency against statistical accuracy, enabling large-scale simulations without prohibitive runtime costs. Benchmark results show near-perfect strong and weak scaling up to 64 GPUs, demonstrating the effectiveness of our approach for large-scale statistical physics simulations.

Program summary
Program title: cuIsing (optimized)
CPC Library link to program files: https://doi.org/10.17632/ppkwwmcpwg.1
Licensing provisions: MIT license
Programming languages: CUDA C
Nature of problem: Comparative studies of the critical dynamics of the Ising and Blume-Capel models are essential for gaining deeper insights into phase transitions, enhancing computational methods, and developing more accurate models for complex physical systems. Minimizing finite-size effects and improving the statistical quality of the results requires large-scale simulations over extended time scales. To support this, we provide two high-performance codes capable of running simulations with up to 70 trillion spins.
Solution method: We present updated versions of our multi-GPU code for Monte Carlo simulations, implementing both the Ising and Blume-Capel models. These codes take full advantage of multi-node NVLink systems, such as the NVIDIA GB200 NVL72, and scale across GPUs that sit in different nodes of the same NVLink domain. Communication between GPUs is handled via Fabric Memory, a novel memory allocation technique that provides direct memory access between GPUs within the same domain and eliminates the need for explicit data transfers. By employing highly optimized CUDA kernels for the Metropolis algorithm and a custom protocol that reduces the computational overhead of the correlation function, our implementation achieves the highest recorded performance to date.
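
For readers unfamiliar with the Metropolis algorithm named in the solution method, the following is a minimal single-threaded C sketch of one sweep of the 2D Ising model with periodic boundaries. It is not the cuIsing kernel: the checkerboard (two-color) update order is assumed here because it is a common way to make site updates independent and therefore parallelizable, and every name in it (rng_uniform, metropolis_color, metropolis_sweep, magnetization, p4, p8) is hypothetical. The caller supplies the two non-trivial acceptance probabilities, p4 = exp(-4*beta*J) and p8 = exp(-8*beta*J).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: xorshift32 generator mapped to a double in [0, 1). */
static uint32_t rng_state = 2463534242u;

static double rng_uniform(void) {
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 17;
    rng_state ^= rng_state << 5;
    return (double)(rng_state >> 8) * (1.0 / 16777216.0);
}

/* Attempt a flip on every site whose parity (x + y) % 2 equals `color`.
 * spin holds +1/-1 values on an L x L periodic lattice (L must be even so
 * that all four neighbours of a site have the opposite colour).
 * p4 and p8 are the acceptance probabilities for energy increases of
 * 4J and 8J, i.e. exp(-4*beta*J) and exp(-8*beta*J). */
void metropolis_color(int8_t *spin, int L, int color, double p4, double p8) {
    assert(L > 0 && L % 2 == 0);
    for (int y = 0; y < L; y++) {
        int up = (y + L - 1) % L, down = (y + 1) % L;
        for (int x = (y + color) % 2; x < L; x += 2) {
            int left = (x + L - 1) % L, right = (x + 1) % L;
            size_t idx = (size_t)y * L + x;
            int s = spin[idx];
            int nn = spin[(size_t)y * L + left] + spin[(size_t)y * L + right]
                   + spin[(size_t)up * L + x] + spin[(size_t)down * L + x];
            int h = s * nn; /* energy change of a flip is dE = 2*J*h */
            /* Accept if the energy does not increase, else with probability exp(-beta*dE). */
            if (h <= 0 || rng_uniform() < (h == 2 ? p4 : p8)) {
                spin[idx] = (int8_t)(-s);
            }
        }
    }
}

/* One full sweep: update one colour, then the other. */
void metropolis_sweep(int8_t *spin, int L, double p4, double p8) {
    metropolis_color(spin, L, 0, p4, p8);
    metropolis_color(spin, L, 1, p4, p8);
}

/* Magnetization per spin, in [-1, 1]. */
double magnetization(const int8_t *spin, int L) {
    long sum = 0;
    for (size_t i = 0; i < (size_t)L * L; i++) sum += spin[i];
    return (double)sum / ((double)L * L);
}
```

Two limiting cases make the acceptance rule easy to check. With p4 = p8 = 1 (infinite temperature) every proposed flip is accepted, so one sweep inverts the whole lattice. With p4 = p8 = 0 (zero temperature) a fully aligned lattice never changes. In a GPU version along these lines, each colour pass would map naturally to one kernel launch, since sites of the same colour do not depend on each other.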
Istituto Applicazioni del Calcolo ''Mauro Picone''
Monte Carlo Simulation, CUDA C, Massive-Scale simulations


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/552888