This work presents GALÆXI as a novel, energy-efficient flow solver for the simulation of compressible flows on unstructured hexahedral meshes leveraging the parallel computing power of modern Graphics Processing Units (GPUs). GALÆXI implements the high-order Discontinuous Galerkin Spectral Element Method (DGSEM) using shock capturing with a finite-volume subcell approach to ensure the stability of the high-order scheme near shocks. This work provides details on the general code design, the parallelization strategy, and the implementation approach for the compute kernels with a focus on the element local mappings between volume and surface data due to the unstructured mesh. The scheme is implemented using a pure distributed memory parallelization based on a domain decomposition, where each GPU handles a distinct region of the computational domain. On each GPU, the computations are assigned to different compute streams which allows to antedate the computation of quantities required for communication while performing local computations from other streams to hide the communication latency. This parallelization strategy allows for maximizing the use of available computational resources. This results in excellent strong scaling properties of GALÆXI up to 1024 GPUs if each GPU is assigned a minimum of one million degrees of freedom. To verify its implementation, a convergence study is performed that recovers the theoretical order of convergence of the implemented numerical schemes. Moreover, the solver is validated using both the incompressible and compressible formulation of the Taylor–Green-Vortex at a Mach number of 0.1 and 1.25, respectively. A mesh convergence study shows that the results converge to the high-fidelity reference solution and that the results match the original CPU implementation. Finally, GALÆXI is applied to a large-scale wall-resolved large eddy simulation of a linear cascade of the NASA Rotor 37. Here, the supersonic region and shocks at the leading edge are captured accurately and robustly by the implemented shock-capturing approach. It is demonstrated that GALÆXI requires less than half of the energy to carry out this simulation in comparison to the reference CPU implementation. This renders GALÆXI as a potent tool for accurate and efficient simulations of compressible flows in the realm of exascale computing and the associated new HPC architectures.

, , , , ,
doi.org/10.1016/j.cpc.2024.109388
Computer Physics Communications
Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands

Kurz, M. (Marius), Kempf, D. (Daniel), Blind, M.P. (Marcel P.), Kopper, P. (Patrick), Offenhäuser, P. (Philipp), Schwarz, A. (Anna), … Beck, A. (Andrea). (2025). GALÆXI: Solving complex compressible flows with high-order discontinuous Galerkin methods on accelerator-based systems. Computer Physics Communications, 306. doi:10.1016/j.cpc.2024.109388