Towards a multi-node OpenACC Implementation of the ICON Model; William Sawyer (Swiss National Supercomputing Centre), Guenther Zaengl (German Weather Service, DWD), Leonidas Linardakis (Max Planck Institute for Meteorology, MPI-M), Markus Wetzstein (Swiss National Supercomputing Centre), Christian Conti (Swiss Federal Institute of Technology, ETH)
We have ported the dynamics solver of the Icosahedral Non-hydrostatic (ICON) model to Graphics Processing Units (GPUs), a task within Work Package 8 (WP8) of the Partnership for Advanced Computing in Europe (PRACE) Second Implementation Phase (2IP). Initial single-node OpenCL and CUDA Fortran implementations of ICON's non-hydrostatic dynamical core (NHDC) achieved at most a factor-of-two speedup over the latest CPU nodes, e.g., a dual-socket Intel Sandy Bridge. While this performance was promising, the ICON developers viewed neither OpenCL nor CUDA Fortran as a viable programming paradigm for the actual production code. They instead suggested the OpenACC standard, compiler directives which are honored only when compiling for an accelerator and are therefore minimally intrusive to the existing CPU-targeted code, as the proper paradigm for the multi-node GPU implementation, which was then undertaken in the second year of WP8.
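To illustrate this minimal intrusiveness, the sketch below shows the directive style on a simple block-structured loop. The routine and its arguments are invented for this example and are not taken from the ICON source; when accelerator compilation is disabled, the !$ACC lines are treated as ordinary comments and the CPU code is unchanged.

    ! Illustrative only: routine and argument names are invented, not taken
    ! from the ICON source.  Without accelerator compilation the !$ACC lines
    ! are plain comments and the routine builds unchanged for the CPU.
    SUBROUTINE add_tendency(nproma, nlev, nblks, var, tend, dt)
      IMPLICIT NONE
      INTEGER, INTENT(IN)    :: nproma, nlev, nblks
      REAL(8), INTENT(INOUT) :: var (nproma,nlev,nblks)
      REAL(8), INTENT(IN)    :: tend(nproma,nlev,nblks)
      REAL(8), INTENT(IN)    :: dt
      INTEGER :: jb, jk, jc
      !$ACC PARALLEL LOOP GANG VECTOR COLLAPSE(3) COPY(var) COPYIN(tend)
      DO jb = 1, nblks
        DO jk = 1, nlev
          DO jc = 1, nproma
            var(jc,jk,jb) = var(jc,jk,jb) + dt * tend(jc,jk,jb)
          END DO
        END DO
      END DO
      !$ACC END PARALLEL LOOP
    END SUBROUTINE add_tendency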
We will present the results of the multi-node OpenACC implementation of the ICON NHDC for hybrid multicore platforms. The code baseline is the ICON "DSL" (Domain Specific Language) testbed, essentially a stripped-down version of the ICON model restricted to dynamics simulations. We will discuss the OpenACC directives used to port both the computational and the communication code to GPUs, and report the resulting performance on NVIDIA K20X GPUs compared with contemporary CPU architectures.
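For the communication code, a typical pattern is sketched below, assuming a simple MPI halo exchange on a one-dimensional field; this is a hedged illustration, not the actual ICON exchange interface. The halo data are either staged through host memory around the MPI call or, with a GPU-aware MPI library, the device copy is passed directly via HOST_DATA.

    ! Hedged sketch, not the actual ICON communication layer: exchange one
    ! halo value with the left and right neighbours of a 1-D decomposition.
    ! Assumes field is already present on the device (enclosing DATA region).
    SUBROUTINE exchange_halo(field, n, left, right, comm)
      USE mpi
      IMPLICIT NONE
      INTEGER, INTENT(IN)    :: n, left, right, comm
      REAL(8), INTENT(INOUT) :: field(n)
      INTEGER :: ierr, status(MPI_STATUS_SIZE)

      ! Variant A: stage the halo through host memory around the MPI call.
      !$ACC UPDATE HOST(field(2:2))
      CALL MPI_Sendrecv(field(2), 1, MPI_DOUBLE_PRECISION, left,  0, &
                        field(n), 1, MPI_DOUBLE_PRECISION, right, 0, &
                        comm, status, ierr)
      !$ACC UPDATE DEVICE(field(n:n))

      ! Variant B: with a GPU-aware MPI library the device copy of the
      ! field can be handed to MPI directly, avoiding the host round trip:
      !   !$ACC HOST_DATA USE_DEVICE(field)
      !   CALL MPI_Sendrecv( ... as above ... )
      !   !$ACC END HOST_DATA
    END SUBROUTINE exchange_halo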
In addition, the roadmap for an accelerated full ICON model will be presented. As a first step, we are now incorporating the OpenACC directives into the NHDC of the ICON development trunk, based on feedback from the ICON developers at the Max Planck Institute for Meteorology (MPI-M) and the German Weather Service (DWD). This effort has revealed deficiencies in the OpenACC standard, in particular its insufficient support for Fortran derived types, which have been reported to the standards committee.
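The derived-type issue can be illustrated by the kind of manual, member-by-member "deep copy" that is required. The type and component names below are invented for this sketch, and the member-selection syntax assumes a compiler that accepts derived-type components in data clauses.

    ! Illustrative sketch of manual deep copy for a derived type with
    ! allocatable components; names are invented for the example.
    MODULE mo_example_state
      IMPLICIT NONE
      TYPE :: t_prog_state
        REAL(8), ALLOCATABLE :: w(:,:,:)        ! e.g. vertical velocity
        REAL(8), ALLOCATABLE :: theta_v(:,:,:)  ! e.g. virtual pot. temperature
      END TYPE t_prog_state
    CONTAINS
      SUBROUTINE state_to_device(p)
        TYPE(t_prog_state), INTENT(INOUT) :: p
        ! COPYIN(p) alone moves only the type descriptor; each allocatable
        ! component must be created on the device explicitly.
        !$ACC ENTER DATA COPYIN(p)
        !$ACC ENTER DATA COPYIN(p%w, p%theta_v)
      END SUBROUTINE state_to_device

      SUBROUTINE state_from_device(p)
        TYPE(t_prog_state), INTENT(INOUT) :: p
        !$ACC EXIT DATA COPYOUT(p%w, p%theta_v)
        !$ACC EXIT DATA DELETE(p)
      END SUBROUTINE state_from_device
    END MODULE mo_example_state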
Finally, we have concrete plans to incorporate into ICON accelerator-capable components that have already been ported to GPUs by other groups, such as the Rapid Radiative Transfer Model (RRTM). Moreover, we have initiated the port to OpenACC of other ICON climate physics parameterizations stemming from the ECHAM and COSMO models. This step should enable ICON, for certain scientific configurations, to run on many-core platforms that support OpenACC. We will also discuss planned efforts to port ICON extensions to accelerators, such as the HAMMOZ module for aerosol-cloud interactions and atmospheric chemistry. The resulting accelerated components should benefit climate researchers worldwide who plan to transition from ECHAM to ICON in the coming years.