Over the past five years, graphics processing units (GPUs) have had a transformational effect on numerical lattice quantum chromodynamics (LQCD) calculations in nuclear and particle physics. GPUs have been applied with great success to the post-Monte Carlo analysis phase, which accounts for a substantial fraction of the workload in a typical LQCD calculation. The initial Monte Carlo gauge field generation phase, however, requires capability-level supercomputing, corresponding to O(100) GPUs or more, and strong scaling to that level has not previously been achieved. In this talk, I demonstrate that a multi-dimensional parallelization strategy combined with a domain-decomposed preconditioner allows us to scale into this regime. I present results for two popular discretizations of the Dirac operator, Wilson-clover and improved staggered, employing up to 256 GPUs on the Edge cluster at Lawrence Livermore National Laboratory.
The Particle Physics Theory seminar is a weekly series of talks reflecting the diverse interests of the group. Topics include analytic and numerical calculations based on the Standard Model of elementary particle physics, theories exploring new physics, and more formal developments in gauge theories and gravity.