A ScientificComputing.com article by Rob Farber explores the topic of numerical precision in the context of future exascale computing, asking the question “how do we know that anything we compute is correct?” The discussion centers around processors such as GPUs which provide both single- and double-precision computation but at different throughput levels. “Taking a multi-precision approach can enhance the accuracy of a calculation and justify the use of mainly single-precision arithmetic (for performance) along with the occasional use of double-precision (64-bit) arithmetic for precision-sensitive operations,” writes Farber. (Rob Farber. “Numerical Precision: How Much is Enough?” ScientificComputing.com. Accessed July 1, 2008.)
Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations (Part 2: Double Precision GPUs)
July 14th, 2008
In a previous publication, we have examined the fundamental difference between computational precision and result accuracy in the context of the iterative solution of linear systems as they typically arise in the Finite Element discretization of Partial Differential Equations (PDEs). In particular, we evaluated mixed- and emulated-precision schemes on commodity graphics processors (GPUs), which at that time only supported computations in single precision. With the advent of graphics cards that natively provide double precision, this report updates our previous results.
We demonstrate that with new co-processor hardware supporting native double precision, such as NVIDIA’s G200 and T10 architectures, the situation does not change qualitatively for PDEs, and the previously introduced mixed precision schemes are still preferable to double precision alone. However, the schemes achieve significant quantitative performance improvements with the more powerful hardware. In particular, we demonstrate that a Multigrid scheme can accurately solve a common test problem in Finite Element settings with one million unknowns in less than 0.1 seconds, which is truly outstanding performance. We support these conclusions by exploring the algorithmic design space enlarged by the availability of double precision directly in the hardware.
(Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations (Part 2: Double Precision GPUs). Dominik Göddeke and Robert Strzodka. Technical Report, 2008.)
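The idea behind a mixed-precision scheme can be illustrated with a small sketch. This is not the authors' Multigrid solver or their GPU code: it is a minimal iterative-refinement loop on a hypothetical 1D Poisson matrix tridiag(-1, 2, -1), where single precision is emulated in plain Python by rounding through `struct`, and all function names are made up for illustration. The inner solve runs entirely in (emulated) single precision, while the residual and the solution update are kept in double precision.

```python
import math
import struct


def f32(value):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def apply_poisson(x):
    """Multiply by the 1D Poisson matrix tridiag(-1, 2, -1) in double precision."""
    n = len(x)
    y = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append(2.0 * x[i] - left - right)
    return y


def solve_poisson_f32(rhs):
    """Thomas algorithm for tridiag(-1, 2, -1), with every intermediate
    result rounded to single precision (stand-in for a low-precision solver)."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = f32(-1.0 / 2.0)
    d[0] = f32(f32(rhs[0]) / 2.0)
    for i in range(1, n):
        denom = f32(2.0 + c[i - 1])
        c[i] = f32(-1.0 / denom)
        d[i] = f32(f32(f32(rhs[i]) + d[i - 1]) / denom)
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = f32(d[i] - f32(c[i] * x[i + 1]))
    return x


def solve_mixed(rhs, max_iters=10, tol=1e-14):
    """Iterative refinement: double-precision residual and update,
    single-precision correction solve."""
    x = [0.0] * len(rhs)
    for _ in range(max_iters):
        ax = apply_poisson(x)
        r = [bi - yi for bi, yi in zip(rhs, ax)]       # residual in double
        if max(abs(v) for v in r) <= tol:
            break
        corr = solve_poisson_f32(r)                     # cheap low-precision solve
        x = [xi + ci for xi, ci in zip(x, corr)]        # update in double
    return x


def max_error(x, reference):
    return max(abs(a - b) for a, b in zip(x, reference))


# Hypothetical test problem: pick a known solution and build the right-hand side.
n = 64
x_true = [math.sin(math.pi * (i + 1) / (n + 1)) for i in range(n)]
b = apply_poisson(x_true)

err_single = max_error(solve_poisson_f32(b), x_true)
err_mixed = max_error(solve_mixed(b), x_true)
print("single precision only:", err_single)
print("mixed precision      :", err_mixed)
```

In this toy setting, nearly all arithmetic happens in the low-precision solve, yet the refined result is far closer to the reference than the single-precision solve alone. Whether a comparable gain carries over to a given hardware and solver is exactly what the report investigates.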
FEAST is a hardware-oriented MPI-based Finite Element solver toolkit. With the extension FEASTGPU the authors have previously demonstrated that significant speed-ups in the solution of the scalar Poisson problem can be achieved by the addition of GPUs as scientific co-processors to a commodity-based cluster. In this paper the authors put the more general claim to the test: applications based on FEAST that have so far run only on CPUs can be successfully accelerated on a co-processor-enhanced cluster without any code modifications. The chosen solid mechanics code has higher accuracy requirements and a more diverse CPU/co-processor interaction than the Poisson example, and is thus better suited to assess the practicability of the acceleration approach. The paper presents accuracy experiments, a scalability test and acceleration results for different elastic objects under load. In particular, it demonstrates in detail that the single precision execution of the co-processor does not affect the final accuracy. The paper establishes how the local acceleration gains of factors 5.5 to 9.0 translate into 1.6- to 2.6-fold total speed-up. Subsequent analysis reveals which measures will increase these factors further. (Dominik Göddeke, Hilmar Wobker, Robert Strzodka, Jamaludin Mohd-Yusof, Patrick McCormick, Stefan Turek. Co-Processor Acceleration of an Unmodified Parallel Solid Mechanics Code with FEASTGPU. International Journal of Computational Science and Engineering (to appear).)
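Why a local gain of 5.5 to 9.0 shrinks to a total gain of 1.6 to 2.6 can be sketched with a simple Amdahl-type model. This is an assumption made here for illustration, not the analysis from the paper: only a fraction of the runtime is accelerated, and the rest runs at the original speed. The function names are hypothetical, and pairing the lower local factor with the lower total factor (and likewise for the upper ones) is also an assumption.

```python
def total_speedup(accel_fraction, local_speedup):
    """Amdahl-type model: only `accel_fraction` of the original runtime
    is sped up by `local_speedup`; the remainder is unchanged."""
    return 1.0 / ((1.0 - accel_fraction) + accel_fraction / local_speedup)


def implied_fraction(local_speedup, total):
    """Invert the model: which accelerated fraction would explain
    a given total speed-up for a given local speed-up?"""
    return (1.0 - 1.0 / total) / (1.0 - 1.0 / local_speedup)


# Figures quoted in the abstract, paired under the assumption stated above.
for local, total in [(5.5, 1.6), (9.0, 2.6)]:
    p = implied_fraction(local, total)
    print(f"local {local}x, total {total}x -> accelerated fraction ~ {p:.2f}")
```

Under this model, the total speed-up is bounded by 1 / (1 - accelerated fraction) no matter how fast the co-processor is, which is one way to read the abstract's remark that further measures are needed to increase the overall factors.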