Quantum Monte Carlo on GPUs
This paper by Anderson et al. at Caltech describes a method for accelerating Quantum Monte Carlo (QMC) on GPUs. QMC is among the most accurate (and expensive) methods in the quantum chemistry zoo. The work primarily investigates the tricks this algorithm allows for speeding up matrix multiplication: because QMC is a statistical algorithm, the authors studied the performance gains available when multiplying many matrices simultaneously. Additionally, the paper explores the Kahan Summation Formula to improve the accuracy of GPU matrix multiplication. (Quantum Monte Carlo on Graphical Processing Units. Amos G. Anderson, William A. Goddard III, Peter Schröder. Computer Physics Communications)
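The Kahan summation mentioned in this summary can be illustrated with a minimal sketch. This is plain Python in double precision, not the authors' GPU kernel; the function names `naive_sum` and `kahan_sum` and the test data are assumptions chosen for illustration. The idea is to carry a running compensation term that recovers the low-order bits lost in each addition.

```python
import math


def naive_sum(values):
    """Plain left-to-right accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Compensated (Kahan) summation."""
    total = 0.0
    compensation = 0.0  # estimate of the rounding error accumulated so far
    for v in values:
        y = v - compensation            # apply the correction to the new term
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost (negated)
        total = t
    return total


# Illustrative data: one large term followed by many terms that are each
# smaller than half a unit in the last place of 1.0.
values = [1.0] + [1e-16] * 100_000

print("naive:", naive_sum(values))
print("kahan:", kahan_sum(values))
print("fsum :", math.fsum(values))  # correctly rounded reference
```

With this input the naive loop never moves off 1.0, because each small term is rounded away on its own, while the compensated loop stays close to the `math.fsum` reference. The same pattern applies to the inner dot products of a matrix multiplication.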
September 10th, 2007
Graphic processors to speed-up simulations for the design of high performance solar receptors
September 4th, 2007
This paper by Collange et al. at Université de Perpignan, France, describes a prototype to be integrated into simulation codes that estimate temperature, velocity and pressure in order to design next-generation solar receptors. Such codes delegate to GPUs the computation of heat transfer due to radiation. The authors use Monte Carlo line-by-line ray tracing through finite volumes, which amounts to data-parallel arithmetic transformations on large data structures. Performance on two recent graphics cards (Nvidia 7800GTX and ATI RX1800XL) shows speedups higher than 400 compared to CPU implementations, while leaving most CPU computing resources available. Because some questions remained about the accuracy of the operators implemented in GPUs, the authors start this report with a survey and some contributed tests on the various floating-point units available on GPUs. (Graphic processors to speed-up simulations for the design of high performance solar receptors. S. Collange, M. Daumas, D. Defour. Proceedings of the IEEE 18th International Conference on Application-specific Systems, Architectures and Processors.)
Two-electron Integral Evaluation on the Graphics Processor Unit
August 16th, 2007
Abstract: We propose the algorithm to evaluate the Coulomb potential in the ab initio density functional calculation on the graphics processor unit (GPU). The numerical accuracy required for the algorithm is investigated in detail. It is shown that GPU, which supports only the single-precision floating number natively, can take part in the major computational tasks. Because of the limited size of the working memory, the Gauss-Rys quadrature to evaluate the electron repulsion integrals (ERIs) is investigated in detail. The error analysis of the quadrature is performed. New interpolation formula of the roots and weights is presented, which is suitable for the processor of the single-instruction multiple-data type. It is proposed to calculate only small ERIs on GPU. ERIs can be classified efficiently with the upper-bound formula. The algorithm is implemented on NVIDIA GeForce 8800 GTX and the Gaussian 03 program suite. It is applied to the test molecules Taxol and Valinomycin. The total energies calculated are essentially the same as the reference ones. The preliminary results show the considerable speedup over the commodity microprocessor. (Two-electron integral evaluation on the graphics processor unit. Koji Yasuda. Journal of Computational Chemistry. July 5, 2007.)
Accelerating molecular modeling applications with graphics processors
August 11th, 2007
In this paper, an overview of recent advances in programmable GPUs is presented, with an emphasis on their application to molecular mechanics simulations and the programming techniques required to obtain optimal performance in these cases. We demonstrate the use of GPUs for the calculation of long-range electrostatics and nonbonded forces for molecular dynamics simulations. The application of GPU acceleration to biomolecular simulation is also demonstrated through the use of GPU-accelerated Coulomb-based ion placement and calculation of time-averaged potentials from molecular dynamics trajectories. A novel approximation to Coulomb potential calculation, the multilevel summation method, is introduced and compared to direct Coulomb summation. In light of the performance obtained for this set of calculations, future applications of graphics processors to molecular dynamics simulations are discussed. (Accelerating molecular modeling applications with graphics processors. John E. Stone, James C. Phillips, Peter L. Freddolino, David J. Hardy, Leonardo G. Trabuco, and Klaus Schulten. Journal of Computational Chemistry (In press))
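Direct Coulomb summation, the baseline the multilevel method is compared against, can be sketched in a few lines. This is a CPU illustration in plain Python, not the paper's GPU code; the function name `direct_coulomb_potential`, the data layout, and the choice of units (Coulomb constant set to 1) are assumptions. Each grid point accumulates q/r over every charge, so the grid points are independent of one another, which is what makes the computation data-parallel.

```python
import math


def direct_coulomb_potential(charges, grid_points):
    """Potential at each grid point from all point charges.

    charges:     list of (q, (x, y, z)) tuples
    grid_points: list of (x, y, z) tuples, assumed not to coincide
                 with any charge position
    Units are chosen so that the Coulomb constant equals 1 (assumption).
    """
    potentials = []
    for gx, gy, gz in grid_points:  # each point is independent
        v = 0.0
        for q, (x, y, z) in charges:
            r = math.sqrt((gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2)
            v += q / r
        potentials.append(v)
    return potentials


# Hypothetical two-charge system and two sample points.
charges = [(2.0, (0.0, 0.0, 0.0)), (-1.0, (0.0, 0.0, 4.0))]
grid = [(0.0, 0.0, 2.0), (0.0, 3.0, 4.0)]
print(direct_coulomb_potential(charges, grid))
```

The cost grows as the number of charges times the number of grid points.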
Lattice QCD as a video game (GPGPU for quantum field theory)
July 14th, 2007
This paper outlines how GPGPU techniques can be used for Monte Carlo simulations of quantum field theories such as QCD. Depending on the GPU model, the speedup is around a factor of 4-10 relative to SSE-optimized code on a Pentium 4. Sample code is also given. (Lattice QCD as a video game)
Real-Time Particle Level Sets with Application to Flow Visualization
May 24th, 2007
This technical report by N. Cuntz, R. Strzodka and A. Kolb describes a particle level set (PLS) system for fast and accurate surface tracking on the GPU. The technique demonstrates the coupling of grid and particle information by using vertex/fragment buffer objects, shaders and blending functionality in an innovative way. Improvements over the original PLS technique include a sub-voxel interface representation and a more accurate level set correction using more precise particle radii. As a concrete application the authors demonstrate that their fast and accurate PLS is well suited to the visualization of dynamic flows. An accurate evolution of time surfaces and representation of path volumes offer a more reliable basis for data interpretation. (Real-Time Particle Level Sets with Application to Flow Visualization. Technical report, 2007)
Radio Wave Propagation on Graphics Hardware
April 25th, 2007
Radio wave propagation predictions are of great interest for cellular radio networks. Ray tracing approaches are an established technique for wave propagation; however, such approaches need to be extended to include diffraction, which is a predominant effect for common mobile radio frequencies. We demonstrate how to exploit the GPU to accelerate wave propagation predictions by orders of magnitude, making them available at interactive frame rates. The paper presents a GPU implementation of our diffraction technique. The presented technique can be easily extended to also simulate the diffraction of water waves by obstacles in complex three-dimensional scenarios in a physically correct manner. (Fast Edge-Diffraction-Based Radio Wave Propagation Model for Graphics Hardware. Tobias Rick, Rudolf Mathar. Proceedings of ITG INICA 2007)
Modal Fourier wavefront reconstruction using GPUs
April 24th, 2007
This work addresses the fundamental problem of accelerating FFT computation with GPUs in order to apply it to Adaptive Optics, which is key to obtaining maximum performance from projected ground-based eXtremely Large Telescopes. A method for efficiently adapting the FFT to the underlying architecture of GPUs is given. The authors derive a novel FFT method that alternates base-2 and base-4 decomposition of the two-dimensional domain to make the most of the Multiple Render Target extension, developing a very unusual Pease 8-data “butterfly”. (Modal Fourier wavefront reconstruction using GPUs. J.G. Marichal-Hernandez, J.M. Rodriguez-Ramos, F. Rosa. La Laguna University. To appear in Journal of Electronic Imaging.)
Native, emulated and mixed precision schemes
March 13th, 2007
This survey paper by D. Göddeke and R. Strzodka compares native double precision solvers for linear systems of equations as they typically arise in finite element discretizations with emulated- and mixed-precision schemes. Such schemes are particularly suitable for coupled hardware configurations such as GPUs and FPGAs, which serve as co-processors to the general purpose CPU. The results demonstrate that
- accuracy is preserved even for very ill-conditioned systems,
- significant speedups can be achieved (time aspect, GPUs) and
- area requirements are reduced (space aspect, FPGA).
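One common way to build a mixed-precision scheme of the kind surveyed above is iterative refinement: solve in low precision, compute the residual in high precision, and solve again for a correction. The sketch below illustrates that idea under stated assumptions and is not necessarily the exact scheme evaluated in the paper. Single precision is emulated by rounding through `struct`, and the helper names (`to_f32`, `solve_low_precision`, `mixed_precision_solve`) and the 3x3 test system are hypothetical.

```python
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def solve_low_precision(a, b):
    """Gaussian elimination with partial pivoting; every result is rounded to float32."""
    n = len(b)
    m = [[to_f32(v) for v in row] + [to_f32(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = to_f32(m[row][col] / m[col][col])
            for k in range(col, n + 1):
                m[row][k] = to_f32(m[row][k] - to_f32(factor * m[col][k]))
    x = [0.0] * n
    for row in reversed(range(n)):  # back substitution
        acc = m[row][n]
        for k in range(row + 1, n):
            acc = to_f32(acc - to_f32(m[row][k] * x[k]))
        x[row] = to_f32(acc / m[row][row])
    return x


def residual(a, b, x):
    """Residual b - A x, evaluated in double precision."""
    return [bi - sum(aij * xj for aij, xj in zip(row, x)) for row, bi in zip(a, b)]


def mixed_precision_solve(a, b, iterations=5):
    """Low-precision solves wrapped in a double-precision correction loop."""
    x = solve_low_precision(a, b)
    for _ in range(iterations):
        r = residual(a, b, x)              # high precision
        d = solve_low_precision(a, r)      # low precision
        x = [xi + di for xi, di in zip(x, d)]  # update kept in double
    return x


# Hypothetical well-conditioned test system with a known solution.
a = [[4.0, 1.0, 0.5], [1.0, 3.0, 0.25], [0.5, 0.25, 2.0]]
x_true = [0.1, 0.2, 0.3]
b = [sum(aij * xj for aij, xj in zip(row, x_true)) for row in a]

x_low = solve_low_precision(a, b)
x_mixed = mixed_precision_solve(a, b)
print("float32 only error:", max(abs(u - v) for u, v in zip(x_low, x_true)))
print("mixed error       :", max(abs(u - v) for u, v in zip(x_mixed, x_true)))
```

The expensive solves all run in the emulated low precision, which is the part a GPU or FPGA co-processor would take over, while only the cheap residual and update steps need double precision.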
Benchmarking and Implementation of Probability-Based Simulations on Programmable Graphics Cards
May 24th, 2004
This paper explores the plausibility of using the GPU for numerical simulations on structured grids (lattices). The paper (1) reviews previous work on using GPUs for non-graphics applications, (2) implements probability-based simulations on the GPU, namely the Ising and percolation models, (3) implements vector operation benchmarks for the GPU, and (4) compares CPU and GPU performance. The original contribution of this work is implementing Monte Carlo type simulations on the GPU. Such simulations have a wide area of applications. They are computationally intensive and, as shown in the paper, lend themselves naturally to implementation on GPUs, providing a computational speedup. A general conclusion from the results obtained is that moving computations from the CPU to the GPU is feasible, yielding good time and price performance for certain lattice computations. Preliminary results also show that it is feasible to use GPUs in parallel. (S. Tomov, M. McGuigan, R. Bennett, G. Smith, J. Spiletic. Benchmarking and Implementation of Probability-Based Simulations on Programmable Graphics Cards, to appear in Computers & Graphics.)
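To make the Ising simulation concrete, here is a minimal sequential Metropolis sketch on a 2D lattice with periodic boundaries. It is a CPU illustration in plain Python and not the paper's GPU implementation; the function names, the coupling J = 1, the Boltzmann constant set to 1, and the lattice size and temperature are assumptions.

```python
import math
import random


def total_energy(spins):
    """Energy of an n x n lattice with periodic boundaries (J = 1, no field)."""
    n = len(spins)
    energy = 0
    for i in range(n):
        for j in range(n):
            # Count each bond once: the neighbour below and the one to the right.
            energy -= spins[i][j] * (spins[(i + 1) % n][j] + spins[i][(j + 1) % n])
    return energy


def metropolis_sweep(spins, temperature, rng):
    """Attempt n*n single-spin flips with the Metropolis acceptance rule."""
    n = len(spins)
    for _ in range(n * n):
        i, j = rng.randrange(n), rng.randrange(n)
        neighbours = (spins[(i + 1) % n][j] + spins[(i - 1) % n][j]
                      + spins[i][(j + 1) % n] + spins[i][(j - 1) % n])
        delta_e = 2 * spins[i][j] * neighbours  # energy change if this spin flips
        if delta_e <= 0 or rng.random() < math.exp(-delta_e / temperature):
            spins[i][j] = -spins[i][j]


# Hypothetical run: 8 x 8 lattice, all spins up, temperature 2.0, fixed seed.
size = 8
lattice = [[1] * size for _ in range(size)]
initial_energy = total_energy(lattice)
rng = random.Random(0)
for _ in range(10):
    metropolis_sweep(lattice, 2.0, rng)
magnetization = sum(map(sum, lattice)) / (size * size)
print("initial energy:", initial_energy)
print("energy after sweeps:", total_energy(lattice))
print("magnetization per spin:", magnetization)
```

Each update reads only the four nearest neighbours. A parallel version would typically update non-adjacent sites simultaneously, for example in a checkerboard pattern, rather than one random site at a time.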