General-Purpose Computation Using Graphics Hardware
GPGPU stands for General-Purpose computation on GPUs. With the increasing programmability of commodity graphics processing units (GPUs), these chips can now perform more than the specific graphics computations for which they were designed. They have become powerful coprocessors, and their high speed makes them useful for a variety of applications. The goal of this page is to catalog the current and historical use of GPUs for general-purpose computation.
Geomerics, a new R&D company based in Cambridge, UK, has recently announced a real-time radiosity simulation running entirely on the GPU. The solution runs at up to 100 Hz on common graphics hardware and allows for fully dynamic lighting, including spotlights, projected texture or video lighting, and area lights. It integrates well with traditional modeling techniques such as normal mapping, and all lighting is performed in high dynamic range. Videos, screenshots, and further details of the simulation can be found on their website.
The irradiance caching algorithm is commonly used for fast global illumination since it provides high-quality rendering in a reasonable time. However, the algorithm relies on a central spatial data structure that is modified continuously during rendering, along with complex algorithms, which prevents it from being easily implemented on GPUs. This paper proposes a novel approach to global illumination using irradiance and radiance caches: radiance cache splatting. The method directly meets the processing constraints of graphics hardware since it avoids the need for complex data structures and algorithms, while the rendering quality remains identical to classical irradiance and radiance caching. This work will be presented at the Eurographics Symposium on Rendering 2005 and during the SIGGRAPH 2005 sketches. (Radiance Cache Splatting: A GPU-Friendly Global Illumination Algorithm. Pascal Gautron, Jaroslav Krivanek, Kadi Bouatouch, Sumanta Pattanaik. Proceedings of Eurographics Symposium on Rendering 2005)
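The splatting idea can be illustrated with a minimal CPU sketch. Rather than querying a spatial cache structure per pixel, each cached record is "splatted" onto the pixels inside its zone of influence, accumulating weighted irradiance and weight; a final pass divides the two. The geometry here is a flat 1D strip of pixels, and the simple distance weight is illustrative, not the paper's exact weighting function:

```python
def splat_records(records, pixel_positions):
    """records: list of (position, irradiance, radius). Returns irradiance per pixel."""
    acc = [0.0] * len(pixel_positions)    # accumulated weighted irradiance
    wsum = [0.0] * len(pixel_positions)   # accumulated weights
    for (p, E, R) in records:
        for i, x in enumerate(pixel_positions):   # pixels covered by this splat
            d = abs(x - p)
            if d >= R:                    # outside the record's zone of influence
                continue
            w = 1.0 / max(d / R, 1e-6)    # illustrative distance-based weight
            acc[i] += w * E
            wsum[i] += w
    # normalization pass: weighted average of contributing records
    return [a / w if w > 0 else 0.0 for a, w in zip(acc, wsum)]
```

On the GPU, each record would be drawn as a screen-space quad with additive blending playing the role of the accumulation loops.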
Caustics are complex patterns of shimmering light formed by reflective and refractive objects; for example, those formed on the floor of a swimming pool. Caustics Mapping is a physically based real-time caustics rendering algorithm. It utilizes the concept of backward ray tracing, but involves none of the expensive computations generally associated with ray tracing and similar techniques. The main advantage of caustics mapping is that it is extremely practical for games and other interactive applications because of its high frame rates. Furthermore, the algorithm runs entirely on graphics hardware, which leaves the CPU free for other computation. There is no pre-computation involved, and therefore fully dynamic geometry, lighting, and viewing directions are supported. In addition, there is no limitation on the topology of the receiver geometry; i.e., caustics can be formed on arbitrary surfaces. (Caustics Mapping: An Image-space Technique for Real-time Caustics. Musawir A. Shah and Sumanta Pattanaik. Technical Report, School of Engineering and Computer Science, University of Central Florida, CS TR 50-07, 07/29/2005 (Submitted for Publication))
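The gist of the approach can be sketched on the CPU under simplifying assumptions (a flat refractive surface, a planar receiver, and a directional light pointing straight down; all names are illustrative): rays from the light are refracted at the specular surface, intersected with the receiver, and the hit points accumulated additively into a caustic intensity map.

```python
import math

def refract(incident, normal, eta):
    """Snell refraction for unit vectors; returns None on total internal reflection."""
    cos_i = -sum(i * n for i, n in zip(incident, normal))
    k = 1.0 - eta * eta * (1.0 - cos_i * cos_i)
    if k < 0.0:
        return None
    return tuple(eta * i + (eta * cos_i - math.sqrt(k)) * n
                 for i, n in zip(incident, normal))

def splat_caustics(entry_points, normal, eta, floor_y, res):
    """Accumulate refracted-ray hits on the plane y=floor_y into a res x res map."""
    caustic_map = [[0.0] * res for _ in range(res)]
    down = (0.0, -1.0, 0.0)               # directional light shining straight down
    for (x, y, z) in entry_points:
        d = refract(down, normal, eta)
        if d is None or d[1] >= 0.0:      # ray never reaches the floor
            continue
        t = (floor_y - y) / d[1]          # ray/plane intersection distance
        hx, hz = x + t * d[0], z + t * d[2]
        u, v = int(hx * res) % res, int(hz * res) % res
        caustic_map[v][u] += 1.0          # additive blend, as on the GPU
    return caustic_map
```

In the actual image-space algorithm these steps run per-pixel in shaders, with the receiver found via its depth map rather than an analytic plane.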
Using the GPU to accelerate ray tracing may seem like a natural choice due to the highly parallel nature of the problem. However, determining the most versatile GPU data structure for scene storage and traversal is a challenge. In this paper, we introduce a new method for quick intersection of triangular meshes on the GPU. The method uses a threaded bounding volume hierarchy built from a geometry image, which can be efficiently traversed and constructed entirely on the GPU. This acceleration scheme is highly competitive with other GPU ray tracing methods, while allowing for both dynamic geometry and an efficient level of detail scheme at no extra cost. (Fast GPU Ray Tracing of Dynamic Meshes using Geometry Images Nathan A. Carr, Jared Hoberock, Keenan Crane, and John C. Hart. To appear in Proceedings of Graphics Interface 2006)
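The "threaded" hierarchy is what makes GPU traversal practical: each node stores precomputed links to the next node on a hit or a miss, so traversal becomes a simple loop with no stack. A minimal CPU sketch of this idea (node layout and names are illustrative, not the paper's exact scheme):

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Node:
    lo: Tuple[float, float, float]   # AABB min corner
    hi: Tuple[float, float, float]   # AABB max corner
    hit: Optional[int]               # next node index if the ray hits this box
    miss: Optional[int]              # next node index if it misses
    tri: Optional[int] = None        # triangle id for leaf nodes

def ray_hits_aabb(o, d, lo, hi):
    """Standard slab test; direction components assumed non-zero for brevity."""
    tmin, tmax = 0.0, float("inf")
    for i in range(3):
        t1 = (lo[i] - o[i]) / d[i]
        t2 = (hi[i] - o[i]) / d[i]
        tmin = max(tmin, min(t1, t2))
        tmax = min(tmax, max(t1, t2))
    return tmin <= tmax

def traverse(nodes, o, d):
    """Collect triangle ids whose leaf boxes the ray pierces, with no stack."""
    hits, i = [], 0
    while i is not None:
        n = nodes[i]
        if ray_hits_aabb(o, d, n.lo, n.hi):
            if n.tri is not None:        # leaf: record candidate triangle
                hits.append(n.tri)
            i = n.hit                    # descend (or, for a leaf, continue on)
        else:
            i = n.miss                   # skip this subtree entirely
    return hits
```

Because the loop has fixed, branch-light control flow and touches only a flat node array, it maps directly onto fragment programs, which (on the hardware of the time) had no stack.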
Recently, ray tracing on consumer-level graphics hardware has been introduced. So far, most published studies on this topic use the uniform grid spatial subdivision structure to reduce the number of ray/triangle intersection tests. For many types of scenes, however, a hierarchical acceleration structure is more appropriate. This thesis by Lars Ole Simonsen and Niels Thrane of the University of Aarhus compares GPU-based traversal of kd-trees and uniform grids with a novel bounding volume hierarchy traversal scheme. The three implementations are compared in terms of performance and usefulness on the GPU. The thesis concludes that on the GPU, the bounding volume hierarchy traversal technique is up to 9 times faster than its uniform grid and kd-tree implementations. Additionally, this technique proves the simplest to implement and the most memory efficient. (Lars Ole's Website or Direct link to thesis PDF.)
This dissertation by Tim Purcell of Stanford University discusses several topics relevant to GPGPU including a stream processor abstraction for GPUs, and GPU-based ray tracing and photon mapping algorithms. Much of this work has been reported on GPGPU before, but the description of the ray tracing work in particular is expanded and updated from previous papers with details about the Radeon 9700 ray tracer demonstrated at SIGGRAPH 2002. Included on the web page are links to the dissertation defense talk slides and movies of the various demos. (Ray Tracing on a Stream Processor, Timothy J. Purcell, Ph.D. Dissertation, March 2004.)
Fantasy Lab, a game developer located in the San Francisco Bay area, has announced its new game engine, which includes support for real-time global illumination and displacement-mapped subdivision surfaces. Videos on the company's website show global illumination on an animated subdivision-surface-based character. The global illumination solution for the videos is calculated in 3.3 milliseconds per frame (300 frames per second) on an NVIDIA GeForce Go 7900 GTX (a laptop GPU).
This paper by Carr et al. describes a method for computing subsurface scattering on the GPU. They use a multi-resolution meshed atlas and modern GPU programmability to devise a real-time GPU algorithm that can render semi-transparent objects with diffuse subsurface-scattered illumination under dynamic lighting and viewing conditions. (GPU Algorithms for Radiosity and Subsurface Scattering. Nathan A. Carr, Jesse D. Hall, and John C. Hart. Proceedings of Graphics Hardware 2003, July 2003.)
This tutorial explains how global illumination rendering methods can be implemented on Shader Model 3.0 GPUs. These algorithms do not follow the conventional local illumination model of the DirectX/OpenGL pipelines, but require global geometric or illumination information when shading a point. In addition to the theory and state of the art of these approaches, the tutorial goes into the details of several algorithms, including mirror reflection, refraction, caustics, diffuse/glossy indirect illumination, precomputation-aided global illumination for surface and volumetric models, and obscurances and tone mapping, also giving their GPU implementations in the HLSL or Cg language. (Laszlo Szirmay-Kalos, Laszlo Szecsi, Mateu Sbert: GPUGI: Global Illumination Effects on the GPU. Eurographics 2006 Tutorial.)
We received news simultaneously from the developers of two new GPU ray tracers; both are graduate-level thesis projects. The first, called GPU-RT, developed by Martin Christen, supports .3DS format meshes and multiple materials, and implements acceleration data structures. GPU-RT runs on NVIDIA GeForce 6 Series GPUs under D3D/HLSL and OpenGL/GLSL, and is available on SourceForge.net. The other project, "Ray Tracing on Programmable Graphics Hardware", is by Filip Karlsson and Carl Johan Ljungstedt of Chalmers University of Technology. The thesis describes, among other things, how proximity clouds can be used to accelerate ray tracing on the GPU. (1. GPU-RT, Diploma Thesis by Martin Christen. 2. "Ray Tracing on Programmable Graphics Hardware", Master's Thesis by Filip Karlsson and Carl Johan Ljungstedt.)
This paper by Larsen et al. at the Technical University of Denmark introduces a fast GPU-accelerated technique for simulating photon mapping. Each step in the photon mapping algorithm is executed either on the CPU or on the GPU, depending on which processor is more appropriate for the task. The indirect illumination is calculated using a new GPU-accelerated final gathering method. Caustic photons are traced on the CPU, then drawn as points into the framebuffer and finally filtered using the GPU. Both diffuse and non-diffuse surfaces are handled by calculating the direct illumination on the GPU and performing the photon tracing on the CPU. (Simulating Photon Mapping for Real-time Applications. Bent D. Larsen, Niels J. Christensen. To appear at Eurographics Symposium on Rendering, 2004.)
This report by Coombe et al. describes a technique for computing radiosity, including an adaptive subdivision of the model, using graphics hardware. The technique uses floating point textures and fragment programs to perform progressive refinement using a novel implementation of hemicube radiosity on the GPU. (Radiosity on Graphics Hardware. Greg Coombe, Mark J. Harris, Anselmo Lastra. UNC TR03-020. June, 2003.)
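The progressive refinement loop that the hemicube implementation accelerates is simple to state: repeatedly pick the patch with the most unshot radiosity and distribute it to all other patches. A minimal sketch, with form factors supplied as a plain matrix (on the GPU they would come from hemicube renderings; all names are illustrative):

```python
def progressive_radiosity(emission, reflectance, F, iterations):
    """F[i][j]: form factor from patch i to patch j. Returns per-patch radiosity."""
    n = len(emission)
    radiosity = emission[:]          # total radiosity accumulated per patch
    unshot = emission[:]             # radiosity not yet distributed
    for _ in range(iterations):
        i = max(range(n), key=lambda k: unshot[k])   # brightest shooter first
        shoot, unshot[i] = unshot[i], 0.0
        for j in range(n):
            if j == i:
                continue
            dB = reflectance[j] * F[i][j] * shoot    # energy received by patch j
            radiosity[j] += dB
            unshot[j] += dB          # received energy must later be re-shot
    return radiosity
```

Shooting from the brightest patch first is what gives progressive refinement its fast early convergence, and the adaptive subdivision in the report refines patches where the radiosity gradient is high.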
This paper by Purcell et al. presents a photon mapping algorithm that runs entirely on the GPU. The paper presents details for tracing photons, building the photon map, and computing the radiance estimate at each pixel using a k-nearest neighbor search. (Photon Mapping on Programmable Graphics Hardware. Timothy J. Purcell, Craig Donner, Mike Cammarano, Henrik Wann Jensen, and Pat Hanrahan. Proceedings of Graphics Hardware 2003, July 2003.)
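The radiance estimate itself is the standard photon-map density estimate: gather the k nearest photons around a shading point and divide their summed power by the area of the enclosing disc. A brute-force CPU sketch (the paper's contribution is doing the kNN search efficiently on the GPU; this only illustrates the estimate, and the scalar result stands in for the full BRDF-weighted sum):

```python
import math

def radiance_estimate(photons, x, k):
    """photons: list of ((px, py, pz), power). Density estimate at point x."""
    # brute-force kNN: sort all photons by squared distance to x
    by_dist = sorted(photons,
                     key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
    nearest = by_dist[:k]
    # squared radius of the disc enclosing the k nearest photons
    r2 = sum((a - b) ** 2 for a, b in zip(nearest[-1][0], x))
    total_power = sum(power for _, power in nearest)
    return total_power / (math.pi * r2)   # power per unit area
```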