The proceedings of the workshop “General-Purpose GPU Computing: Practice And Experience” held at SuperComputing 2006 are now posted. The proceedings include PDFs of the workshop presentations and posters. (http://www.gpgpu.org/sc2006/workshop/)
This article at HPC Wire by Matthew Papakipos, CTO of PeakStream Technologies, discusses the convergence of CPU and GPU architectures, the programming challenges that these architectural changes pose, and possible solutions to these challenges.
This Ph.D. dissertation by Aaron Lefohn at the University of California, Davis describes the Glift GPU data structure abstraction and its application to both GPU-based data-parallel and interactive rendering algorithms. The applications include octree 3D painting, adaptive shadow maps, resolution matched shadow maps, heat-diffusion depth-of-field, and a GPU-based direct solver for tridiagonal linear systems. While much of this work has been posted previously, this dissertation contains a more in-depth discussion of the Glift data structure library and introduces several GPGPU and rendering algorithms that are not yet published. This dissertation demonstrates that a data structure abstraction for GPUs can simplify the description of new and existing data structures, stimulate development of complex GPU algorithms, and perform equivalently to hand-coded implementations. The dissertation also presents a case that future interactive rendering solutions will be an inseparable mix of general-purpose, data-parallel algorithms and traditional graphics programming. (Aaron Lefohn, “Glift: Generic Data Structures for Graphics Hardware”, Ph.D. dissertation, Computer Science Department, University of California Davis, September 2006.)
This Pixar Animation Studios Technical Report by Kass, Lefohn, and Owens describes a GPU-based data-parallel direct tridiagonal linear solver. To the authors’ knowledge, this is the first reported direct, linear-time tridiagonal GPU solver. The solver is used to implement a new heat-diffusion-based depth-of-field preview algorithm. The paper describes solving thousands of tridiagonal systems, each with hundreds of elements, on the GPU at interactive rendering rates. The alternating direction implicit solution gives rise to separable, spatially varying recursive (infinite-impulse response, IIR) filters that can compute large-kernel convolutions in constant time per pixel while respecting the boundaries between in-focus and out-of-focus objects. Recursive filters have traditionally been viewed as problematic for GPUs, but by using the well-established method of cyclic reduction of tridiagonal systems, the authors are able to parallelize the computation and implement an efficient solution in terms of GPGPU primitives. (Michael Kass, Aaron Lefohn, and John Owens. Interactive Depth of Field Using Simulated Diffusion on the GPU, Technical Report #06-01, Pixar Animation Studios, January 2006.)
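For readers unfamiliar with cyclic reduction, the following is a minimal CPU-side sketch of the idea in Python. It is not the authors' GPU implementation: the function name `solve_tridiagonal_cr` and the array layout are assumptions made for illustration, and the code performs no pivoting, so it assumes a well-conditioned (for example, diagonally dominant) system. Each reduction level eliminates the odd-indexed unknowns, leaving a tridiagonal system of half the size. The loop iterations within a level are independent of one another, which is the property that lets the method map onto data-parallel hardware.

```python
def solve_tridiagonal_cr(a, b, c, d):
    """Solve a tridiagonal system by recursive cyclic reduction.

    Row i reads: a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i].
    a[0] and c[-1] are ignored. No pivoting is done, so this sketch
    assumes a system (e.g. diagonally dominant) where b never hits zero.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Forward reduction: for every even row, use its two odd neighbours
    # to eliminate x[i-1] and x[i+1]. Iterations are independent.
    ra, rb, rc, rd = [], [], [], []
    for i in range(0, n, 2):
        na, nb, nc, nd = 0.0, b[i], 0.0, d[i]
        if i - 1 >= 0:
            alpha = a[i] / b[i - 1]
            na = -alpha * a[i - 1]      # new coupling to x[i-2]
            nb -= alpha * c[i - 1]
            nd -= alpha * d[i - 1]
        if i + 1 < n:
            gamma = c[i] / b[i + 1]
            nb -= gamma * a[i + 1]
            nd -= gamma * d[i + 1]
            if i + 2 < n:
                nc = -gamma * c[i + 1]  # new coupling to x[i+2]
        ra.append(na)
        rb.append(nb)
        rc.append(nc)
        rd.append(nd)

    # Solve the half-size system for the even-indexed unknowns.
    x = [0.0] * n
    x[0::2] = solve_tridiagonal_cr(ra, rb, rc, rd)

    # Back-substitution: recover each odd unknown from its own row.
    for i in range(1, n, 2):
        rhs = d[i] - a[i] * x[i - 1]
        if i + 1 < n:
            rhs -= c[i] * x[i + 1]
        x[i] = rhs / b[i]
    return x
```

The total work across all levels is proportional to the number of unknowns, since each level handles half as many rows as the one before it. A GPU version would typically replace the recursion with a sequence of passes over progressively smaller arrays.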
An article by David Strom in Information Week, “5 Disruptive Technologies To Watch in 2007”, includes “Advanced Graphics Processing” and specifically mentions GPGPU and NVIDIA CUDA. “In some cases, the new graphics cards being developed by NVIDIA and ATI (now a part of AMD) will have a bigger impact on computational processing than the latest chips from Intel and AMD,” writes Strom.
Videos of all presentations in the GPGPU Tutorials held at SIGGRAPH 2004 and SIGGRAPH 2005 are online. These courses are an excellent resource for beginners in GPGPU programming. SIGGRAPH 2004 GPGPU Course (Course Web Page). SIGGRAPH 2005 GPGPU Course (Course Web Page).
We have added links to some great introductory GPGPU tutorials to the Developer Page. These tutorials, written by Dominik Göddeke from Dortmund University, cover basic GPGPU concepts, parallel reductions, and fast data transfers.
With their upcoming publication in Computer Graphics Forum, Owens et al. have revised their 2005 comprehensive survey of the history and state of the art in GPGPU. It describes, summarizes, and analyzes the latest research in mapping general-purpose computation to graphics hardware. The report begins with the technical motivations that underlie general-purpose computation on graphics processors (GPGPU) and describes the hardware and software developments that have led to the recent interest in this field. The authors describe the techniques used in mapping general-purpose computation to graphics hardware, and survey and categorize the latest developments in general-purpose application development on graphics hardware. (A Survey of General-Purpose Computation on Graphics Hardware. John D. Owens, David Luebke, Naga Govindaraju, Mark Harris, Jens Krüger, Aaron E. Lefohn, Timothy J. Purcell, in “Computer Graphics Forum”, Volume 26, 2007. To appear.)
Brahma is an open source shader meta-programming framework for the .NET platform that generates shader code from IL at runtime, enabling developers to write GPU code in C# (or any .NET language). The library is primarily meant to handle GPU-based rendering and computational tasks, and eliminates a great deal of the glue code that is often required in GPU programming. Since Brahma is a set of interfaces and base classes, it can be implemented for any combination of API and shading language. At this time there is a working shader generation path for Managed DirectX/HLSL. (http://brahma.ananthonline.net)
This tutorial explains how global illumination rendering methods can be implemented on Shader Model 3.0 GPUs. These algorithms do not follow the conventional local illumination model of DirectX/OpenGL pipelines, but require global geometric or illumination information when shading a point. In addition to the theory and state of the art of these approaches, the tutorial goes into the details of a few algorithms, including mirror reflection, refraction, caustics, diffuse/glossy indirect illumination, precomputation-aided global illumination for surface and volumetric models, obscurances, and tone mapping, and gives their GPU implementations in HLSL or Cg. (Laszlo Szirmay-Kalos, Laszlo Szecsi, Mateu Sbert: GPUGI: Global Illumination Effects on the GPU. Eurographics 2006 Tutorial.)