This Ph.D. dissertation by Aaron Lefohn at the University of California, Davis describes the Glift GPU data structure abstraction and its application to both GPU-based data-parallel and interactive rendering algorithms. The applications include octree 3D painting, adaptive shadow maps, resolution matched shadow maps, heat-diffusion depth-of-field, and a GPU-based direct solver for tridiagonal linear systems. While much of this work has been posted previously, this dissertation contains a more in-depth discussion of the Glift data structure library and introduces several GPGPU and rendering algorithms that are not yet published. This dissertation demonstrates that a data structure abstraction for GPUs can simplify the description of new and existing data structures, stimulate development of complex GPU algorithms, and perform equivalently to hand-coded implementations. The dissertation also presents a case that future interactive rendering solutions will be an inseparable mix of general-purpose, data-parallel algorithms and traditional graphics programming. (Aaron Lefohn, “Glift: Generic Data Structures for Graphics Hardware”, Ph.D. dissertation, Computer Science Department, University of California Davis, September 2006.)
Ph.D. Dissertation: Glift Generic GPU Data Structures, by Aaron Lefohn
January 18th, 2007Interactive Depth of Field Using Simulated Diffusion on a GPU
January 18th, 2007This Pixar Animation Studios Technical Report by Kass, Lefohn, and Owens describes a GPU-based data-parallel direct tridiagonal linear solver. To the authors’ knowledge, this is the first reported direct, linear-time tridiagonal GPU solver. The solver is used to implement a new heat-diffusion-based depth-of-field preview algorithm; and the paper describes solving thousands of tridiagonal systems, each with hundreds of elements, on the GPU at interactive rendering rates. The alternating direction implicit solution gives rise to separable spatially varying recursive (infinite-impulse response, IIR) filters that can compute large-kernel convolutions in constant time per pixel while respecting the boundaries between in-focus and out-of-focus objects. Recursive filters have traditionally been viewed as problematic for GPUs, but using the well-established method of cyclic reduction of tridiagonal systems, the authors are able to parallelize the computation and implement an efficient solution in terms of GPGPU primitives. (Michael Kass, Aaron Lefohn, and John Owens. Interactive Depth of Field Using Simulated Diffusion on the GPU, Technical Report #06-01, Pixar Animation Studios, January 2006.)
GPU-ABiSort: Optimal Parallel Sorting on Stream Architectures
August 14th, 2006This paper presents a novel approach for parallel sorting on stream processing architectures. It is based on adaptive bitonic sorting. For sorting n values utilizing p stream processor units, this approach achieves the optimal time complexity O((n log n)/p). This approach is competitive with common sequential sorting algorithms not only from a theoretical viewpoint, it is also very fast from a practical viewpoint. The paper presents an implementation on modern programmable graphics hardware (GPUs). On recent GPUs this optimal parallel sorting approach has shown to be remarkably faster than sequential sorting on the CPU, and it is also faster than previous non-optimal sorting approaches on the GPU for sufficiently large input sequences. (GPU-ABiSort: Optimal Parallel Sorting on Stream Architectures Alexander Gress and Gabriel Zachmann. Proc. 20th IEEE Int’l Parallel and Distributed Processing Symposium (IPDPS), 2006.)
A work-efficient step-efficient prefix-sum algorithm
June 22nd, 2006This extended abstract by Sengupta et al. presents a work-efficient step-efficient prefix-sum algorithm. This algorithm achieves a three to four fold speedup over the step-efficient prefix-sum algorithm presented by Daniel Horn in GPU Gems 2. It can also be tuned to efficiently run on future hardware which would have a higher degree of parallelism. (A work-efficient step-efficient prefix-sum algorithm. Shubhabrata Sengupta, Aaron E. Lefohn, John D. Owens in in Proceedings of the 2006 Workshop on Edge Computing Using New Commodity Architectures.)