Eight GPGPU Papers Presented at ICCS 2006 GPGPU Workshop

June 20th, 2006

Abstracts, citations and links to author homepages of eight papers on GPGPU presented at the ICCS conference, Reading, UK, May 2006, are available. Topics include genome sequencing, GPGPU languages, database operations, computational fluid dynamics, computer vision, computational geometry and neural networks. (http://www.mathematik.uni-dortmund.de/~goeddeke/iccs/papers.html)

Cholesky Decomposition and Linear Programming on a GPU

June 6th, 2006

Rapid evolution of GPUs in performance, architecture, and programmability provides general and scientific computational potential beyond their primary purpose, graphics processing. This work presents an efficient algorithm for solving symmetric and positive definite linear systems using the GPU. Using the decomposition algorithm and other basic building blocks for linear algebra on the GPU, the paper demonstrates a GPU-powered linear program solver based on a Primal-Dual Interior-Point Method. (Cholesky Decomposition and Linear Programming on a GPU, Jin Hyuk Jung, Scholarly Paper Directed by Dianne P. O’Leary, Department of Computer Science, University of Maryland, 2006.)
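
As a rough illustration of the underlying numerical step (not the paper's GPU implementation), the following CPU-side sketch solves a symmetric positive definite system by Cholesky factorization followed by forward and backward substitution. The function names `cholesky` and `solve_spd` are hypothetical and chosen only for this example.

```python
import math

def cholesky(a):
    """Return lower-triangular l with a = l * l^T (a must be symmetric positive definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: remove the contribution of columns already computed.
        s = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(s)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            l[i][j] = (a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))) / l[j][j]
    return l

def solve_spd(a, b):
    """Solve a x = b via l y = b (forward), then l^T x = y (backward)."""
    l = cholesky(a)
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x

# Example: 4x + 2y = 2, 2x + 3y = 1
solution = solve_spd([[4.0, 2.0], [2.0, 3.0]], [2.0, 1.0])
```

An interior-point solver of the kind the abstract mentions would call a routine like this repeatedly on systems that change from iteration to iteration; that outer loop is not shown here.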

GPUFFTW: High Performance GPU-based FFT Library

May 30th, 2006

This paper by Govindaraju et al. describes a high-performance FFT algorithm on GPUs. The algorithm is highly tuned for GPUs using memory optimizations. It further improves performance using pipelining strategies. In practice, it is able to achieve 4x higher computational performance on a $500 NVIDIA GPU than optimized single precision FFT algorithms on high-end CPUs costing $1500. (“Efficient memory model for scientific algorithms on graphics processors”, Naga Govindaraju, Scott Larsen, Jim Gray and Dinesh Manocha, UNC Tech. Report 2006)

Universal employment of modern graphics hardware by the example of the optimization of a speech recognition system

May 24th, 2006

In this master's thesis by Christian Fenzl (completed at the University of Applied Sciences in Darmstadt), an easy-to-use framework is implemented, with additional demos that show the main concepts of GPGPU. A further demo implementation calculates scores on feature vectors used in a speech recognition system, about 12 times faster than an equivalent CPU implementation. An application with several demos using the framework, including the fully documented source code (English) and the paper itself (German), is available. The framework code is recommended especially for GPGPU beginners who want to study the OpenGL and DirectX code, which shows how GPGPU programs can be developed.

Floating-Point Computation with Just Enough Accuracy

May 24th, 2006

This paper by Dietz et al. from ICCS 2006 details and microbenchmarks the use of pairs of native precision values to obtain higher accuracy results on DSP, SWAR, and GPU hardware. It also discusses a way to speculatively use lower precision, recomputing with higher precision only when accuracy constraints are not met. (Floating-Point Computation with Just Enough Accuracy)
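
The idea of representing one value as a pair of native precision numbers can be sketched with the classic error-free addition (often called two-sum). This is a minimal illustration, not the paper's code: it uses Python's double precision floats as the "native" type, whereas the paper targets the native formats of DSP, SWAR, and GPU hardware, and the names `two_sum` and `pair_add` are hypothetical.

```python
def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and s + err equal to a + b exactly."""
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err

def pair_add(x, y):
    """Add two values stored as (hi, lo) pairs and return a renormalized pair."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    hi = s + e
    lo = e - (hi - s)  # part of e that did not fit into hi
    return hi, lo

# A tiny term that plain addition would lose is kept in the low word.
x = pair_add((1.0, 0.0), (1e-20, 0.0))
y = pair_add(x, (-1.0, 0.0))
recovered = y[0] + y[1]         # 1e-20
plain = (1.0 + 1e-20) - 1.0     # 0.0
```

The low word carries the rounding error of the high word, so the pair behaves like a value with roughly twice the native mantissa width at the cost of several native operations per addition.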

GPUTeraSort: High Performance Graphics Coprocessor Sorting for Large Database Management

April 4th, 2006

GPUTeraSort sorts billion-record wide-key databases using the data and task parallelism on the graphics processing unit (GPU) to perform memory-intensive and compute-intensive tasks while the CPU performs I/O and resource management. It exploits both the high-bandwidth GPU memory interface and the lower-bandwidth CPU main memory interface to achieve higher aggregate memory bandwidth than purely CPU-based algorithms. It also pipelines disk transfers to achieve near-peak I/O performance. GPUTeraSort is a two-phase task pipeline: (1) read disk, build keys, sort using the GPU, generate runs, write disk, and (2) read, merge, write. We tested the performance of GPUTeraSort on billion-record files using the standard Sort benchmark. In practice, a 3 GHz Pentium IV PC with a $265 NVIDIA 7800 GT GPU is significantly faster than optimized CPU-based algorithms on much faster processors, sorting 60GB for a penny, the best reported PennySort price-performance. These results suggest that a GPU co-processor can significantly improve performance on large data processing tasks. (GPUTeraSort: High Performance Graphics Coprocessor Sorting for Large Database Management. Naga K. Govindaraju, Jim Gray, Ritesh Kumar, and Dinesh Manocha. Proceedings of ACM SIGMOD 2006.)
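
The two-phase structure described above (generate sorted runs, then merge them) can be sketched as follows. This is only an outline under strong simplifying assumptions: in-memory lists stand in for disk files, Python's built-in `sorted` stands in for the GPU sort, and the function names `generate_runs` and `merge_runs` are hypothetical.

```python
import heapq

def generate_runs(records, run_size, key):
    """Phase 1: split the input into chunks and sort each chunk into a run."""
    runs = []
    for start in range(0, len(records), run_size):
        chunk = records[start:start + run_size]   # "read disk"
        runs.append(sorted(chunk, key=key))       # stand-in for the GPU sort
    return runs                                   # "write disk"

def merge_runs(runs, key):
    """Phase 2: stream all sorted runs through a k-way merge."""
    return list(heapq.merge(*runs, key=key))

records = [(5, "e"), (1, "a"), (4, "d"), (2, "b"), (3, "c")]
by_key = lambda rec: rec[0]
runs = generate_runs(records, run_size=2, key=by_key)
result = merge_runs(runs, key=by_key)
```

In the real system each phase is pipelined so that disk reads, key building, sorting, and disk writes overlap; the sketch runs the stages one after another and does not model that overlap.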

Fast GPU Ray Tracing of Dynamic Meshes using Geometry Images

March 17th, 2006

Using the GPU to accelerate ray tracing may seem like a natural choice due to the highly parallel nature of the problem. However, determining the most versatile GPU data structure for scene storage and traversal is a challenge. In this paper, we introduce a new method for quick intersection of triangular meshes on the GPU. The method uses a threaded bounding volume hierarchy built from a geometry image, which can be efficiently traversed and constructed entirely on the GPU. This acceleration scheme is highly competitive with other GPU ray tracing methods, while allowing for both dynamic geometry and an efficient level of detail scheme at no extra cost. (Fast GPU Ray Tracing of Dynamic Meshes using Geometry Images. Nathan A. Carr, Jared Hoberock, Keenan Crane, and John C. Hart. To appear in Proceedings of Graphics Interface 2006.)

Jump Flooding in GPU with Applications to Voronoi Diagram and Distance Transform

March 12th, 2006

This paper studies jump flooding as an algorithmic paradigm in general-purpose computation with GPUs. As an example application of jump flooding, the paper discusses a constant time algorithm on the GPU to compute an approximation to the Voronoi diagram of a given set of seeds in a 2D grid. The errors due to the differences between the approximation and the actual Voronoi diagram are hardly noticeable to the naked eye in all presented experiments. The same approach can also compute in constant time an approximation to the distance transform of a set of seeds in a 2D grid. In practice, such a constant time algorithm is useful to many interactive applications involving, for example, rendering and image processing. Besides the experimental evidence, this paper also confirms quantitatively the effectiveness of jump flooding by analyzing the occurrences of errors. The analysis is a showcase of insights into the jump flooding paradigm, and may be of independent interest to other applications of jump flooding. (Jump Flooding in GPU with Applications to Voronoi Diagram and Distance Transform. Guodong Rong and Tiow-Seng Tan. To appear in 2006 SIGGRAPH Symposium on Interactive 3D Graphics and Games [I3D 2006].)
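
A minimal CPU sketch of the jump flooding pattern is given below; it is an illustration, not the authors' implementation. Each pass lets every cell inspect the eight neighbors at the current step length plus itself and keep the closest seed seen so far, with the step length halving from pass to pass, so the number of passes depends on the grid size and not on the number of seeds. The function name `jump_flood` and the list-of-lists grid are assumptions made for this example; on a GPU each pass would be one rendering pass over a texture.

```python
def jump_flood(size, seeds):
    """Approximate Voronoi diagram on a size x size grid.

    seeds is a list of (x, y) cells; returns owner[y][x] = index of the
    (approximately) nearest seed.
    """
    owner = [[None] * size for _ in range(size)]
    for idx, (sx, sy) in enumerate(seeds):
        owner[sy][sx] = idx

    def dist2(x, y, idx):
        sx, sy = seeds[idx]
        return (x - sx) ** 2 + (y - sy) ** 2

    # Start at half the smallest power of two that covers the grid.
    step = 1
    while step < size:
        step *= 2
    step //= 2

    while step >= 1:
        # Read from the previous pass, write to a fresh buffer (ping-pong).
        new_owner = [row[:] for row in owner]
        for y in range(size):
            for x in range(size):
                best = owner[y][x]
                for dy in (-step, 0, step):
                    for dx in (-step, 0, step):
                        nx, ny = x + dx, y + dy
                        if 0 <= nx < size and 0 <= ny < size:
                            cand = owner[ny][nx]
                            if cand is not None and (
                                best is None
                                or dist2(x, y, cand) < dist2(x, y, best)
                            ):
                                best = cand
                new_owner[y][x] = best
        owner = new_owner
        step //= 2
    return owner

diagram = jump_flood(8, [(1, 1), (6, 6)])
```

Storing the squared distance to the chosen seed instead of its index would give the approximate distance transform mentioned in the abstract.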

Glift: Generic, Efficient, Random-Access GPU Data Structures

February 10th, 2006

This paper presents Glift, an abstraction and generic template library for defining complex, random-access graphics processor (GPU) data structures. Like modern CPU data structure libraries, Glift enables GPU programmers to separate algorithms from data structure definitions, thereby greatly simplifying algorithmic development and enabling reusable and interchangeable data structures. We characterize a large body of previously published GPU data structures in terms of our abstraction and present several new GPU data structures. The structures, a stack, quadtree, and octree, are explained using simple Glift concepts and implemented using reusable Glift components. We also describe two applications of these structures not previously demonstrated on GPUs: adaptive shadow maps and octree 3D paint. Lastly, we show that our example Glift data structures perform comparably to handwritten implementations while requiring only a fraction of the programming effort. (Glift: Generic, Efficient, Random-Access GPU Data Structures. Aaron E. Lefohn, Joe Kniss, Robert Strzodka, Shubhabrata Sengupta, John D. Owens. ACM Transactions on Graphics, 25(1), Jan. 2006.)

Dynamic Particle Coupling for GPU-based Fluid Simulation

February 10th, 2006

This paper by Kolb and Cuntz from the computer graphics group of the University of Siegen describes a 3D flow simulation approach on the GPU. The flow simulation is modeled using the so-called Smoothed Particle Hydrodynamics approach. The presented technique combines the particle simulation, presented by Kipfer et al. and Kolb et al. at Graphics Hardware 2004, with a grid-based approach. 3D grids are used to store intermediate flow quantities that are distributed by the particles in their neighborhood. Multiple render targets (MRT) are used to compute the contribution of a single particle to four texture slices simultaneously. Blending the individual contributions of all particles in 16-bit textures yields the final 3D force fields used for the simulation of particle motion. (Dynamic Particle Coupling for GPU-based Fluid Simulation)
