This paper presents an extensible system for interactively rendering multiple types of ray-casted objects in a manner compatible with pre-existing rendering engines. The sample implementation includes support for general quadrics and volumetric isosurfaces. It also includes a high-speed sphere renderer and a standard triangle-rendering pipeline. The system is designed so that most algorithms written for the existing raster engine can be added with minimal overhead and coding effort. The authors have demonstrated shadowing using the shadow-map algorithm. (“Beyond Triangles: A Simple Framework For Hardware-Accelerated Non-Triangular Primitives”, to be submitted for publication.)
These works from the Database Systems Lab at UC Santa Barbara describe how a graphics processor can be used effectively to accelerate spatial database (GIS database) operations. Spatial database operations, especially those that involve polygon datasets, are known to be computationally expensive. Sun et al. describe a novel hardware/software co-processing technique that uses basic features of a GPU to reduce the spatial query processing cost. Experimental evaluation shows that their hardware-based approach can significantly outperform leading software-based techniques. (Hardware Acceleration for Spatial Selections and Joins. Chengyu Sun, Divyakant Agrawal, Amr El Abbadi. Proceedings of SIGMOD 2003.) However, this evaluation is done in a stand-alone setting, without the indices, preprocessing, or other optimizations available in a database. Bandi et al. extend Sun et al.’s work and integrate the hardware-based technique into a popular commercial database. Rigorous experimentation over real-life data sets shows that the hardware-based approach is very effective and can be complementary to the optimizations available in a commercial database setting. (Hardware Acceleration in Commercial Databases: A Case Study of Spatial Operations. Nagender Bandi, Chengyu Sun, Divyakant Agrawal, Amr El Abbadi. To appear in VLDB 2004.)
Modern GPUs perform floating point math and read data from off-chip memory at rates roughly five times that of a fast Pentium 4 CPU. However, the performance of algorithms for computing dense matrix-matrix products on GPUs has lagged behind that of good CPU implementations. In this paper, we show why this result is not an artifact of poorly designed algorithms, and explain how present-day graphics architectures are highly inefficient for computations such as matrix-matrix multiplication that involve significant data reuse. (Understanding the Efficiency of GPU Algorithms for Matrix-Matrix Multiplication. Kayvon Fatahalian, Jeremy Sugerman, and Pat Hanrahan.)
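The role of data reuse can be illustrated with a simple counting model. The sketch below is not taken from the paper: the function name `flops_per_word` and the assumption that every output tile streams its full strips of the two input matrices from memory are illustrative choices. It counts how many floating point operations are performed per word fetched when an n x n product is computed in square output tiles.

```python
def flops_per_word(n, block):
    """Flops per word fetched for an n x n matrix product computed
    in block x block output tiles (hypothetical counting model)."""
    if block <= 0 or n % block != 0:
        raise ValueError("block must be positive and divide n")
    tiles = (n // block) ** 2
    # Each output tile streams a block x n strip of A and an n x block strip of B.
    words_per_tile = 2 * block * n
    # Each of the block * block outputs needs n multiplies and n adds.
    flops_per_tile = 2 * block * block * n
    return (tiles * flops_per_tile) / (tiles * words_per_tile)
```

Under this model, a tile size of 1 (no reuse) gives one flop per word fetched, while a tile size of 32 gives 32 flops per word. The ratio grows only when fetched data can be held close to the arithmetic units and reused.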
This paper by Larsen et al. at the Technical University of Denmark introduces a fast GPU-accelerated technique for simulating photon mapping. Each step of the photon mapping algorithm is executed on either the CPU or the GPU, depending on which processor is more appropriate for the task. The indirect illumination is calculated using a new GPU-accelerated final gathering method. Caustic photons are traced on the CPU, drawn as points in the framebuffer, and finally filtered on the GPU. Both diffuse and non-diffuse surfaces are handled by calculating the direct illumination on the GPU and the photon tracing on the CPU. (Simulating Photon Mapping for Real-time Applications. Bent D. Larsen, Niels J. Christensen. To appear at Eurographics Symposium on Rendering, 2004.)
This paper by Govindaraju et al. describes new algorithms for performing fast computation of several common database operations on commodity graphics processors. Specifically, the paper considers operations such as conjunctive selections, aggregations, and semi-linear queries, which are essential computational components of typical database, data warehousing, and data mining applications. The proposed algorithms take into account some of the limitations of the programming model of current GPUs and perform no data rearrangements. These algorithms have been implemented on a programmable GPU (e.g. NVIDIA’s GeForce FX 5900) and applied to databases consisting of up to a million records. The paper compares their performance with an optimized CPU-based implementation. The experiments indicate that the graphics processor available on commodity computer systems is an effective coprocessor for performing database operations. (Fast Database Operations using Graphics Processors. Naga K. Govindaraju, Brandon Lloyd, Wei Wang, Ming C. Lin, Dinesh Manocha to appear at SIGMOD 2004.)
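As a rough CPU-side illustration of a conjunctive selection that performs no data rearrangement, the sketch below evaluates one predicate per pass and combines the results in a per-record mask. It is not the paper's GPU algorithm: the names `conjunctive_select` and `count_selected` and the dictionary record layout are assumptions made for the example.

```python
def conjunctive_select(records, predicates):
    """Return a boolean mask marking records that satisfy every predicate."""
    mask = [True] * len(records)
    for predicate in predicates:
        # One pass per predicate; each record is tested independently,
        # and records are never moved or compacted.
        mask = [keep and predicate(rec) for keep, rec in zip(mask, records)]
    return mask


def count_selected(mask):
    """COUNT aggregation over the selection mask."""
    return sum(mask)


# Hypothetical records and predicates, for illustration only.
records = [
    {"age": 25, "salary": 40000},
    {"age": 41, "salary": 90000},
    {"age": 37, "salary": 30000},
]
mask = conjunctive_select(
    records,
    [lambda r: r["age"] > 30, lambda r: r["salary"] > 50000],
)
```

Because each record is tested independently in every pass, the per-record work maps naturally onto data-parallel hardware.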
This paper explores the plausibility of using the GPU for numerical simulations on structured grids (lattices). The paper (1) reviews previous work on using GPUs for non-graphics applications, (2) implements probability-based simulations on the GPU, namely the Ising and percolation models, (3) implements vector operation benchmarks for the GPU, and (4) compares CPU and GPU performance. The original contribution of this work is implementing Monte Carlo type simulations on the GPU. Such simulations have a wide area of applications. They are computationally intensive and, as shown in the paper, lend themselves naturally to implementation on GPUs, providing a computational speedup. A general conclusion from the results obtained is that moving computations from the CPU to the GPU is feasible, yielding good time and price performance for certain lattice computations. Preliminary results also show that it is feasible to use GPUs in parallel. (S. Tomov, M. McGuigan, R. Bennett, G. Smith, J. Spiletic. Benchmarking and Implementation of Probability-Based Simulations on Programmable Graphics Cards. To appear in Computers & Graphics.)
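To show why lattice Monte Carlo simulations suit data-parallel hardware, here is a minimal CPU sketch of a Metropolis sweep for the 2D Ising model using a checkerboard ordering. It is an illustration only, not the paper's implementation: the function name, the coupling J = 1, the periodic boundaries, and the requirement of an even lattice size are all assumptions of this sketch.

```python
import math
import random


def checkerboard_sweep(spins, beta, rng):
    """One Metropolis sweep over a periodic n x n lattice of +1/-1 spins.

    Assumes n is even so that the two checkerboard colors stay independent
    across the periodic boundary.
    """
    n = len(spins)
    for color in (0, 1):
        # Sites of one color have only neighbors of the other color, so a GPU
        # could update them all at once; this loop visits them one by one.
        for i in range(n):
            for j in range(n):
                if (i + j) % 2 != color:
                    continue
                neighbors = (spins[(i - 1) % n][j] + spins[(i + 1) % n][j]
                             + spins[i][(j - 1) % n] + spins[i][(j + 1) % n])
                # Energy change if this spin is flipped, with coupling J = 1.
                delta_e = 2 * spins[i][j] * neighbors
                if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                    spins[i][j] = -spins[i][j]
    return spins
```

A call such as `checkerboard_sweep(lattice, 0.4, random.Random(0))` performs one sweep in place. Passing a seeded `random.Random` keeps runs reproducible.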
This SIGGRAPH 2004 paper by Buck et al. presents Brook for GPUs, a system for general-purpose computation on programmable graphics hardware. Brook extends C to include simple data-parallel constructs, enabling the use of the GPU as a streaming coprocessor. The paper presents a compiler and runtime system that abstracts and virtualizes many aspects of graphics hardware. In addition, the paper provides analysis of the effectiveness of the GPU as a compute engine compared to the CPU, to determine when the GPU can outperform the CPU for a particular algorithm. The paper evaluates the system with five applications, the SAXPY and SGEMV BLAS operators, image segmentation, FFT, and ray tracing. For these applications, the Brook implementations perform comparably to hand-written GPU code and up to seven times faster than their CPU counterparts. (Brook for GPUs: Stream Computing on Graphics Hardware. Ian Buck, Tim Foley, Daniel Horn, Jeremy Sugerman, Kayvon Fatahalian, Mike Houston, and Pat Hanrahan. To appear at SIGGRAPH 2004.)
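The streaming model can be pictured as a small kernel applied independently to every element of its input streams. The Python sketch below only mimics that idea for SAXPY (a * x + y). It is not Brook syntax, and the names `saxpy_kernel` and `map_kernel` are hypothetical.

```python
def saxpy_kernel(a, x, y):
    """Per-element work: one multiply and one add."""
    return a * x + y


def map_kernel(kernel, a, xs, ys):
    """Apply the kernel to each pair of stream elements.

    No element depends on any other, so the calls could run in parallel.
    """
    if len(xs) != len(ys):
        raise ValueError("input streams must have the same length")
    return [kernel(a, x, y) for x, y in zip(xs, ys)]


result = map_kernel(saxpy_kernel, 2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
```

Here `result` holds one output element per input pair. The kernel body never refers to neighboring elements, which is what lets a runtime schedule the calls in any order.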
The OpenVIDIA project is a GPL’d Free/Open Source project that implements computer vision algorithms on the GPU using OpenGL and Cg. These papers describe the GPU implementation of a projective image registration algorithm in OpenVIDIA. The current release includes a hand-tracking program used for a gesture recognition interface, some simple programming examples, and FireWire video input support. OpenVIDIA explores the use of GPU hardware-accelerated computer vision in the context of creating Computer Mediated Reality. (James Fung, Felix Tang, Steve Mann, “Mediated Reality Using Computer Graphics Hardware for Computer Vision”, Proceedings of the International Symposium on Wearable Computing 2002 (ISWC2002), Seattle, Washington, USA, Oct 7-10, 2002, pp. 83-89.
James Fung, Steve Mann, “Computer Vision Signal Processing on Graphics Processing Units”, Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2004), Montreal, Quebec, Canada, May 17-21, 2004.)
Given the increasing power and usage of commodity GPUs, many researchers are using them for general-purpose computation. The ACM Workshop on General-Purpose Computation on Graphics Processors (GP2), to be held the Saturday and Sunday before SIGGRAPH 2004 at one of the SIGGRAPH hotels, will explore current issues in general-purpose computing using graphics hardware. These issues include:
- Do GPUs have the potential of being a useful co-processor for a wide variety of applications? What are their algorithmic and architectural niches and can these be broadened?
- What are the major issues in terms of programmability, language and compiler support and software environments for GPUs?
- What are some of the future technology trends that can lead to more widespread use of GPUs?
This workshop will bring together leading researchers and practitioners from academia, research labs and industry working in computer graphics, scientific computation, high performance computing, computer architecture and related areas. The program will consist of invited talks, panels and poster presentations. (ACM GP2 Workshop. Call for Posters.)
Isosurface Computation Made Simple: Hardware Acceleration, Adaptive Refinement and Tetrahedral Stripping
May 4th, 2004
This paper by Valerio Pascucci describes a simple technique to compute isosurfaces on programmable GPUs. Given the vertices of a tetrahedron, a simple vertex program computes the vertex positions, normals and connectivity of the portion of an isosurface that may be contained in the tetrahedron (a marching tet approach). One main advantage of this technique is that it relieves the CPU of the task of computing the isosurface and, more importantly, avoids storing the surface in main memory. Interestingly, one could compile a display list for a tetrahedral mesh and display different isosurfaces by changing an OpenGL parameter while always rendering the same list. The paper presents all the source code of the vertex program and comments on it in detail. (Isosurface Computation Made Simple: Hardware Acceleration, Adaptive Refinement and Tetrahedral Stripping. V. Pascucci, Proceedings of VisSym 2004.)
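The per-tetrahedron step of a marching tet approach can be sketched on the CPU as follows. This is not the paper's vertex program: the function name `tet_isosurface_points` and the use of linear interpolation along edges are assumptions of this illustration, and normals and connectivity are omitted. For one tetrahedron, it returns the points where the isosurface crosses the edges.

```python
from itertools import combinations


def tet_isosurface_points(vertices, values, isovalue):
    """Return the points where the isosurface crosses the edges of one tetrahedron.

    vertices: four (x, y, z) tuples; values: the scalar field at each vertex.
    """
    points = []
    for a, b in combinations(range(4), 2):  # the six edges of the tetrahedron
        va, vb = values[a], values[b]
        if (va < isovalue) == (vb < isovalue):
            continue  # both endpoints on the same side: no crossing on this edge
        t = (isovalue - va) / (vb - va)  # linear interpolation parameter in [0, 1]
        points.append(tuple(pa + t * (pb - pa)
                            for pa, pb in zip(vertices[a], vertices[b])))
    return points


# Hypothetical unit tetrahedron, for illustration only.
tet = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
triangle = tet_isosurface_points(tet, [0.0, 1.0, 1.0, 1.0], 0.5)
```

The function yields no points when the surface misses the tetrahedron, three points when one vertex is separated from the other three, and four points when the vertices split two and two. The geometry stays fixed and only `isovalue` changes between calls, which mirrors the idea of redrawing the same display list with a different parameter.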