Real-time Deblocked GPU rendering of Compressed Volume Data

December 2nd, 2014


The vast majority of current state-of-the-art compressed GPU volume renderers are based on block-transform coding, which is susceptible to blocking artifacts, particularly at low bit-rates. In this paper the authors address the problem for the first time by introducing a specialized deferred filtering architecture that works on block-compressed data and includes a novel deblocking algorithm. The architecture efficiently performs high-quality shading of massive datasets by closely coordinating visibility- and resolution-aware adaptive data loading with GPU-accelerated per-frame data decompression, deblocking, and rendering. A thorough evaluation including quantitative and qualitative measures demonstrates the performance of the approach on large static and dynamic datasets, including a massive 512^4 turbulence simulation (256 GB), which is aggressively compressed to less than 2 GB so that it can be fully uploaded to the graphics board and explored in real time during animation.

(Fabio Marton, José Antonio Iglesias Guitián, Jose Díaz and Enrico Gobbetti: “Real-time deblocked GPU rendering of compressed volumes”. Proc. 19th International Workshop on Vision, Modeling and Visualization (VMV), pp. 167-174, Oct. 2014. [WWW])

Improving Cache Locality for GPU-based Volume Rendering

June 8th, 2014


We present a cache-aware method for accelerating texture-based volume rendering on a graphics processing unit (GPU). Because a GPU has a hierarchical architecture in terms of processing and memory units, cache optimization is important to maximize performance for memory-intensive applications. Our method localizes texture memory references according to the location of the viewpoint and dynamically selects the width and height of thread blocks (TBs) so that each warp, a group of 32 threads processed simultaneously, can minimize memory access strides. We also incorporate transposed indexing of threads to perform TB-level cache optimization for specific viewpoints. Furthermore, we maximize TB size to exploit spatial locality with fewer resident TBs. For viewpoints with relatively large strides, we synchronize threads of the same TB at regular intervals to realize synchronous ray propagation. Experimental results indicate that our cache-aware method doubles the worst-case rendering performance compared with the renderers provided by the CUDA and OpenCL software development kits.

(Yuki Sugimoto, Fumihiko Ino, and Kenichi Hagihara: “Improving Cache Locality for GPU-based Volume Rendering”. Parallel Computing 40(5/6): 59-69, May 2014. [DOI])

Physically based lighting for volumetric data with Exposure Render

October 27th, 2011

Exposure Render is a Direct Volume Rendering Application that applies progressive Monte Carlo raytracing, coupled with physically based light transport to heterogeneous volumetric data. Exposure Render enables the configuration of any number of arbitrarily shaped area lights, models a real-world camera, including its lens and aperture, and incorporates complex materials, whilst still maintaining interactive display updates. It features both surface and volumetric scattering, and applies noise reduction to remove the unwanted startup noise associated with progressive Monte Carlo rendering. The complete implementation is available in source and binary forms under a permissive free software license.

Efficient High-Quality Volume Rendering of SPH Data

September 27th, 2010

High-quality volume rendering of SPH data requires a complex order-dependent resampling of particle quantities along the view rays. In this paper we present an efficient approach to perform this task using a novel view-space discretization of the simulation domain. Our method draws upon recent work on GPU-based particle voxelization for the efficient resampling of particles into uniform grids. We propose a new technique that leverages a perspective grid to adaptively discretize the view volume, giving rise to a continuous level-of-detail sampling structure and reducing memory requirements compared to a uniform grid. In combination with a level-of-detail representation of the particle set, the perspective grid effectively reduces the number of primitives that must be processed at run time. We demonstrate the quality and performance of our method for the rendering of fluid and gas dynamics SPH simulations consisting of many millions of particles.

(Roland Fraedrich, Stefan Auer, and Rüdiger Westermann: “Efficient High-Quality Volume Rendering of SPH Data”, IEEE Transactions on Visualization and Computer Graphics (Proceedings of IEEE Visualization 2010), vol. 16, no. 6, Nov.-Dec. 2010, Link to project webpage including paper, pictures and video)

HPMC open-source GPU volumetric iso-surface extraction library

November 30th, 2009

HPMC is a small OpenGL/C/C++ library that extracts iso-surfaces of volumetric data directly on the GPU.

The library analyzes a lattice of scalar values describing a scalar field that is either stored in a Texture3D or can be accessed through an application-provided snippet of shader code. The output is a sequence of vertex positions and normals that form a triangulation of the iso-surface. HPMC provides traversal code to be included in an application vertex shader, which allows direct extraction in the vertex shader. Using the OpenGL transform feedback mechanism, the triangulation can be stored directly into a buffer object.

(C. Dyken, G. Ziegler, C. Theobalt, H.-P. Seidel, High-speed Marching Cubes using Histogram Pyramids, Computer Graphics Forum 27 (8), 2008.)
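
The histogram pyramid named in the reference above is, at its core, a hierarchy of partial sums used for stream compaction. The following is a simplified 1D sketch of that idea in Python, not HPMC's code or API: the function names `build_pyramid` and `locate` are hypothetical, the per-cell counts stand in for the number of vertices each lattice cell would emit, and the cell count is assumed to be a power of two.

```python
def build_pyramid(counts):
    """Build a 1D histogram pyramid by repeatedly summing neighbouring pairs.

    counts[i] is the number of output elements produced by cell i
    (assumption: len(counts) is a power of two).
    levels[0] is the base, levels[-1][0] is the total output size.
    """
    levels = [list(counts)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[i] + prev[i + 1] for i in range(0, len(prev), 2)])
    return levels


def locate(levels, k):
    """Map output index k to (cell index, local index within that cell).

    Traverses the pyramid top-down: at each level, either stay in the left
    child or subtract its count and move to the right child.
    """
    if not 0 <= k < levels[-1][0]:
        raise IndexError("output index out of range")
    idx = 0
    for level in reversed(levels[:-1]):
        idx *= 2
        left = level[idx]
        if k >= left:
            k -= left
            idx += 1
    return idx, k


counts = [0, 3, 0, 2, 1, 0, 0, 4]   # hypothetical per-cell vertex counts
levels = build_pyramid(counts)
total = levels[-1][0]
```

Because every output index can be resolved independently with `locate`, each output vertex could in principle be handled by its own shader invocation, which is what makes this kind of structure attractive for extraction in a vertex shader.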

Fourier Volume Rendering on the GPU Using a Split-Stream FFT

March 1st, 2005

This paper by Jansen et al. describes how to utilize current commodity graphics hardware to perform Fourier volume rendering directly on the GPU. The paper presents a novel implementation of the Fast Fourier Transform: the Split-Stream-FFT maps the recursive structure of the FFT to the GPU in an efficient way. Additionally, high-quality resampling within the frequency domain is discussed. The implementation enables visualization of large volumetric data sets at interactive frame rates on a mid-range computer system. (Fourier Volume Rendering on the GPU Using a Split-Stream FFT)
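
Fourier volume rendering rests on the projection-slice theorem: the Fourier transform of a projection equals a slice through the origin of the higher-dimensional spectrum. The sketch below checks this numerically one dimension down (a 2D image projected onto a line) with a naive DFT in plain Python. It is only an illustration of the theorem, not the Split-Stream-FFT from the paper, and the helper names and test image are made up for the example.

```python
import cmath


def dft_1d(signal):
    """Naive O(n^2) discrete Fourier transform of a 1D sequence."""
    n = len(signal)
    return [
        sum(signal[x] * cmath.exp(-2j * cmath.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]


def dft_2d(image):
    """Naive 2D DFT of image[y][x]; returns spectrum[ky][kx]."""
    h, w = len(image), len(image[0])
    return [
        [
            sum(
                image[y][x] * cmath.exp(-2j * cmath.pi * (kx * x / w + ky * y / h))
                for y in range(h)
                for x in range(w)
            )
            for kx in range(w)
        ]
        for ky in range(h)
    ]


def project_along_y(image):
    """Sum the image along y: an X-ray-like projection onto the x axis."""
    return [sum(row[x] for row in image) for x in range(len(image[0]))]


image = [
    [0.0, 1.0, 2.0, 0.5],
    [1.5, 0.0, 0.0, 3.0],
    [0.2, 0.7, 1.1, 0.0],
]
central_slice = dft_2d(image)[0]                      # spectrum samples with ky = 0
projection_spectrum = dft_1d(project_along_y(image))  # spectrum of the projection
max_error = max(abs(a - b) for a, b in zip(central_slice, projection_spectrum))
```

In the 3D case the same relation lets a renderer transform the volume once and then, for each view, resample a 2D slice of the spectrum and apply an inverse 2D transform. For views that are not axis-aligned the slice falls between spectrum samples, which is why resampling quality in the frequency domain matters.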

Accelerating 3D Convolution using Graphics Hardware

February 19th, 2004

This paper from the VIS Group Stuttgart shows the first volume filtering algorithm that uses OpenGL for the convolution process. Filtering volume data is useful for noise reduction, feature detection, and segmentation. The process is significantly accelerated on SGI graphics workstations with hardware support for two-dimensional image convolution in the frame buffer. Generic 3D convolution can be added as a powerful tool in interactive volume visualization toolkits. See also the project page for more about hardware-based filtering. (Accelerating 3D Convolution using Graphics Hardware. Matthias Hopf and Thomas Ertl. Proc. Visualization 1999, pp 471–474.)

A Streaming Narrow Band Algorithm: Interactive Computation and Visualization of Level Sets

December 15th, 2003

This TVCG paper is an extended version of a Vis 2003 paper, with significantly more detail about the time-dependent, sparse-grid GPU computation strategy used in the level-set solver. The paper describes a 3D-to-2D virtual memory address scheme for packing the narrow-band data into GPU memory. It also adds detail about the GPU-based distance transform computation and the GPU-to-CPU message passing approach. Lastly, the paper describes a volume rendering algorithm for rendering compressed data that provides on-the-fly reconstruction, full trilinear interpolation, and the ability to render from any viewpoint without data duplication. (A Streaming Narrow Band Algorithm: Interactive Computation and Visualization of Level Sets. A. Lefohn, J. Kniss, C. Hansen, and R. Whitaker. Transactions on Visualization and Computer Graphics.)
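
A 3D-to-2D virtual address scheme of this general kind can be pictured as a page table that maps small tiles of each volume slice to slots in a 2D physical texture. The Python sketch below is an assumption-laden illustration rather than the paper's actual layout: the class `PackedVolume`, the 4x4 page size, and the allocation order are all hypothetical.

```python
PAGE = 4  # hypothetical page edge length in texels


class PackedVolume:
    """Toy page table that packs active 2D pages of a 3D volume into a 2D atlas."""

    def __init__(self, phys_width_pages):
        self.phys_width_pages = phys_width_pages  # atlas width, in pages
        self.page_table = {}                      # (page_x, page_y, z) -> physical page index
        self.next_free = 0

    def activate(self, vx, vy, vz):
        """Allocate a physical page for the virtual page containing (vx, vy, vz)."""
        key = (vx // PAGE, vy // PAGE, vz)
        if key not in self.page_table:
            self.page_table[key] = self.next_free
            self.next_free += 1
        return key

    def translate(self, vx, vy, vz):
        """Translate a virtual 3D address to a physical 2D address, or None if inactive."""
        index = self.page_table.get((vx // PAGE, vy // PAGE, vz))
        if index is None:
            return None  # outside the narrow band: no storage allocated
        px = (index % self.phys_width_pages) * PAGE + vx % PAGE
        py = (index // self.phys_width_pages) * PAGE + vy % PAGE
        return px, py


volume = PackedVolume(phys_width_pages=2)
volume.activate(5, 9, 3)   # first active page  -> physical slot 0
volume.activate(0, 0, 0)   # second active page -> physical slot 1
volume.activate(8, 8, 1)   # third active page  -> physical slot 2
```

Only pages that intersect the narrow band consume physical memory, so the atlas size tracks the size of the active region rather than the full volume.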

Dynamic Volume Computation and Visualization on the GPU

December 15th, 2003

This IEEE Visualization 2003 tutorial presentation by Aaron Lefohn gives a high-level overview of dynamic volume computation and visualization on GPUs. The talk is part of the tutorial Interactive Visualization of Volumetric Data on Consumer PC Hardware. The first half of the presentation discusses various memory layout options for dynamic volume computation, and the implications of each option on computation and rendering. The second half discusses optimizations and load balancing between the various computational resources: CPU, vertex processor, rasterizer, and fragment processor. (Dynamic Volume Computation and Visualization on the GPU, by Aaron Lefohn)

Acceleration Techniques for GPU-based Volume Rendering

September 3rd, 2003

This Vis03 paper by Krüger and Westermann addresses the integration of early ray termination and empty-space skipping into texture-based volume rendering on graphics processing units (GPUs). To this end, volume ray-casting on programmable graphics hardware is described as an alternative to object-order approaches. The early z-test is exploited to terminate fragment processing once sufficient opacity has been accumulated, and to skip empty space along the rays of sight. Performance gains of up to a factor of 3 are demonstrated for typical renditions of volumetric data sets on the ATI 9700 graphics card. (Acceleration Techniques for GPU-based Volume Rendering. To appear in IEEE Visualization 2003.)
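
The idea behind early ray termination can be shown with a small CPU-side sketch of front-to-back compositing along a single ray. This is a plain Python illustration under assumed inputs (scalar colors, a hypothetical `opacity_threshold` parameter), not the paper's GPU implementation, which relies on the early z-test rather than a loop break.

```python
def composite_ray(samples, opacity_threshold=0.99):
    """Front-to-back compositing of one ray with early ray termination.

    samples: iterable of (color, alpha) pairs ordered front to back.
    Returns (accumulated_color, accumulated_alpha, samples_processed).
    """
    color_acc = 0.0
    alpha_acc = 0.0
    processed = 0
    for color, alpha in samples:
        processed += 1
        # Each sample contributes only through the transparency left in front of it.
        weight = (1.0 - alpha_acc) * alpha
        color_acc += weight * color
        alpha_acc += weight
        if alpha_acc >= opacity_threshold:
            break  # remaining samples are (almost) invisible, so stop marching
    return color_acc, alpha_acc, processed


# An opaque sample in second position hides everything behind it.
opaque_ray = [(0.2, 0.5), (0.8, 1.0), (1.0, 0.9), (1.0, 0.9)]
color, alpha, processed = composite_ray(opaque_ray)

# A mostly transparent ray never reaches the threshold and is marched fully.
thin_ray = [(1.0, 0.1)] * 3
_, thin_alpha, thin_processed = composite_ray(thin_ray)
```

The saving depends entirely on the data and transfer function: rays that hit opaque material early stop after a few samples, while rays through thin media gain nothing from the test.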