Fast JPEG codec from Fastvideo

September 22nd, 2013

Fastvideo has released its JPEG codec for NVIDIA GPUs. Peak performance of the codec reaches 6 GB/s and higher for images loaded from host RAM. For instance, a full-color 4K image with a resolution of 3840 x 2160 can be compressed at a 10:1 ratio in merely 6 milliseconds on an NVIDIA GeForce GTX Titan. More information:

Benchmarking Analytical Queries on a GPU

May 20th, 2012

This report describes the advantages of using GPUs for analytical queries. It compares the performance of the Alenka database engine running on a GPU with that of Oracle on a SPARC server. More information on Alenka, including source code:

Alenka – A GPU database engine including compression

November 28th, 2011

Support for several types of compression has been added to the GPU-based database engine Alenka. Supported algorithms include FOR (frame of reference), FOR-DELTA, and dictionary compression. All compression algorithms run on the GPU, achieving compression and decompression speeds of gigabytes per second. The use of compression significantly reduces or eliminates I/O bottlenecks in analytical queries, as shown by Alenka's results on the Star Schema and TPC-H benchmarks.
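The two delta-family schemes named above are simple enough to sketch on the CPU. A minimal Python illustration (not Alenka's actual implementation, which runs these kernels on the GPU): FOR stores each value as a small offset from a per-block minimum, and FOR-DELTA applies FOR to successive differences, which helps for sorted or slowly growing columns such as keys and timestamps.

```python
def for_encode(values):
    """Frame of reference: store the block minimum plus small offsets."""
    base = min(values)
    return base, [v - base for v in values]

def for_decode(base, offsets):
    return [base + o for o in offsets]

def for_delta_encode(values):
    """Delta first, then frame of reference on the deltas."""
    deltas = [values[0]] + [b - a for a, b in zip(values, values[1:])]
    return for_encode(deltas)

def for_delta_decode(base, offsets):
    out, acc = [], 0
    for d in for_decode(base, offsets):
        acc += d
        out.append(acc)
    return out

keys = [1000, 1003, 1004, 1010, 1011]
base, offsets = for_delta_encode(keys)   # offsets fit in a few bits each
assert for_delta_decode(base, offsets) == keys
```

Dictionary compression, the third scheme listed, works in the same spirit: repeated values are replaced by small integer codes into a sorted symbol table.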

Database Compression on Graphics Processors

September 11th, 2010

Query co-processing on graphics processors (GPUs) has become an effective means to improve the performance of main-memory databases. However, this co-processing requires data transfers between main memory and GPU memory over a low-bandwidth PCI-E bus. The overhead of such transfers becomes an important factor, even a bottleneck, for query co-processing performance on the GPU. In this paper, we propose to use compression to alleviate this performance problem. Specifically, we implement nine lightweight compression schemes on the GPU and further study combinations of these schemes for a better compression ratio. We design a compression planner to find the optimal combination. Our experiments demonstrate that GPU-based compression and decompression achieve processing speeds of up to 45 GB/s and 56 GB/s, respectively. Using partial decompression, we were able to significantly improve GPU-based query co-processing performance. As a by-product, we have integrated our GPU-based compression into MonetDB, an open-source column-oriented DBMS, and demonstrated the feasibility of offloading compression and decompression to the GPU.

(Wenbin Fang, Bingsheng He, Qiong Luo: “Database Compression on Graphics Processors”, PVLDB/VLDB 2010. Link to PDF.)
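The planner idea can be illustrated on the CPU in a few lines. The candidate set and the bit-counting cost model below are invented for this sketch (the paper's planner searches cascades of its nine GPU schemes): each candidate pipeline is tried on the column and the cheapest output wins.

```python
def dict_encode(col):
    """Replace repeated values by small integer codes."""
    symbols = sorted(set(col))
    index = {s: i for i, s in enumerate(symbols)}
    return symbols, [index[s] for s in col]

def rle_encode(ints):
    """Collapse repeats into (value, run length) pairs."""
    runs = []
    for v in ints:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

def bits_needed(ints):
    # toy cost model: the widest value decides the packed width
    return len(ints) * max(max(abs(v) for v in ints).bit_length(), 1)

def plan(col):
    """Try each candidate pipeline and keep the cheapest."""
    _, codes = dict_encode(col)
    runs = rle_encode(codes)
    candidates = {
        "dict": bits_needed(codes),
        "dict+rle": bits_needed([v for v, _ in runs])
                    + bits_needed([n for _, n in runs]),
    }
    return min(candidates, key=candidates.get), candidates

# a run-heavy column: cascading RLE after the dictionary pays off
best, costs = plan(["US"] * 20 + ["DE"] * 4)
```

The point of the planner is exactly this data dependence: for a run-heavy column the cascade wins, while for a high-churn column the plain dictionary encoding would be chosen.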

Toward Real-Time Fractal Image Compression Using Graphics Hardware

October 17th, 2005

This ISVC 2005 paper by Ugo Erra presents parallel fractal image compression using programmable graphics hardware. The main problem of fractal compression is the very high computing time needed to encode images. The implementation in this paper exploits the SIMD architecture and inherent parallelism of recent GPUs to speed up the baseline approach of fractal encoding. The results presented were achieved on inexpensive and widely available graphics boards. (Toward Real-Time Fractal Image Compression Using Graphics Hardware. Ugo Erra. In Proceedings of the International Symposium on Visual Computing 2005.)
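The encoding cost the paragraph refers to comes from an exhaustive search: for every range block, the encoder scans all (contracted) domain blocks for the best affine fit s*D + o. A tiny 1-D brute-force version below shows where the O(ranges x domains) cost, and hence the appeal of a SIMD search, comes from; it is a toy of my own construction (spatial flips/rotations and quantization are omitted), not the paper's implementation.

```python
def downsample(block):
    """Contract a domain block to range size by pairwise averaging."""
    return [(a + b) / 2 for a, b in zip(block[::2], block[1::2])]

def best_match(rng, signal, rsize):
    """Scan every domain block for the best affine fit s*D + o to rng."""
    best = None
    for start in range(len(signal) - 2 * rsize + 1):
        dom = downsample(signal[start:start + 2 * rsize])
        # closed-form least-squares scale s and offset o for rng ~ s*dom + o
        n = rsize
        sd, sr = sum(dom), sum(rng)
        sdd = sum(d * d for d in dom)
        sdr = sum(d * r for d, r in zip(dom, rng))
        denom = n * sdd - sd * sd
        s = (n * sdr - sd * sr) / denom if denom else 0.0
        o = (sr - s * sd) / n
        err = sum((s * d + o - r) ** 2 for d, r in zip(dom, rng))
        if best is None or err < best[0]:
            best = (err, start, s, o)
    return best

signal = [0.1, 0.3, 0.2, 0.4, 0.6, 0.5, 0.7, 0.9,
          0.8, 0.6, 0.4, 0.5, 0.3, 0.2, 0.1, 0.0]
err, start, s, o = best_match(signal[:4], signal, 4)
```

Every range block repeats this scan independently, which is what makes the search embarrassingly parallel and a good fit for the GPU.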

DuoDecim – A Structure for Point Scan Compression and Rendering

May 26th, 2005

This paper presents a compression scheme for large point scans including per-point normals. For the encoding of such scans, the paper introduces a particular type of closest-sphere-packing grid, the hexagonal close packing (HCP). To compress the data, linear sequences of filled cells in HCP grids are extracted. Point positions and normals in these runs are incrementally encoded. At a grid spacing close to the point sampling distance, the compression scheme requires only slightly more than 3 bits per point position. Incrementally encoded per-point normals are quantized at high fidelity using only 5 bits per normal. The compressed data stream is decoded on the graphics processing unit (GPU). Decoded point positions are stored in graphics memory and then used on the GPU again to render point primitives. In this way, gigantic point scans can be rendered from their compressed representation in local GPU memory at interactive frame rates.
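The two ideas behind the tiny bit budgets above can be sketched in a few lines: positions inside a run are stored as small steps from the previous cell, and normals are snapped to a small codebook so that an index of a few bits suffices. The codebook and step alphabet below are invented for illustration; the paper's coder works on HCP grid neighbours and uses a finer incremental normal quantization.

```python
# toy codebook: 6 axis-aligned directions, so an index needs 3 bits
CODEBOOK = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
            (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def quantize_normal(n):
    """Index of the codebook direction with the largest dot product."""
    return max(range(len(CODEBOOK)),
               key=lambda i: sum(a * b for a, b in zip(CODEBOOK[i], n)))

def delta_encode_run(points):
    """Store the first point, then per-axis steps to each successor."""
    steps = [tuple(b[k] - a[k] for k in range(3))
             for a, b in zip(points, points[1:])]
    return points[0], steps

def delta_decode_run(start, steps):
    pts = [start]
    for s in steps:
        pts.append(tuple(pts[-1][k] + s[k] for k in range(3)))
    return pts

run = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0)]
start, steps = delta_encode_run(run)
assert delta_decode_run(start, steps) == run
```

Because each step stays within a small neighbourhood of the previous cell, the per-point cost is the cost of one step symbol rather than three full coordinates, which is how the scheme reaches its few-bits-per-point figures.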

Accelerating Wavelet Transformations with Graphics Hardware

February 19th, 2004

Two papers from the VIS Group Stuttgart describe implementations of wavelet-based multi-resolution analysis using OpenGL. Wavelets are commonly used for signal processing and image compression (e.g. for JPEG 2000). The papers focus on details of implementing wavelet decomposition and reconstruction using graphics hardware, and develop a scaled version of wavelet analysis that constrains data to the [0,1] range of fixed-point frame buffers. See also the project page for more about hardware-based filtering. (Hardware-Based Wavelet Transformations. Matthias Hopf and Thomas Ertl. Workshop on Vision, Modeling, and Visualization 1999, pp 317-328. Hardware-Accelerated Wavelet Transformations. Matthias Hopf and Thomas Ertl. Proc. EG/IEEE TCVG Symposium on Visualization VisSym 2000, pp 93-103.)
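A one-level 1-D Haar analysis/synthesis pass makes the [0,1] constraint concrete: the pairwise averages of a signal in [0,1] already stay in range, but the raw details lie in [-0.5, 0.5] and must be shifted and scaled. The mapping below is illustrative; the papers derive their own scaling for fixed-point frame buffers.

```python
def haar_analyze(signal):
    """One level: pairwise averages plus details rescaled into [0, 1]."""
    avgs = [(a + b) / 2 for a, b in zip(signal[::2], signal[1::2])]
    # raw details lie in [-0.5, 0.5] for inputs in [0, 1]; shift to [0, 1]
    dets = [(a - b) / 2 + 0.5 for a, b in zip(signal[::2], signal[1::2])]
    return avgs, dets

def haar_synthesize(avgs, dets):
    """Invert the analysis pass, undoing the detail rescaling."""
    out = []
    for m, d in zip(avgs, dets):
        diff = d - 0.5
        out.extend([m + diff, m - diff])
    return out

sig = [0.2, 0.4, 0.9, 0.1]
avgs, dets = haar_analyze(sig)
rec = haar_synthesize(avgs, dets)
```

On graphics hardware of that era the same averages and shifted details would be produced by texture filtering and blending passes, with the [0,1] clamp of the frame buffer making the rescaling mandatory rather than cosmetic.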