CUDPP: CUDA Data Parallel Primitives Library

CUDPP, the CUDA Data Parallel Primitives Library, is a library of data-parallel algorithm primitives such as parallel prefix-sum (“scan”), parallel sort, and parallel reduction. These primitives are building blocks for a wide variety of data-parallel algorithms, including sorting, stream compaction, and the construction of data structures such as trees and summed-area tables. CUDPP runs on processors that support CUDA.
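To make the terms concrete, here is a minimal sequential sketch in C++ of what an exclusive scan computes and how stream compaction can be built on top of it. This is a CPU reference for illustration only, not CUDPP code: the function names exclusive_scan_ref and compact_ref are hypothetical, and the 0/1 flag array is an assumed input format.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference: exclusive prefix sum.
// out[i] = in[0] + ... + in[i-1], and out[0] = 0.
std::vector<int> exclusive_scan_ref(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    int running = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = running;   // sum of everything before position i
        running += in[i];
    }
    return out;
}

// Hypothetical reference: stream compaction.
// Keeps in[i] wherever flags[i] == 1 (assumed 0/1 flags), preserving order.
// The exclusive scan of the flags gives each kept element its output index.
std::vector<int> compact_ref(const std::vector<int>& in,
                             const std::vector<int>& flags) {
    std::vector<int> index = exclusive_scan_ref(flags);
    std::size_t kept = in.empty()
        ? 0
        : static_cast<std::size_t>(index.back() + flags.back());
    std::vector<int> out(kept);
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (flags[i]) {
            out[index[i]] = in[i];   // scatter to the scanned position
        }
    }
    return out;
}
```

On a GPU the scatter loop is independent per element, which is why scan is the useful building block here: once the output indices are known, every kept element can be written in parallel.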

CUDPP was initially developed to test the CUDA algorithms written for the articles “Parallel Prefix Sum (Scan) in CUDA”, by Mark Harris, Shubho Sengupta, and John Owens (published in GPU Gems 3), and “Scan Primitives for GPU Computing”, by Shubho Sengupta, Mark Harris, Yao Zhang, and John Owens (published in the proceedings of Graphics Hardware 2007).

CUDPP 1.0 alpha

The CUDPP 1.0 alpha release is here! We’ve revamped the CUDPP public interface to improve usability and added a number of new features, with more to come. Because this interface is quite new, we want to give people a chance to try it out and send us feedback on any issues they discover. For this reason we’ve called this an alpha release; we may yet tweak the interface. In addition, a couple of features that didn’t make this release should be added before 1.0 final. If you have feedback on the new interface, please tell us on the CUDPP Google Group.

CUDPP 1.0 New Features and Improvements

  • New Plan Interface for configuring CUDPP algorithms. This is modeled after the FFTW and CUFFT libraries.
  • Segmented Scan: an algorithm for performing multiple variable-length scans in parallel. Useful for algorithms such as parallel quicksort, parallel sparse matrix-vector multiplication, and more.
  • Sparse Matrix-Vector Multiplication (based on segmented scan)
  • An improved scan algorithm, called “warp scan”, for higher performance and simpler code.
  • Scans and segmented scans now support add, multiply, maximum, and minimum operators.
  • Inclusive scans and segmented scans are now supported.
  • Improved, more useful, cudppCompact() interface.
  • Backward compact (reverse-and-compact) is now supported.
  • CUDA 2.0 support
  • Support for Mac OS X and Windows Vista

Downloading CUDPP

CUDPP 1.0 alpha Release

20 April 2008

Download CUDPP 1.0 alpha Source: cudpp_1.0a.tar.gz (2.7 MB tar.gz File) cudpp_1.0a.zip (2.8 MB ZIP File)
Download HTML CUDPP documentation: cudpp_doc-rel_gems3-2.tar.gz (143 KB tar.gz File)

Older Releases

Archive of Older CUDPP Releases

CUDPP Documentation

For installation and usage instructions, please refer to the Documentation. The documentation is comprehensive and searchable. For an example of using CUDPP, see the simpleCUDPP example.

CUDPP Google Group

We will be using the CUDPP Google Group for CUDPP announcements and discussion. Sign up if you’d like to participate.

CUDPP References

CUDPP Developers

Other Contributors

Thanks To

  • Jim Ahrens
  • Ian Buck
  • Guy Blelloch
  • Jeff Bolz
  • Jeff Inman
  • Eric Lengyel
  • David Luebke
  • Pat McCormick
  • Richard Vuduc

This work was supported by the Department of Energy (Early Career Principal Investigator Award DE-FG02-04ER25609, the SciDAC Institute for Ultrascale Visualization, and Los Alamos National Laboratory), by the National Science Foundation (grant 0541448), and by generous hardware donations from NVIDIA.