CUDPP is the CUDA Data Parallel Primitives Library. CUDPP is a library of data-parallel algorithm primitives such as parallel prefix-sum (“scan”), parallel sort and parallel reduction. Primitives such as these are important building blocks for a wide variety of data-parallel algorithms, including sorting, stream compaction, and building data structures such as trees and summed-area tables. CUDPP runs on processors that support CUDA.
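To make the terms above concrete, here is a minimal sequential sketch of what an exclusive prefix sum computes and how stream compaction can be built on top of it. This is a CPU reference for illustration only, not CUDPP code; the function names `exclusive_scan_ref` and `compact_ref` are hypothetical and are not part of the CUDPP API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not the CUDPP API).
// Exclusive prefix sum: out[i] = in[0] + ... + in[i-1], and out[0] = 0.
std::vector<unsigned> exclusive_scan_ref(const std::vector<unsigned>& in) {
    std::vector<unsigned> out(in.size());
    unsigned running = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = running;   // sum of everything strictly before i
        running += in[i];
    }
    return out;
}

// Stream compaction built on scan: keep in[i] wherever flags[i] == 1.
// Assumes every flag is 0 or 1, so the scan of the flags gives each kept
// element its position in the output.
std::vector<int> compact_ref(const std::vector<int>& in,
                             const std::vector<unsigned>& flags) {
    assert(in.size() == flags.size());
    const std::vector<unsigned> offsets = exclusive_scan_ref(flags);
    const std::size_t kept =
        in.empty() ? 0 : offsets.back() + flags.back();
    std::vector<int> out(kept);
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (flags[i]) {
            out[offsets[i]] = in[i];  // scatter to the scanned offset
        }
    }
    return out;
}
```

On a GPU the scan and the scatter each run in parallel; the loop here only shows the result a parallel implementation is expected to match.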
CUDPP was initially developed to test the algorithms written in C for CUDA for the articles “Parallel Prefix Sum (Scan) with CUDA”, by Mark Harris, Shubho Sengupta, and John Owens (published in GPU Gems 3), and “Scan Primitives for GPU Computing”, by Shubho Sengupta, Mark Harris, Yao Zhang, and John Owens (published in the proceedings of Graphics Hardware 2007).
CUDPP 1.1
The two major new features in CUDPP 1.1 are a very fast new radix sort implementation with support for sorting key-value pairs (with float or unsigned integer keys) and a new pseudorandom number generator, cudppRand().
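As a rough illustration of what a key-value radix sort does, the following sequential sketch sorts unsigned integer keys 8 bits at a time and moves each value along with its key. It is a CPU reference under stated assumptions, not the GPU algorithm used in CUDPP: the name `radix_sort_pairs_ref` and the 8-bit digit width are choices made for this example, and float keys are not handled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference (not the CUDPP API).
// Stable least-significant-digit radix sort of (key, value) pairs,
// processing 8 bits of the key per pass.
void radix_sort_pairs_ref(std::vector<std::uint32_t>& keys,
                          std::vector<std::uint32_t>& values) {
    assert(keys.size() == values.size());
    const std::size_t n = keys.size();
    std::vector<std::uint32_t> tmp_keys(n), tmp_values(n);

    for (unsigned shift = 0; shift < 32; shift += 8) {
        // 1. Histogram of the current digit.
        std::array<std::size_t, 256> offsets{};
        for (std::size_t i = 0; i < n; ++i) {
            ++offsets[(keys[i] >> shift) & 0xFFu];
        }
        // 2. Exclusive scan turns counts into starting offsets per digit.
        std::size_t running = 0;
        for (std::size_t& slot : offsets) {
            const std::size_t count = slot;
            slot = running;
            running += count;
        }
        // 3. Stable scatter: equal digits keep their relative order.
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t dst = offsets[(keys[i] >> shift) & 0xFFu]++;
            tmp_keys[dst] = keys[i];
            tmp_values[dst] = values[i];
        }
        keys.swap(tmp_keys);
        values.swap(tmp_values);
    }
}
```

Step 2 is itself an exclusive scan, which is one reason scan appears as a building block for sorting.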
CUDPP 1.1 also has a new license. We have replaced the former CUDPP license with a pure, standard BSD license. This greatly simplifies the CUDPP license details, and it also enables us to move the CUDPP code into a public source repository such as Google Code.
CUDPP 1.1 New Features and Improvements
- New radix sort implementation under cudppSort() (based on the Satish et al. IPDPS 2009 paper). All previous sorts have been removed. The new sort is much faster; at release time, it was the fastest published GPU sorting algorithm.
- Added cudppRand() pseudorandom number generation (based on the Tzeng and Wei I3D 2008 paper).
- Added support for backward segmented scan.
- Switched from the previous license to a pure BSD license.
- Fixed satGL example to run in a native window on OS X, rather than an X11 window.
- Removed Visual Studio 7.1 (2003) project files. CUDA 2.1 and later no longer support VS7.1.
- Miscellaneous bug fixes.
- In the documentation, added a list of publications that use CUDPP, in both text and BibTeX citation formats.
- In the documentation, updated the list of publications of algorithms included in CUDPP.
- Miscellaneous documentation improvements.
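For readers unfamiliar with the backward segmented scan listed among the new features, the following sequential sketch shows one common reading of it: an inclusive sum that runs from the end of each segment toward its start and restarts at every segment boundary. It assumes segments are marked by head flags (1 on the first element of each segment); the name `backward_segmented_scan_ref` is hypothetical and the code is a CPU reference, not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not the CUDPP API).
// Backward inclusive segmented sum scan:
//   out[i] = in[i] + in[i+1] + ... + in[last element of i's segment].
// Assumption: head_flags[i] == 1 marks the first element of a segment.
std::vector<int> backward_segmented_scan_ref(
        const std::vector<int>& in,
        const std::vector<unsigned>& head_flags) {
    assert(in.size() == head_flags.size());
    std::vector<int> out(in.size());
    int running = 0;
    // Walk from the last element to the first.
    for (std::size_t i = in.size(); i-- > 0;) {
        running += in[i];
        out[i] = running;
        if (head_flags[i]) {
            running = 0;  // the element to the left ends an earlier segment
        }
    }
    return out;
}
```

With input {1, 2, 3, 4, 5, 6} and head flags {1, 0, 0, 1, 0, 0}, the two segments are scanned independently and the result is {6, 5, 3, 15, 11, 6}.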
Downloading CUDPP
CUDPP 1.1 Release
1 July 2009
- Download CUDPP 1.1 Source:
- cudpp_src_1.1.zip (2.8 MB ZIP File)
- cudpp_src_1.1.tar.gz (2.8 MB tar.gz File)
- Download HTML CUDPP documentation (also see online documentation below):
- cudpp_doc_1.1.zip (280 KB ZIP File)
- cudpp_doc_1.1.tar.gz (180 KB tar.gz File)
Data Files
Download cudppRand() regression files for cudpp_testrig:
- cudpp_rand_datafiles.zip (75MB ZIP File)
- cudpp_rand_datafiles.tar.gz (75MB tar.gz File)
Older Releases
Archive of Older CUDPP Releases
CUDPP Documentation
For installation and usage instructions, please refer to the Documentation. The documentation is comprehensive and searchable. For an example of using CUDPP, see the simpleCUDPP example.
CUDPP Google Group
We will be using the CUDPP Google Group for CUDPP announcements and discussion. Sign up if you’d like to participate.
CUDPP on Google Code
To report CUDPP bugs or request features, you can either post to the CUDPP Google Group above or file an issue directly on Google Code.
CUDPP References
The following publications describe work incorporated in CUDPP.
- Mark Harris, Shubhabrata Sengupta, and John D. Owens. “Parallel Prefix Sum (Scan) with CUDA”. In Hubert Nguyen, editor, GPU Gems 3, chapter 39, pages 851–876. Addison Wesley, August 2007. http://graphics.idav.ucdavis.edu/publications/print_pub?pub_id=916
- Shubhabrata Sengupta, Mark Harris, Yao Zhang, and John D. Owens. “Scan Primitives for GPU Computing”. In Graphics Hardware 2007, pages 97–106, August 2007. http://graphics.idav.ucdavis.edu/publications/print_pub?pub_id=915
- Shubhabrata Sengupta, Mark Harris, and Michael Garland. “Efficient parallel scan algorithms for GPUs”. NVIDIA Technical Report NVR-2008-003, December 2008. http://mgarland.org/papers.html#segscan-tr
- Nadathur Satish, Mark Harris, and Michael Garland. “Designing Efficient Sorting Algorithms for Manycore GPUs”. Proc. 23rd IEEE Int'l Parallel & Distributed Processing Symposium, May 2009. http://mgarland.org/papers.html#gpusort
- Stanley Tzeng and Li-Yi Wei. “Parallel White Noise Generation on a GPU via Cryptographic Hash”. Proc. 2008 Symposium on Interactive 3D Graphics and Games, pages 79–87. http://research.microsoft.com/apps/pubs/default.aspx?id=70502
Research that Cites CUDPP
Many researchers are using CUDPP in their work, and there are many publications that have used it (references). If your work uses CUDPP, please let us know by sending us a BibTeX reference to your work.
If you make use of CUDPP primitives in your work and want to cite CUDPP (thanks!), we would prefer that you cite the appropriate papers above, since they form the core of CUDPP. Specifically, the GPU Gems paper describes (unsegmented) scan and multi-scan for summed-area tables. The NVIDIA technical report describes the current scan and segmented scan algorithms used in the library, and the Graphics Hardware paper describes an earlier implementation of segmented scan, quicksort, and sparse matrix-vector multiply. The IPDPS paper describes the radix sort used in CUDPP, and the I3D paper describes the random number generation algorithm.
CUDPP Developers
- Mark Harris
- John Owens
- Shubho Sengupta
- Stanley Tzeng
- Yao Zhang
- Andrew Davidson
Other Contributors
Thanks To
- Jim Ahrens
- Ian Buck
- Guy Blelloch
- Jeff Bolz
- Jeff Inman
- Eric Lengyel
- David Luebke
- Pat McCormick
- Richard Vuduc
This work was supported by the Department of Energy (Early Career Principal Investigator Award DE-FG02-04ER25609, the SciDAC Institute for Ultrascale Visualization, and Los Alamos National Laboratory) and by the National Science Foundation (grant 0541448), as well as generous hardware donations from NVIDIA.
CUDPP Copyright and Software License
CUDPP is copyright The Regents of the University of California, Davis campus and NVIDIA Corporation. The library, examples, and all source code are released under the BSD license, designed to encourage reuse of this software in other projects, both commercial and non-commercial.
Note that prior to release 1.1 of CUDPP, the license used was a modified BSD license. With release 1.1, this license was replaced with the pure BSD license to facilitate the use of open source hosting of the code.