Back 40 Computing: High Performance GPU Building Blocks

August 22nd, 2010

The Back 40 Computing project aims to provide a collection of high-performance GPU computing building blocks. It is maintained by Duane Merrill of the University of Virginia. Highlights of the current release include the fastest radix sort implementation on GPUs to date, capable of sorting over 1 billion keys per second. For more details, see this (pre-Fermi) technical report (direct PDF link).

Source code and documentation are available on Google Code.