NVIDIA Announces CUDA GPU Computing Architecture

November 10th, 2006

NVIDIA Corporation today unveiled NVIDIA CUDA technology, a new architecture for computing on NVIDIA GPUs, and the industry’s first C-compiler development environment for the GPU. From the NVIDIA Press Release:

GPU computing with CUDA is a new approach to computing where hundreds of on-chip processor cores simultaneously communicate and cooperate to solve complex computing problems up to 100 times faster than traditional approaches. This breakthrough architecture is complemented by another first: the NVIDIA C-compiler for the GPU. This complete development environment gives developers the tools they need to solve new problems in computation-intensive applications such as product design, data analysis, technical computing, and game physics. CUDA-enabled GPUs offer dedicated features for computing, including the Parallel Data Cache, which allows 128, 1.35 GHz processor cores in newest generation NVIDIA GPUs to cooperate with each other while performing intricate computations. Developers access these new features through a separate computing driver that communicates with DirectX and OpenGL, and the new NVIDIA C compiler for the GPU, which obsoletes streaming languages for GPU computing.

CUDA website: http://www.nvidia.com/cuda

A New Low-Level Interface for GPGPU Applications on ATI GPUs

August 10th, 2006

At SIGGRAPH in Boston, Derek Gerstmann of ATI presented a sketch titled “A Performance-Oriented Data Parallel Virtual Machine for GPGPU Applications.” The system exposes GPU functionality at a low level (including the fragment processors’ native instruction set), giving the programmer direct control over program compilation and loading, GPU memory management, and GPU/CPU synchronization. A write-up is available at www.ati.com/developer. If you are interested in obtaining the system for evaluation, please contact researcher@ati.com.

Sh Version 0.8rc0 Released

November 10th, 2005

Sh Version 0.8.0rc0, the first release candidate for the upcoming Sh 0.8, is now available. There are plenty of new features and bug fixes, but most importantly this release has an API that completely matches the book Metaprogramming GPUs with Sh, which the 0.8.x series of releases will stick to. (http://libsh.org)

Sh Version 0.7.8 Released

July 1st, 2005

A new version of the Sh language for GPU programming in C++ has been released. This version features a new backend infrastructure that allows, for example, running part of a stream application on the GPU and part on the CPU at the same time. Many other fixes and platform compatibility enhancements were also added. (http://libsh.org)

Sh Version 0.7.7 Released

April 27th, 2005

Version 0.7.7 of the Sh GPU Metaprogramming Language is now released. Sh allows GPUs to be programmed directly using C++. This version features a back end for the OpenGL Shading Language, Mac OS X support, and major speed improvements for stream programs (the GPGPU subset of Sh). (http://libsh.org)

Scout: A Hardware-Accelerated System for Quantitatively Driven Visualization and Analysis

October 20th, 2004

This IEEE Visualization 2004 paper by McCormick et al. describes the Scout System and Language that allow the GPU to be programmed for scientific visualization. Scout uses a data parallel language that allows the user to program visual mappings from data values to the final rendered result. These techniques can be used to replace standard user interface components, such as the transfer function editor commonly used in volume rendering. (“Scout: A Hardware-Accelerated System for Quantitatively Driven Visualization and Analysis”, Patrick S. McCormick, Jeff Inman, James P. Ahrens, Chuck Hansen and Greg Roth, In Proceedings IEEE Visualization 2004, pages 171-178, October 2004.)

Cg 1.3 Beta 2 Released

August 19th, 2004

Cg Release 1.3 Beta 2 has been released with support for the latest GeForce 6 Series (NV4X) GPUs. This version of Cg offers the following features and improvements:

  • New vp40 profile, which enables texture sampling from within vertex programs
  • New fp40 profile, which provides a robust branching model in fragment programs, and support for output to multiple draw buffers (“MRTs”)
  • Support for writing more than one color output (i.e., MRTs) in the arbfp1 and ps_2* profiles
  • New semantics to access OpenGL fixed-function state vectors from within ARB_vertex_program and ARB_fragment_program
  • New “-fastprecision” option for arbfp*, fp30, and fp40 profiles, to use reduced precision storage (fp16) when appropriate
  • Support for 16 profiles


Ashli 1.4.0 Released

June 25th, 2004

ATI’s Ashli version 1.4.0 has been released and is available for download from the Ashli home page. Ashli is a toolkit intended to assist developers exploring programmable shading on GPUs. It supports a reasonable subset of the OpenGL (GLSL), Microsoft DirectX (HLSL), and RenderMan shading languages. Ashli’s significant contribution is in hardware resource virtualization, segmenting a complex shader program into GPU-realizable streams. The posted Ashli viewer application demonstrates the use of shader partitions in a multi-pass rendering context. Ashli outputs both metadata and code, orthogonal to any of the languages supported. Targets include OpenGL ARB_vertex_program and ARB_fragment_program, and the DirectX 9.0 Vertex Shader and Pixel Shader version 2.0 and 2.X APIs. Optionally, Ashli emits a unified Microsoft FX file format, embedding progressive techniques of state and code sections. (Ashli 1.4.0)

Site News: New Brook for GPUs Forums Added

May 27th, 2004

In cooperation with the creators of BrookGPU, GPGPU.org has added discussion forums for beginner and general/advanced Brook topics. Brook users of all levels can use these forums to discuss questions, experiences, and other information with other Brook users and with the developers of BrookGPU.

Brook for GPUs: Stream Computing on Graphics Hardware

May 24th, 2004

This SIGGRAPH 2004 paper by Buck et al. presents Brook for GPUs, a system for general-purpose computation on programmable graphics hardware. Brook extends C to include simple data-parallel constructs, enabling the use of the GPU as a streaming coprocessor. The paper presents a compiler and runtime system that abstracts and virtualizes many aspects of graphics hardware. In addition, the paper analyzes the effectiveness of the GPU as a compute engine compared to the CPU, to determine when the GPU can outperform the CPU for a particular algorithm. The paper evaluates the system with five applications: the SAXPY and SGEMV BLAS operators, image segmentation, FFT, and ray tracing. For these applications, the Brook implementations perform comparably to hand-written GPU code and up to seven times faster than their CPU counterparts. (Brook for GPUs: Stream Computing on Graphics Hardware. Ian Buck, Tim Foley, Daniel Horn, Jeremy Sugerman, Kayvon Fatahalian, Mike Houston, and Pat Hanrahan. To appear at SIGGRAPH 2004.)
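
For readers unfamiliar with the stream model, the SAXPY operator mentioned above can be sketched in plain C++ as a per-element kernel applied over input streams. This is a CPU-only illustration, not Brook syntax and not code from the paper; the helper name map_streams is hypothetical, and the sequential loop stands in for the parallel execution a GPU would provide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Brook): applies a per-element "kernel"
// to two equally sized input streams and produces one output element per
// input pair. On a GPU the elements would be processed in parallel; here
// a sequential loop stands in for that.
template <typename Kernel>
std::vector<float> map_streams(const std::vector<float>& x,
                               const std::vector<float>& y,
                               Kernel kernel) {
    assert(x.size() == y.size());  // streams must have the same length
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = kernel(x[i], y[i]);
    }
    return out;
}

// SAXPY: result[i] = alpha * x[i] + y[i].
std::vector<float> saxpy(float alpha,
                         const std::vector<float>& x,
                         const std::vector<float>& y) {
    return map_streams(x, y, [alpha](float xi, float yi) {
        return alpha * xi + yi;
    });
}
```

The kernel touches only its own pair of elements and has no dependence on neighboring results, which is the property that lets a streaming coprocessor evaluate all elements independently.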
