Supercomputing 2009 Tutorial: High-Performance Computing with CUDA

November 30th, 2009

The presentation slides from the Supercomputing 2009 full-day tutorial “High-Performance Computing with CUDA” are now available at


NVIDIA’s CUDA is a general-purpose architecture for writing highly parallel applications. CUDA provides several key abstractions (a hierarchy of thread blocks, shared memory, and barrier synchronization) for scalable high-performance parallel computing. Scientists throughout industry and academia use CUDA to achieve dramatic speedups on production and research codes. The CUDA architecture supports many languages, programming environments, and libraries, including C, Fortran, OpenCL, DirectX Compute, Python, MATLAB, FFT, and LAPACK.
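As a rough illustration of how the three abstractions named above fit together, here is a minimal sketch in C for CUDA. It is not taken from the tutorial slides. The names `blockSum`, `sumWithBlocks`, and `kThreadsPerBlock` are hypothetical, and the block size of 256 is an arbitrary choice. Each thread block sums its own slice of the input in shared memory, using barriers between steps, and the host adds up the per-block partial sums.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Illustrative block size (an assumption); the reduction below needs a power of two.
constexpr int kThreadsPerBlock = 256;

// Each thread block reduces its slice of `in` to one value in `partial`.
__global__ void blockSum(const float* in, float* partial, int n) {
    __shared__ float tile[kThreadsPerBlock];  // shared memory: private to one block

    int tid = threadIdx.x;                            // index within the block
    int gid = blockIdx.x * blockDim.x + threadIdx.x;  // index within the whole grid

    tile[tid] = (gid < n) ? in[gid] : 0.0f;  // pad the last block with zeros
    __syncthreads();  // barrier: every load is done before any thread reads tile

    // Tree reduction: halve the number of active threads at each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) tile[tid] += tile[tid + stride];
        __syncthreads();  // barrier: finish this step before starting the next
    }
    if (tid == 0) partial[blockIdx.x] = tile[0];
}

// Hypothetical host helper: writes the sum of in[0..n) to *result.
// Returns false if any CUDA call fails.
bool sumWithBlocks(const float* in, int n, double* result) {
    *result = 0.0;
    if (n <= 0) return true;

    const int blocks = (n + kThreadsPerBlock - 1) / kThreadsPerBlock;
    std::vector<float> partial(blocks);
    float* dIn = nullptr;
    float* dPartial = nullptr;

    bool ok = cudaMalloc(&dIn, n * sizeof(float)) == cudaSuccess &&
              cudaMalloc(&dPartial, blocks * sizeof(float)) == cudaSuccess &&
              cudaMemcpy(dIn, in, n * sizeof(float),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        blockSum<<<blocks, kThreadsPerBlock>>>(dIn, dPartial, n);
        // The blocking copy back also waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(partial.data(), dPartial, blocks * sizeof(float),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dIn);       // freeing a null pointer is a no-op
    cudaFree(dPartial);
    if (!ok) return false;

    for (float p : partial) *result += p;  // combine per-block sums on the host
    return true;
}
```

Calling `sumWithBlocks(data, n, &total)` from a host program compiled with `nvcc` launches one block per 256 elements. Blocks never communicate with each other inside the kernel, which is what lets the same code scale across GPUs with different numbers of multiprocessors.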

In this tutorial NVIDIA engineers will partner with academic and industrial researchers to present CUDA and discuss its advanced use for science and engineering domains. The morning session will introduce CUDA programming, motivate its use with many brief examples from different HPC domains, and discuss tools and programming environments. The afternoon will discuss advanced issues such as optimization and sophisticated algorithms/data structures, closing with real-world case studies from domain scientists using CUDA for computational biophysics, fluid dynamics, seismic imaging, and theoretical physics.

NVIDIA’s October GPU Computing Webinars now open for registration

October 21st, 2009

These webinars cover many topics including an introduction to C for CUDA, the OpenCL™ API, and performance optimization techniques, presented by NVIDIA DevTech Engineers with additional staff online to answer questions.

Full Schedule and short abstracts can be viewed at:

GPGPU and MPI Advanced Training Courses

September 22nd, 2009

The Centre for Scientific Computing of the University of Cambridge is running advanced training courses this October covering GPGPU (with CUDA) and MPI. The courses offer a combination of lectures and hands-on tutorials. They will be taught by Dr Mike Kirby, co-author (with George Karniadakis) of the textbook “Parallel Scientific Computing in C++ and MPI” (Cambridge University Press 2003).

At the end of the courses the participants will be qualified to start building their own parallel codes from scratch, or to further develop existing packages. More information is available on the course website.

General-Purpose GPU Programming Short Course

September 22nd, 2009

Cranfield University, UK is pleased to offer a brand new 3-day course which introduces the programming techniques required to develop general-purpose software applications for GPU hardware. Using NVIDIA’s CUDA framework, the course will focus on the solution to common problems encountered whilst developing numerical applications on the GPU. This will include an introduction to the programming techniques required to take advantage of the architecture, as well as more advanced optimisation methodologies needed to get the most out of the platform.

Topics include:

  • CUDA Programming Model
  • GPU Device Architecture
  • Performance Optimisation

The course is being held at Cranfield University from 30th November to 2nd December. More information is available on the course website.

SPEEDUP and PPAM Conference Tutorials Available

September 16th, 2009

Slides from two full-day conference tutorials are now available:

Both tutorials present basics and advanced topics of scientific computing on GPUs, including ready-to-use GPU libraries, GPU architecture, case studies and many hands-on examples.

NVIDIA’s GPU Technology Conference Announces Advanced Sessions on CUDA Programming

August 6th, 2009

The GPU Technology Conference will be held Sept 30-Oct 2, 2009 in San Jose, Calif. This event will focus on the latest breakthroughs that developers, engineers and researchers are achieving through the use of the GPU. Learn more at

Session abstracts and speakers can be found on the Agenda page. Sessions announced to date include:

  • Advanced C for CUDA
  • CUDA Fortran Programming for NVIDIA GPUs
  • What Every CUDA Programmer Needs to Know about OpenGL
  • Debugging tools for CUDA
  • Using CUDA within Mathematica
  • The TotalView Debugger for CUDA
  • OPLib: A GPL Library of Elementary Pricing Functions in CUDA/OpenCL and OpenMP
  • Par4All: Auto-Parallelizing C and Fortran for the CUDA Architecture

More sessions are to be announced.

Beyond Programmable Shading SIGGRAPH 2009 Course

August 6th, 2009

The course notes and supplementary material for “Beyond Programmable Shading”, a full-day course held at SIGGRAPH 2009 on August 6, are now available online.

This course is presented in two parts, Beyond Programmable Shading I and Beyond Programmable Shading II.

There are strong indications that the future of interactive graphics programming is a more flexible model than today’s OpenGL/Direct3D pipelines. Graphics developers need a basic understanding of how to combine emerging parallel programming techniques and more flexible graphics processors with the traditional interactive rendering pipeline. The first half of the course introduces the trends and directions in this emerging field. Topics include: parallel graphics architectures, parallel programming models for graphics, and game-developer investigations of the use of these new capabilities in future rendering engines.

The second half of the course has leaders from graphics hardware vendors, game development, and academic research present case studies that show how general parallel computation is being combined with the traditional graphics pipeline to boost image quality and spur new graphics algorithm innovation. Each case study discusses the mix of parallel programming constructs used, details of the graphics algorithm, and how the rendering pipeline and computation interact to achieve the technical goals.

NVIDIA’s August GPU Computing Webinars now open for registration

August 4th, 2009

These webinars cover many topics including an introduction to C for CUDA, the OpenCL™ API, and performance optimization techniques, presented by NVIDIA DevTech Engineers with additional staff online to answer questions.

Full Schedule and short abstracts can be viewed at:

NVIDIA GPU Computing Tutorial Webinar Series covers C for CUDA, OpenCL, and DirectX Compute

July 20th, 2009

This series will cover the basics of data-parallel computing on GPUs using NVIDIA’s CUDA architecture. Tutorials will cover many topics, including C for CUDA, programming to the OpenCL API, using DirectX Compute, and performance optimization techniques. They are presented by the NVIDIA Developer Technology Engineering team, with NVIDIA staff online to answer questions.

All dates and times are for California, USA. Please follow the links to register for the webinars. The webinar system will send emails confirming your registration and reminding you of it. The webinars are repeated in the morning and evening, Pacific time, so that developers all over the world can choose a time that suits them.

Register here.

ISC 2009 CUDA/OpenCL Tutorial Slides Posted

June 25th, 2009

A tutorial on High Performance Computing with CUDA was held at the International Supercomputing Conference in Hamburg on Monday, June 22nd, 2009. The tutorial included an introduction to the CUDA programming model and C for CUDA, along with details on the CUDA Toolkit, libraries, and optimization. The tutorial also provided an introduction to OpenCL, and finished with a case study on computational fluid dynamics by Dr. Graham Pullan from Cambridge University. Slides from the tutorial are now posted here on

(Massimiliano Fatica, Timo Stich, and Graham Pullan. High Performance Computing with CUDA. Tutorial. International Supercomputing Conference 2009. Hamburg, Germany.)