This tutorial by Dan Cyca outlines the shared memory configurations for NVIDIA Fermi and Kepler architectures, and demonstrates how to rewrite kernels to take advantage of the changes in Kepler’s shared memory architecture.
This webinar recording provides an overview of the profiling techniques and tools available to help you optimize your code. It examines NVIDIA’s Visual Profiler and cuobjdump, and highlights the various methods available for understanding the performance of a CUDA program. The second part of the session focuses on debugging techniques and the tools available to help identify issues in kernels. It discusses the debugging tools provided in CUDA 5.5, including Nsight and cuda-memcheck. The webinar recording can be accessed here.
The Virtual School of Computational Science and Engineering is hosting two upcoming webinars.
- Introduction to HOOMD-blue, December 10, 2013, 11:00 EST.
- Using HOOMD-blue for Polymer Simulations and Big Systems, January 21, 2014, 11:00 EST.
More information and registration: http://www.vscse.org/
One of the keys to achieving maximum performance in CUDA is taking advantage of the various memory spaces. Part II of Acceleware’s tutorial has now been published. The tutorial uses a simple encryption kernel to test and compare read-only cache, constant cache and global memory. Read the full tutorial…
This blog post takes a closer look at the constant cache and the read-only cache. It highlights the differences between the two and the circumstances in which each performs best. Read the whole story here.
Acceleware recently announced several courses:
- CUDA for Finance: December 10 – 13, 2013, New York, NY [Details]
- OpenCL: October 22 – 25, 2013, Houston, TX [Details]
- CUDA: September 24 – 27 [Details]
- C++ AMP: September 10 – 13 [Details]
Developed in partnership with NVIDIA, this hands-on four-day course will teach students how to write and optimize applications that fully leverage the multi-core processing capabilities of the GPU. The course is taught by Acceleware developers who bring real-world experience to the classroom. Students will benefit from:
- Hands-on exercises and progressive lectures
- Individual laptops equipped with NVIDIA GPUs for student use
- Small class sizes to maximize learning
July 29 – August 1, 2013, San Jose, CA, USA. More information: http://www.acceleware.com/training/913
This webinar will present CUDA, focusing on practical aspects. It will be conducted by APC with support from NVIDIA, and will be held on Thursday, May 16, 2013, from 11:00 am to 12:00 noon Moscow time. Participants are asked to register at https://attendee.gotowebinar.com/register/8697482572284069888
The LEAP (Low-energy application parallelism) conference hosts an interactive tutorial on applying formal analysis and verification techniques to OpenCL and CUDA kernels on Wednesday, 22 May 2013, in London, UK. Whether working on kernels for supercomputing, finance or mobile applications, developers will learn how to overcome common pitfalls in GPU programming such as data races and barrier divergence. Using plenty of worked examples and demos to encourage interactive discussion, the session will highlight the practical benefits of using formal verification techniques to prove that kernels are free from defects. More information: http://www.leapconf.com