June 8th, 2009
R is a popular open-source environment for statistical computing, widely used in many application domains. The ongoing R+GPU project is devoted to moving frequently used R functions, mostly functions used in biomedical research, to the GPU using CUDA. If a CUDA-compatible GPU and driver are present on a user’s machine, the user often needs only to prefix “gpu” to the original function name to take advantage of the GPU implementation of the corresponding R function.
Speedup measurements of the current implementation range as high as 80x, and contributions to the code base are cordially invited. R+GPU is developed at the University of Michigan’s Molecular and Behavioral Neuroscience Institute.
June 8th, 2009
NVIDIA is offering a series of free GPU computing webinars covering a range of topics from a basic introduction to the CUDA architecture to advanced topics such as data structure optimization and multi-GPU usage.
Several webinars are already scheduled; attendees are encouraged to pick the date and time that best suit their schedules. Visit the NVIDIA GPU Computing Online Seminars webpage for registration and further information. Additional webinars will be scheduled over the next few months, so check for future alerts and visit the online seminar schedule page often.
June 4th, 2009
NVIDIA NVPP is a library of functions for CUDA-accelerated processing. The initial set of functionality focuses on imaging and video processing and is widely applicable for developers in these areas. NVPP will evolve over time to encompass more of the compute-heavy tasks in a variety of problem domains. The NVPP library is written to maximize flexibility while maintaining high performance.
NVPP can be used in one of two ways:
- A stand-alone library for adding GPU acceleration to an application with minimal effort. Using this route allows developers to add GPU acceleration to their applications in a matter of hours.
- A cooperative library that interoperates efficiently with a developer’s own GPU code.
Either route allows developers to harness the massive compute resources of NVIDIA GPUs while reducing development time. The NVPP API matches the Intel Performance Primitives (IPP) library API, so porting existing IPP code to the GPU is straightforward. For more information and to sign up for access to the beta release of NVPP, visit the NVPP website.
May 31st, 2009
F2C-ACC is a language translator that converts Fortran code into C and into C for CUDA. The goal of the project is to reduce the time needed to convert and adapt existing large-scale Fortran applications to run on CUDA-accelerated clusters, and to reduce the effort of maintaining both Fortran and CUDA implementations. Both translations are useful: the C output can be used for testing and as a base code for running on the IBM Cell processor, while the generated C for CUDA serves as a basis for running on the GPU. The current implementation does not yet support all language constructs, but the generated human-readable code can be used as a starting point for further manual adaptation and optimization.
F2C-ACC is developed by Mark Govett et al. at the NOAA Earth System Research Laboratory, and has been presented at the Path to Petascale NCSA/UIUC workshop on applications for accelerators and accelerator clusters.
May 25th, 2009
Thrust is an open-source template library for data-parallel CUDA applications featuring an interface similar to the C++ Standard Template Library (STL). Thrust provides a flexible, high-level interface for GPU programming that greatly enhances developer productivity while maintaining high performance. Note that Thrust supersedes Komrade, the initial release of the library; all future development will proceed under the new name.
Thrust is open source under the Apache 2.0 license and available now at http://thrust.googlecode.com. Download Thrust and check out the Thrust tutorial to get started.
The thrust::host_vector and thrust::device_vector containers simplify memory management and transfers between host and device. Thrust provides efficient algorithms for:
- sorting – thrust::sort and thrust::sort_by_key
- transformations – thrust::transform
- reductions – thrust::reduce and thrust::transform_reduce
- scans – thrust::inclusive_scan and thrust::transform_inclusive_scan
- And many more!
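As a brief sketch of the STL-like interface described above (this example assumes a CUDA-capable GPU and compilation with nvcc; the data sizes are illustrative):

```cpp
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/reduce.h>
#include <thrust/copy.h>
#include <cstdlib>

int main(void)
{
    // Generate random data on the host.
    thrust::host_vector<int> h_vec(1 << 20);
    for (size_t i = 0; i < h_vec.size(); ++i)
        h_vec[i] = std::rand() % 1000;

    // Copy to the device; the assignment performs the host-to-device transfer.
    thrust::device_vector<int> d_vec = h_vec;

    // Sort and reduce on the GPU.
    thrust::sort(d_vec.begin(), d_vec.end());
    int sum = thrust::reduce(d_vec.begin(), d_vec.end(), 0);

    // Copy the sorted data back to the host.
    thrust::copy(d_vec.begin(), d_vec.end(), h_vec.begin());

    return sum >= 0 ? 0 : 1;
}
```

Note how the container assignment replaces explicit cudaMemcpy calls, and the algorithms mirror their STL counterparts.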
May 25th, 2009
MemtestG80 is a software-based tester that checks for “soft errors” in the memory or logic of NVIDIA CUDA-enabled GPUs. It uses a variety of proven test patterns (some custom, some based on Memtest86) to verify the correct operation of GPU memory and logic. It is a useful tool for ensuring that a given GPU does not produce “silent errors” that could corrupt the results of a computation without triggering an overt error.
Precompiled binaries for Windows, Linux and OS X, as well as the source code, are available for download under the LGPL license. MemtestG80 is developed by Imran Haque and Vijay Pande.
May 12th, 2009
GPUmat, developed by the GP-You Group, allows Matlab code to benefit from the compute power of modern GPUs. It is built on top of NVIDIA CUDA. The acceleration is transparent to the user, only the declaration of variables needs to be changed using new GPU-specific keywords. Algorithms need not be changed. A wide range of standard Matlab functions have been implemented. GPUmat is available as freeware for Windows and Linux from the GP-You download page.
May 4th, 2009
A half-day workshop and discussion forum will be held from 8:45-13:00, Wednesday May 27, in Lecture theatre 3 of the Alan Gilbert Building at The University of Melbourne, Victoria, Australia. A light lunch will be supplied afterwards from 13:00-14:00. With speakers from NVIDIA and Xenon Systems, this workshop is hosted by the ARC Centre of Excellence for Mathematics and Statistics of Complex Systems (MASCOS), and the Department of Mathematics and Statistics at the University of Melbourne.
Due to recent advances in GPU hardware and software, so-called general-purpose GPU computing (GPGPU) is rapidly expanding from niche applications into mainstream high-performance computing. For HPC researchers, hardware gains have increased the imperative to learn this new computing paradigm, while high-level programming languages (in particular, CUDA) have lowered the barrier to entry, so that new developers can now rapidly port suitable applications from C/C++ running on CPUs to CUDA running on GPUs. For appropriate applications, GPUs have significant, even dramatic, advantages over CPUs in terms of both dollars per FLOPS and watts per FLOPS.
For more information see the workshop announcement.
April 29th, 2009
Barra, developed by Sylvain Collange, Marc Daumas, David Defour and David Parello from Université de Perpignan, simulates CUDA programs at the assembly language level (NVIDIA PTX ISA). Its ultimate goal is to provide a 100% bit-accurate simulation, offering bug-for-bug compatibility with NVIDIA G80-based GPUs. It works directly with CUDA executables; neither source modification nor recompilation is required. Barra is primarily intended as a tool for research on computer architecture, although it can also be used to debug, profile and optimize CUDA programs at the lowest level. For more details and downloads, see the Barra wiki. A technical report is also available.
A GPU computing workshop and discussion forum will be held at the UWA University Club on Thursday, May 7th. The workshop aims to provide a detailed introduction to GPU computing with CUDA and NVIDIA Tesla computing solutions, and to present research in GPU and heterogeneous computing being undertaken in Western Australia.
Mark Harris (NVIDIA) will present an introduction to the CUDA architecture, programming model, and the programming environment of C for CUDA, as well as an overview of the Tesla GPU architecture, a live programming demo, and strategies for optimizing CUDA applications for the GPU. To better enable the uptake of this technology, Dragan Dimitrovici from Xenon Systems will provide an overview of CUDA-enabled hardware options. The workshop will also include brief presentations of some of the projects using CUDA within Western Australia, including a presentation from Professor Karen Haines (WASP@UWA) on parallel computing strategies required for optimizing applications for GPU and heterogeneous computing.
Please see the workshop flyer for full details.