NVIDIA First to Roll out OpenCL Drivers & SDK

April 20th, 2009

From an NVIDIA Press Release:

SANTA CLARA, CA—APRIL 20, 2009—NVIDIA Corporation, the inventor of the GPU, today announced the release of its OpenCL driver and software development kit (SDK) to developers participating in its OpenCL Early Access Program. NVIDIA is providing this release to solicit early feedback in advance of a beta release which will be made available to all GPU Computing Registered Developers in the coming months.

Developers can apply to become a GPU Computing Registered Developer at: www.nvidia.com/opencl

“The OpenCL standard was developed on NVIDIA GPUs and NVIDIA was the first company to demonstrate OpenCL code running on a GPU,” said Tony Tamasi, senior vice president of technology and content at NVIDIA. “Being the first to release an OpenCL driver to developers cements NVIDIA’s leadership in GPU Computing and is another key milestone in our ongoing strategy to make the GPU the soul of the modern PC.”

At the core of NVIDIA®’s GPU Computing strategy is the massively parallel CUDA™ architecture that NVIDIA pioneered and has been shipping since 2006. Accessible today through familiar industry standard programming environments such as C, Java, Fortran and Python, the CUDA architecture supports all manner of computational interfaces and, as such, is a perfect complement to OpenCL. Enabled on over 100 million NVIDIA GPUs, the CUDA architecture is enabling developers to innovate with the GPU and unleash never before seen performance across a wide range of applications.

University of New South Wales Workshop on GPU Computing with CUDA

April 20th, 2009

Update: Slides from the UNSW GPU Computing Workshop are now available at the workshop website.

This half-day Workshop on High Performance GPU Computing with NVIDIA CUDA will be hosted by the Computer Science & Engineering department of the University of New South Wales, Sydney, Australia next Friday, April 17, 2009.  The workshop will provide an introduction to the CUDA architecture, programming model, and the programming environment of C for CUDA, as well as an overview of the Tesla GPU architecture, a live programming demo, and strategies for optimizing CUDA applications for the GPU. The workshop will also include a brief presentation of some of the projects using CUDA within the School of Computer Science and Engineering, UNSW, and of the hardware requirements for getting started with CUDA. The speakers are Mark Harris (NVIDIA), Manuel Chakravarty (UNSW), and Dragan Dimitrovici (Xenon Systems).  Registration is free, but mandatory, and the number of seats is limited to 50. For more information and registration details, visit the workshop webpage.

38th SPEEDUP Workshop on High-Performance Computing

April 15th, 2009

The 2009 SPEEDUP workshop will focus on multicore computing and parallel languages. Topics include, but are not limited to, OpenCL, NVIDIA CUDA, the Cell processor, and GPU computing. The event will take place in Lausanne, Switzerland, on September 7 and 8, 2009. The second day features a tutorial on GPU Computing with NVIDIA CUDA, organized by Dominik Göddeke (TU Dortmund), Robert Strzodka (Max Planck Institute Informatik) and Christian Sigg (NVIDIA).

Workshop on Architecture-aware Simulation and Computing

April 15th, 2009

The 2009 workshop on Architecture-aware Simulation and Computing, held in conjunction with the 2009 International Conference on High Performance Computing & Simulation (HPCS 2009), will include a couple of talks on GPU computing. Please see the workshop website for more information. Registration information and the full conference program will be available soon.

Minisymposium and Tutorial on GPU Computing at PPAM 2009

April 15th, 2009

The paper deadline for the Minisymposium on GPU Computing at the 8th International Conference on Parallel Processing and Applied Mathematics has been extended to April 30. The minisymposium is organized by Jose R. Herrero, Enrique S. Quintana-Orti and Robert Strzodka, and will take place September 13-16, 2009, in Wroclaw, Poland.

PPAM is also happy to announce a full day tutorial on GPU Computing, organized by Robert Strzodka and Dominik Göddeke. The program and list of speakers will be available soon.

eResearch South Australia Workshop: High Performance GPU Computing with NVIDIA CUDA

April 14th, 2009

This workshop, hosted by eResearch SA and presented by Mark Harris (NVIDIA) with Dragan Dimitrovici (Xenon Systems), aims to provide a detailed introduction to GPU computing with CUDA and NVIDIA GPUs such as the Tesla series of high-performance computing processors.

The workshop will be held from 9:00 to 13:00 on Tuesday, 28th April, in the Henry Ayers Room, Ayers House, 288 North Terrace, Adelaide (opposite the Royal Adelaide Hospital).

CUDA is NVIDIA’s revolutionary parallel computing architecture for GPUs. The available software tools include a C compiler for developers to build applications, as well as useful libraries for high-performance computing (BLAS, FFT, etc). Several widely-used scientific applications have been ported to run on GPUs using CUDA. This half-day workshop will provide an introduction to the CUDA architecture, programming model, and the programming environment of C for CUDA, as well as an overview of the Tesla GPU architecture, a live programming demo, and strategies for optimizing CUDA applications for the GPU. The workshop will also include a brief presentation of some of the current NVIDIA hardware offerings for GPU computing using CUDA.

The workshop is free, but space is limited. For complete details and registration, visit the workshop web page or download the brochure.

Efficient Acceleration of Asymmetric Cryptography on Graphics Hardware

April 13th, 2009

Abstract from the paper:

We present implementations of large integer modular exponentiation, the core of public-key cryptosystems such as RSA, on a DirectX 10 compliant GPU. We present high performance modular exponentiation implementations based on integers represented in both standard radix form and residue number system form. We show how a GPU implementation of a 1024-bit RSA decrypt primitive can outperform a comparable CPU implementation by up to 4 times and also improve the performance of previous GPU implementations by decreasing latency by up to 7 times and doubling throughput. We present how an adaptive approach to modular exponentiation involving implementations based on both a radix and a residue number system gives the best all-around performance on the GPU both in terms of latency and throughput. We also highlight the usage criteria necessary to allow the GPU to reach peak performance on public key cryptographic operations.

(Owen Harrison, John Waldron. Efficient Acceleration of Asymmetric Cryptography on Graphics Hardware. AfricaCrypt 2009, June 21-25, 2009, Gammarth, Tunisia. To Appear.)
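For readers unfamiliar with the two number representations named in the abstract, the following is a minimal CPU-side sketch in Python. It is not the authors' GPU implementation. The function names, the toy RSA parameters, and the small moduli set are all assumptions chosen for illustration. The first function is plain square-and-multiply modular exponentiation on a standard (radix) integer. The remaining helpers show the idea behind a residue number system (RNS): a large integer is stored as independent small residues, so a multiplication splits into per-residue operations that could, in principle, run in parallel.

```python
from math import prod

def mod_exp(base: int, exponent: int, modulus: int) -> int:
    """Left-to-right square-and-multiply: computes base**exponent % modulus."""
    if modulus <= 0 or exponent < 0:
        raise ValueError("modulus must be positive and exponent non-negative")
    result = 1 % modulus
    base %= modulus
    for bit in bin(exponent)[2:]:          # scan exponent bits, most significant first
        result = (result * result) % modulus
        if bit == "1":
            result = (result * base) % modulus
    return result

# Hypothetical pairwise-coprime moduli for the RNS illustration.
MODULI = (251, 253, 255, 256)

def to_rns(x: int, moduli=MODULI) -> list[int]:
    """Represent x by its residues modulo each base element."""
    return [x % m for m in moduli]

def rns_mul(a: list[int], b: list[int], moduli=MODULI) -> list[int]:
    """Multiply two RNS numbers; every channel is independent of the others."""
    return [(x * y) % m for x, y, m in zip(a, b, moduli)]

def from_rns(residues: list[int], moduli=MODULI) -> int:
    """Reconstruct the integer with the Chinese Remainder Theorem (Python 3.8+)."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        partial = big_m // m
        total += r * partial * pow(partial, -1, m)   # modular inverse of partial mod m
    return total % big_m

# Toy RSA round trip with textbook-sized (insecure) parameters.
n, e, d = 3233, 17, 2753
ciphertext = mod_exp(65, e, n)
plaintext = mod_exp(ciphertext, d, n)
```

The RNS result is only valid while the true product stays below the product of the moduli, which is one reason practical RNS schemes need careful base selection. The paper's specific choices are not reproduced here.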

Optimizing Sparse Matrix-Vector Multiplication on GPUs

April 13th, 2009

In this paper, the various challenges in developing a high-performance SpMV kernel on NVIDIA GPUs using the CUDA programming model are evaluated, and optimizations are proposed to effectively address them. The optimizations include: (1) exploiting synchronization-free parallelism, (2) optimized thread mapping based on the affinity towards optimal memory access pattern, (3) optimized off-chip memory access to tolerate the high access latency, and (4) exploiting data locality and reuse. The authors evaluate these optimizations on two classes of NVIDIA GPUs, namely, GeForce 8800 GTX and GeForce GTX 280, and compare the performance of their approach with that of existing parallel SpMV implementations such as (1) the SpMV library of Bell and Garland, (2) the CUDPP library, and (3) an implementation using an optimized segmented scan primitive. Their approach outperforms the CUDPP and segmented scan implementations by a factor of 2 to 8, and achieves up to 15% improvement over Bell and Garland’s SpMV library (Dec 8, 2008 version).

(Muthu Manikandan Baskaran, Rajesh Bordawekar. Optimizing Sparse Matrix-Vector Multiplication on GPUs. IBM Technical Report RC24704. 2008.)
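As background, here is a minimal serial sketch of the SpMV operation itself, written in Python. It assumes the compressed sparse row (CSR) layout, which the summary above does not name, and it does not implement any of the paper's optimizations. The function and variable names are illustrative. The point to notice is that each output row depends only on its own slice of the matrix, which is what makes row-level parallelism without synchronization possible in the first place.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]:row_ptr[i+1] delimits the nonzeros of row i,
    col_idx[k] is the column of the k-th nonzero, values[k] its value.
    """
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    for row in range(num_rows):            # rows are independent of one another
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]   # indirect (gather) access into x
        y[row] = acc
    return y

# Example matrix:
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
y = spmv_csr(row_ptr, col_idx, values, [1.0, 2.0, 3.0])
```

The indirect read `x[col_idx[k]]` is the irregular memory access that the memory-related optimizations listed above are aimed at.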

Monte Carlo simulations on Graphics Processing Units

April 13th, 2009

Abstract:

Implementation of basic local Monte-Carlo algorithms on ATI Graphics Processing Units (GPU) is investigated. The Ising model and pure SU(2) gluodynamics simulations are realized with the Compute Abstraction Layer (CAL) of ATI Stream environment using the Metropolis and the heat-bath algorithms, respectively. We present an analysis of both CAL programming model and the efficiency of the corresponding simulation algorithms on GPU. In particular, the significant performance speed-up of these algorithms in comparison with serial execution is observed.

(Vadim Demchik, Alexei Strelchenko. Monte Carlo simulations on Graphics Processing Units. arXiv:0903.3053 [hep-lat].)
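For context, a single-threaded Python sketch of the Metropolis update for the Ising model follows. It assumes a two-dimensional square lattice with periodic boundaries, unit coupling, and no external field. None of these details are stated in the abstract, and the code is unrelated to the authors' CAL implementation. All names are illustrative.

```python
import math
import random

def metropolis_sweep(spins, beta, rng):
    """Attempt n*n single-spin flips on an n x n lattice; return accepted count."""
    n = len(spins)
    accepted = 0
    for _ in range(n * n):
        i, j = rng.randrange(n), rng.randrange(n)
        s = spins[i][j]
        # Sum of the four nearest neighbours with periodic wrap-around.
        neighbours = (spins[(i + 1) % n][j] + spins[(i - 1) % n][j]
                      + spins[i][(j + 1) % n] + spins[i][(j - 1) % n])
        delta_e = 2 * s * neighbours       # energy change if this spin is flipped
        # Metropolis rule: always accept downhill moves, uphill with Boltzmann weight.
        if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
            spins[i][j] = -s
            accepted += 1
    return accepted

def magnetization(spins):
    """Absolute mean spin per site."""
    n = len(spins)
    return abs(sum(sum(row) for row in spins)) / (n * n)

rng = random.Random(0)                      # fixed seed for reproducibility
lattice = [[1] * 8 for _ in range(8)]
for _ in range(10):
    metropolis_sweep(lattice, beta=0.6, rng=rng)
```

Because each update reads only nearest neighbours, the algorithm is local. This sketch makes no attempt to show how the updates would be distributed across GPU threads.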

Molecular dynamics on NVIDIA GPUs with speed-ups up to two orders of magnitude

April 13th, 2009

ACEMD is a production-class bio-molecular dynamics (MD) simulation program designed specifically for GPUs. It achieves supercomputing-scale performance of 40 nanoseconds/day for all-atom protein systems with over 23,000 atoms. With GPU technology it has become possible to run a microsecond-long trajectory for an all-atom molecular system in explicit water on a single workstation computer equipped with just 3 GPUs. This performance would have required over 100 CPU cores. Visit the project website for details.

(M. J. Harvey, G. Giupponi, G. De Fabritiis, ACEMD: Accelerating bio-molecular dynamics in the microsecond time-scale. Link to preprint.)
