Monte Carlo simulations on Graphics Processing Units

April 13th, 2009


The implementation of basic local Monte Carlo algorithms on ATI Graphics Processing Units (GPUs) is investigated. The Ising model and pure SU(2) gluodynamics simulations are realized with the Compute Abstraction Layer (CAL) of the ATI Stream environment using the Metropolis and the heat-bath algorithms, respectively. We present an analysis of both the CAL programming model and the efficiency of the corresponding simulation algorithms on the GPU. In particular, a significant performance speed-up of these algorithms in comparison with serial execution is observed.

(Vadim Demchik, Alexei Strelchenko. Monte Carlo simulations on Graphics Processing Units. arXiv:0903.3053 [hep-lat].)
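
For readers unfamiliar with the Metropolis algorithm mentioned in the abstract, here is a minimal serial sketch for the 2D Ising model. It is not the authors' CAL code. The checkerboard (even/odd) update order, the coupling J = 1, the periodic boundaries, and the function names are all assumptions made for illustration. The checkerboard split is shown because sites of one parity only have neighbours of the other parity, so each half-sweep could in principle be updated in parallel.

```python
import math
import random


def metropolis_sweep(spins, beta, rng):
    """One checkerboard Metropolis sweep over a square Ising lattice.

    Assumptions (not from the paper): J = 1, no external field,
    periodic boundaries, and an even lattice size so that the
    even/odd colouring is consistent across the boundary.
    """
    n = len(spins)
    for parity in (0, 1):  # update all "even" sites, then all "odd" sites
        for i in range(n):
            for j in range(n):
                if (i + j) % 2 != parity:
                    continue
                neighbours = (
                    spins[(i + 1) % n][j] + spins[(i - 1) % n][j]
                    + spins[i][(j + 1) % n] + spins[i][(j - 1) % n]
                )
                # Energy change if spin (i, j) is flipped.
                delta_e = 2 * spins[i][j] * neighbours
                # Metropolis rule: always accept downhill moves,
                # accept uphill moves with probability exp(-beta * delta_e).
                if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                    spins[i][j] = -spins[i][j]


def magnetization(spins):
    """Mean spin per site."""
    n = len(spins)
    return sum(sum(row) for row in spins) / (n * n)
```

A typical run would create a lattice such as `[[1] * 16 for _ in range(16)]`, call `metropolis_sweep` repeatedly with a seeded `random.Random`, and record `magnetization` after each sweep.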

Molecular dynamics on NVIDIA GPUs with speed-ups up to two orders of magnitude

April 13th, 2009

ACEMD is a production-class bio-molecular dynamics (MD) simulation program designed specifically for GPUs. It achieves supercomputing-scale performance of 40 nanoseconds/day for all-atom protein systems with over 23,000 atoms. With GPU technology it has become possible to run a microsecond-long trajectory for an all-atom molecular system in explicit water on a single workstation equipped with just 3 GPUs. This performance would have required over 100 CPU cores. Visit the project website for details.

(M. J. Harvey, G. Giupponi, G. De Fabritiis, ACEMD: Accelerating bio-molecular dynamics in the microsecond time-scale. Link to preprint.)

Path to Petascale: Adapting GEO/CHEM/ASTRO Applications for Accelerators and Accelerator Clusters

April 13th, 2009

The workshop “Path to PetaScale: Adapting GEO/CHEM/ASTRO Applications for Accelerators and Accelerator Clusters” was held at the National Center for Supercomputing Applications (NCSA), University of Illinois Urbana-Champaign, on April 2-3, 2009. This workshop, sponsored by NSF and NCSA, helped computational scientists in the geosciences, computational chemistry, and astronomy and astrophysics communities take full advantage of emerging high-performance computing accelerators such as GPUs and Cell processors. The workshop consisted of joint technology sessions during the first day and domain-specific sessions on the second day. Slides from the presentations are now online.

Second SHARCNET Symposium on GPU and Cell Computing

April 13th, 2009

University of Waterloo, Waterloo, Ontario, Canada
May 20th, 2009

This one-day symposium will explore the use of GPUs and Cell processors for accelerating scientific and high performance computing. The symposium program includes invited keynote presentations on large-scale fluid dynamics simulations using the Roadrunner supercomputer and acceleration of biomolecular modeling applications with GPU computing, as well as vendor research presentations from IBM, NVIDIA and RapidMind. Researchers working with these architectures are invited to contribute presentations and posters.

For further information and to register please visit the event website.

Efficient Sparse Matrix-Vector Multiplication on CUDA

April 13th, 2009

Abstract from an NVIDIA Technical Report by Nathan Bell and Michael Garland:

The massive parallelism of graphics processing units (GPUs) offers tremendous performance in many high-performance computing applications. While dense linear algebra readily maps to such platforms, harnessing this potential for sparse matrix computations presents additional challenges. Given its role in iterative methods for solving sparse linear systems and eigenvalue problems, sparse matrix-vector multiplication (SpMV) is of singular importance in sparse linear algebra.

In this paper we discuss data structures and algorithms for SpMV that are efficiently implemented on the CUDA platform for the fine-grained parallel architecture of the GPU. Given the memory-bound nature of SpMV, we emphasize memory bandwidth efficiency and compact storage formats. We consider a broad spectrum of sparse matrices, from those that are well-structured and regular to highly irregular matrices with large imbalances in the distribution of nonzeros per matrix row. We develop methods to exploit several common forms of matrix structure while offering alternatives which accommodate greater irregularity.

On structured, grid-based matrices we achieve performance of 36 GFLOP/s in single precision and 16 GFLOP/s in double precision on a GeForce GTX 280 GPU. For unstructured finite-element matrices, we observe performance in excess of 15 GFLOP/s and 10 GFLOP/s in single and double precision respectively. These results compare favorably to prior state-of-the-art studies of SpMV methods on conventional multicore processors. Our double precision SpMV performance is generally two and a half times that of a Cell BE with 8 SPEs and more than ten times greater than that of a quad-core Intel Clovertown system.

(Nathan Bell and Michael Garland. “Efficient Sparse Matrix-Vector Multiplication on CUDA”. NVIDIA Technical Report NVR-2008-004, December 2008.)
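
As a rough illustration of what a compact storage format for SpMV looks like, the sketch below uses the widely known compressed sparse row (CSR) layout in plain serial Python. It is not the report's CUDA kernel, and the abstract above does not say which formats the report covers, so the choice of CSR and the function names are assumptions. Only the nonzeros and their column indices are stored, so each row's dot product touches just the entries it needs.

```python
def csr_from_dense(dense):
    """Convert a dense list-of-lists matrix to CSR arrays.

    Returns (row_ptr, col_idx, values), where row r owns the entries
    values[row_ptr[r]:row_ptr[r + 1]].
    """
    row_ptr, col_idx, values = [0], [], []
    for row in dense:
        for col, entry in enumerate(row):
            if entry != 0:
                col_idx.append(col)
                values.append(entry)
        row_ptr.append(len(values))
    return row_ptr, col_idx, values


def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        # Walk only the stored nonzeros of this row.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# Small usage example with a 3x3 matrix that has four nonzeros.
A = [[4, 0, 0],
     [0, 0, 2],
     [1, 3, 0]]
row_ptr, col_idx, values = csr_from_dense(A)
y = spmv_csr(row_ptr, col_idx, values, [1.0, 2.0, 3.0])
```

Each row is independent of the others in this loop, which is what makes SpMV a natural fit for parallel hardware. Rows with very different nonzero counts then lead to uneven work, which is the irregularity the abstract refers to.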

NVIDIA GPU Computing Tutorial Webinar Series

April 8th, 2009

This series of free web seminars (“webinars”), starting April 15th, 2009, will cover the basics of data-parallel computing on GPUs using NVIDIA’s CUDA architecture. Tutorials will be presented by the NVIDIA Developer Technology team and will cover many topics, including C for CUDA, programming with the OpenCL API, using DirectX Compute, and performance optimization techniques.

Webinar topics, schedules and registration information will be updated regularly. Pre-registration is required. Please follow the links provided (after clicking “read the rest of this entry”), and registration details will be emailed back upon successful registration. Read the rest of this entry »

The New

April 6th, 2009

We’re very happy today to announce the new site! We’ve been spending many evenings and weekends building an all-new infrastructure for the website based on WordPress. This powerful platform will provide a much more stable, secure, and flexible experience for both visitors and editors. More importantly, we have completely rewritten the developer pages, with new information on the latest GPU Computing languages, such as NVIDIA CUDA and ATI Stream. We owe a debt of gratitude to Dominik Göddeke for organizing and editing the new developer pages, as well as for lots of help with testing.

Other new features include a news submission form, reader comments, a much better site search, and a cleaner, more modern design. We’ve also simplified the news categories and added tags to provide additional metadata for news posts. Everything you are used to is still here, including our popular forums. We hope the result is a more positive experience for everybody. If you find problems with the new site, such as broken links or improper formatting in your browser, please report them using the news submission form. (Note that you can still access the old site.)

Applying graphics hardware to achieve extremely fast geometric pattern matching

April 5th, 2009


We present a GPU-based approach to geometric pattern matching. We reduce this problem to finding the depth (maximally covered point) of an arrangement of polytopes in transformation space and describe hardware-assisted (GPU) algorithms which exploit the available set of graphics operations to perform a fast rasterized depth computation.

(Applying graphics hardware to achieve extremely fast geometric pattern matching in two and three dimensional transformation space. Dror Aiger and Klara Kedem. Information Processing Letters. 2008.)
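
To give a feel for what a rasterized depth computation means, here is a small serial sketch. It is not the authors' algorithm. It assumes axis-aligned rectangles on an integer pixel grid in place of general polytopes, and the function name and the half-open `(x0, y0, x1, y1)` rectangle convention are hypothetical. Each shape adds 1 to every pixel it covers, much as additive blending would on graphics hardware, and the depth is the largest count found.

```python
def rasterized_depth(rects, width, height):
    """Return (depth, (x, y)) for the most heavily covered pixel.

    rects holds half-open integer rectangles (x0, y0, x1, y1) that
    cover pixels x0 <= x < x1 and y0 <= y < y1. Parts outside the
    width-by-height grid are clipped away.
    """
    counts = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in rects:
        # "Draw" the rectangle by incrementing every covered pixel.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                counts[y][x] += 1
    depth = max(max(row) for row in counts)
    # Report the first pixel, in row-major order, that reaches the depth.
    for y in range(height):
        for x in range(width):
            if counts[y][x] == depth:
                return depth, (x, y)


# Three overlapping rectangles on an 8x8 grid.
shapes = [(0, 0, 4, 4), (2, 2, 6, 6), (3, 3, 5, 5)]
depth, pixel = rasterized_depth(shapes, 8, 8)
```

The answer is only as precise as the grid resolution, which is the usual trade-off of a rasterized approach compared with an exact geometric one.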

Equalizer BOF at Eurographics next week

March 31st, 2009

Equalizer Graphics will be holding an Equalizer Birds-of-a-Feather meeting today during Eurographics ’09.

Place: Eurographics 2009, TU Munich
Date: Tuesday, March 31, 15:00-16:30
Room: MI 02.13.010

Co-located with EG is the Eurographics Symposium on Parallel Graphics and Visualization, so there is yet another good reason to attend.


  • 15:00-15:20 Equalizer: Past, Present and Future, Stefan Eilemann, Eyescale Software GmbH
  • 15:20-15:40 Virtual Architecture with Equalizer and OpenSceneGraph, Julia Sigmund, University of Siegen
  • 15:40-16:00 Performance Optimizations for Image Compositing, Renato Pajarola, University of Zurich
  • 16:00-16:30 Questions and Answers, Open Discussion

gDEBugger for Apple Mac OS X launched at GDC 2009

March 31st, 2009

Graphic Remedy launched the first official version of gDEBugger Mac at this year’s Game Developers Conference, held in San Francisco, 23-27 March. On Tuesday March 24, gDEBugger Mac was demonstrated in the Khronos Developer University full-day tutorial area. A fully functional trial version of gDEBugger Mac is now available for download.

gDEBugger is an OpenGL debugger and profiler. It traces application activity on top of the OpenGL API and lets programmers see what is happening within the graphics system implementation, so they can find bugs and optimize OpenGL application performance.

gDEBugger Mac brings all of gDEBugger’s Debugging and Profiling abilities to the Mac OS X OpenGL developer’s world. gDEBugger now runs on Windows, Mac OS X and Linux operating systems.
