Swarm-NG: integration of an ensemble of N-body systems

July 29th, 2010

The Swarm-NG package helps scientists and engineers harness the power of GPUs. In the early releases, Swarm-NG will focus on the integration of an ensemble of N-body systems evolving under Newtonian gravity. Swarm-NG does not replicate existing libraries that calculate forces for large-N systems on GPUs, but rather focuses on integrating an ensemble of many systems where N is small. This is of particular interest for astronomers who study the chaotic evolution of planetary systems. In the long term, we hope Swarm-NG will allow for the efficient parallel integration of user-defined systems of ordinary differential equations.
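
To make the "many small systems" idea concrete, here is a minimal CPU-only sketch in plain Python. It is not Swarm-NG code and does not use its API: the function names, the dictionary layout, the choice of a kick-drift-kick leapfrog scheme, and the units with G = 1 are all assumptions made for illustration. Each system in the ensemble is integrated independently of the others, and that independent outer loop is the kind of work a GPU could spread across threads.

```python
import math

def accelerations(pos, mass, G=1.0):
    """Pairwise Newtonian accelerations for one small-N system."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * mass[j] * d[k] * inv_r3
    return acc

def leapfrog_step(system, dt):
    """Advance one system in place by dt (kick-drift-kick leapfrog)."""
    pos, vel, mass = system["pos"], system["vel"], system["mass"]
    acc = accelerations(pos, mass)
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += 0.5 * dt * acc[i][k]  # half kick
            pos[i][k] += dt * vel[i][k]        # full drift
    acc = accelerations(pos, mass)
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += 0.5 * dt * acc[i][k]  # second half kick

def integrate_ensemble(ensemble, dt, n_steps):
    """Integrate every system; systems never interact with each other."""
    for system in ensemble:  # independent work items
        for _ in range(n_steps):
            leapfrog_step(system, dt)

def total_energy(system, G=1.0):
    """Kinetic plus potential energy of one system (a conservation check)."""
    pos, vel, mass = system["pos"], system["vel"], system["mass"]
    energy = 0.0
    for i in range(len(pos)):
        energy += 0.5 * mass[i] * sum(v * v for v in vel[i])
        for j in range(i + 1, len(pos)):
            energy -= G * mass[i] * mass[j] / math.dist(pos[i], pos[j])
    return energy

def make_star_planet(planet_speed):
    """Hypothetical test system: unit-mass star plus one light planet."""
    return {
        "pos": [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
        "vel": [[0.0, 0.0, 0.0], [0.0, planet_speed, 0.0]],
        "mass": [1.0, 1.0e-3],
    }
```

For example, `ensemble = [make_star_planet(0.9 + 0.05 * i) for i in range(5)]` followed by `integrate_ensemble(ensemble, dt=0.01, n_steps=500)` evolves five slightly different star-planet systems; comparing `total_energy` before and after gives a quick sanity check on the step size.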

HOOMD-blue 0.9.0 released

May 20th, 2010

HOOMD-blue stands for Highly Optimized Object-oriented Many-particle Dynamics — Blue Edition. It performs general-purpose particle dynamics simulations on a single workstation, taking advantage of  NVIDIA GPUs to attain a level of performance equivalent to dozens of processor cores on a fast cluster.

HOOMD-blue 0.9.0 is a major new release. Highlights include:

  • Support for Fermi generation GPUs
  • Performance enhancements
  • New pair potentials
  • Particle data is now accessible from hoomd scripts
  • Binary format dump files for simulation restarts
  • Numerous small enhancements to enable easily restartable jobs
  • 2D simulations are now possible
  • Integration methods can now be applied to specified groups of particles
  • All IMD commands issued by VMD are now understood
  • and more

HOOMD-blue 0.9.0 is available for download under an open source license.

CLyther 0.1 Beta Released

April 25th, 2010

GeoSpin has released the first version of CLyther for beta testing. Please visit the CLyther SourceForge website for more information. CLyther enables developers to write GPGPU code entirely in Python with no additional syntax. CLyther’s core driver contains a Python compiler that converts Python functions and types to OpenCL at runtime.

CLyther currently only supports a subset of the Python language definition but adds many new features to OpenCL such as:

  • OpenCL interface similar to PyOpenCL
  • Dynamic compilation of OpenCL code at runtime
  • Fast prototyping of OpenCL code
  • Create OpenCL code using the Python language definition
  • Passing functions as arguments to OpenCL kernels
  • Pure Python emulation mode of kernel functions

Thrust v1.2 Released

March 23rd, 2010

Version 1.2 of Thrust, an open-source template library for developing CUDA applications, has been released. Modeled after the C++ Standard Template Library (STL), Thrust brings a familiar abstraction layer to the realm of GPU computing. This version adds several new features.

The Thrust web page provides a quick-start guide, online documentation, many examples and introductory slides. Thrust is open-source software distributed under the OSI-approved Apache License v2.0.

Yellow Dog Enterprise Linux for CUDA

March 9th, 2010

Yellow Dog Enterprise Linux for CUDA (YDEL for CUDA) is an open-source Linux operating system built for faster, easier, and more reliable GPU computing. YDEL for CUDA, released and supported by Fixstars, goes beyond the basic Linux OS and integrates support for GPUs, NVIDIA CUDA, and GPU development tools.

From the YDEL for CUDA website:

Key benefits of Yellow Dog Enterprise Linux for CUDA:

  • YDEL for CUDA users can experience up to a 9% performance improvement in some applications.
  • Comprehensive support is offered to paid subscribers, with our skilled team able to assist you with both Linux and CUDA.
  • YDEL’s unparalleled integration means everything you need to write and run CUDA applications is included and configured.
  • YDEL includes multiple versions of CUDA and can easily switch between them via a setting in a configuration file or an environment variable.
  • Never worry about updates affecting your system: Fixstars offers YDEL users greater reliability with our strenuous test procedures that validate GPU computing functionality and performance.

For more information, visit the YDEL for CUDA website.

CLyther = Python + OpenCL

March 9th, 2010

CLyther is a Python tool for OpenCL, currently under development, that plays a role similar to Cython for C. It is a Python language extension intended to make writing OpenCL code as easy as writing Python itself. CLyther currently supports only a subset of the Python language definition but adds many new features for OpenCL.

CLyther exposes both the OpenCL C library and language to Python. Its features include:

  • Fast prototyping of OpenCL code.
  • OpenCL kernel function creation using the Python language definition.
  • Strong object-oriented programming in OpenCL code.
  • Passing functions as arguments to kernel functions.
  • Python emulation mode for OpenCL code.
  • Fancy indexing of arrays.
  • Dynamic compilation at runtime.

Easy GPU programming with GMAC

March 1st, 2010

GMAC (Global Memory for ACcelerators) is a user-level library that implements an Asymmetric Distributed Shared Memory (ADSM) model for use by CUDA programs. An ADSM model allows CPU code to access data hosted in accelerator (GPU) memory. In this model, a single pointer is used for data structures accessed on both the CPU and the GPU, and the library transparently handles the coherency of the data. Moreover, data allocated with GMAC can be accessed by all the host threads of the program, which makes your code simpler and cleaner. GMAC currently supports programs written in CUDA, but OpenCL support is planned.

A paper describing the Asymmetric Distributed Shared Memory model and its implementation in GMAC has been accepted at the ASPLOS XV conference. GMAC is being developed by the Operating System Group at the Universitat Politecnica de Catalunya and the IMPACT Research Group at the University of Illinois. Precompiled binary packages, the source code, documentation and examples are available at the project website.

(Isaac Gelado, Javier Cabezas, John Stone, Sanjay Patel, Nacho Navarro and Wen-mei Hwu, “An Asymmetric Distributed Shared Memory Model for Heterogeneous Parallel Systems”, accepted in: Fifteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2010), March 2010.)

WaveTomography v1.0: 2D waveform tomography reconstruction

February 21st, 2010

WaveTomography is a 2D time-domain waveform tomography reconstruction algorithm that can be run on graphics processing units. It features:

  • Wave propagation using leapfrog and ONADM schemes.
  • First order absorbing boundary conditions.
  • CPU only and CPU/GPU implementations.
  • Flexible reconstruction strategy (choice of emitters and receivers at each iteration).
  • Flexible imaging setup (choice of transducers’ positions).
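
As a rough illustration of the first two features, the sketch below propagates a pulse with a leapfrog (second-order central difference) scheme and a first-order absorbing boundary. It is not taken from WaveTomography: it is reduced to one dimension for brevity, it uses Mur's first-order discretisation as one common way to realise a first-order absorbing condition, and the function name, Gaussian starting pulse and parameters are assumptions.

```python
import math

def leapfrog_wave_1d(speed, dx, dt, n_steps, source_index, pulse_width=4.0):
    """Propagate a Gaussian pulse through a 1D medium.

    speed is a list of sound speeds, one per grid point (assumed layout).
    Returns the field after n_steps time steps.
    """
    n = len(speed)
    # Initial condition: Gaussian pulse at rest, so the previous and
    # current time levels start out identical.
    curr = [
        math.exp(-0.5 * ((i - source_index) / pulse_width) ** 2)
        for i in range(n)
    ]
    prev = curr[:]

    for _ in range(n_steps):
        nxt = [0.0] * n
        # Interior points: leapfrog update of u_tt = c^2 u_xx.
        for i in range(1, n - 1):
            r2 = (speed[i] * dt / dx) ** 2
            nxt[i] = (2.0 * curr[i] - prev[i]
                      + r2 * (curr[i + 1] - 2.0 * curr[i] + curr[i - 1]))
        # First-order absorbing boundaries (Mur): outgoing waves leave
        # the grid instead of reflecting back into it.
        k_left = (speed[0] * dt - dx) / (speed[0] * dt + dx)
        nxt[0] = curr[1] + k_left * (nxt[1] - curr[0])
        k_right = (speed[-1] * dt - dx) / (speed[-1] * dt + dx)
        nxt[-1] = curr[-2] + k_right * (nxt[-2] - curr[-1])
        prev, curr = curr, nxt
    return curr
```

The time step has to respect the usual stability limit, `max(speed) * dt / dx <= 1`. With a uniform medium, for instance `leapfrog_wave_1d([1.0] * 200, dx=1.0, dt=0.5, n_steps=800, source_index=100)`, the pulse splits in two, travels to both ends and leaves the grid with very little of it reflected back.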

The WaveTomography package also includes a standalone simulator for wave propagation. The source code can be freely downloaded.

(Roy, O., Jovanovic, I., Hormati, A., Parhizkar, R., and Vetterli, M., “Sound speed estimation using wave-based ultrasound tomography: Theory and GPU implementation”, in Proc. SPIE Medical Imaging, 2010.)

OpenNL 3.0: CUDA sparse linear solvers

February 14th, 2010

OpenNL (Open Numerical Library) is a library for solving sparse linear systems, designed especially for the Computer Graphics community. The goal of OpenNL is to be as small as possible while offering the subset of functionality required by this application field. The Makefiles of OpenNL can generate a single .c file and a single .h file, which makes the library very easy to integrate into other projects. The distribution includes an implementation of a Least Squares Conformal Maps parameterization method. The new version 3.0 of OpenNL includes support for CUDA (with Concurrent Number Cruncher and CUSP ELL formats).
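
For readers unfamiliar with the ELL (ELLPACK) layout mentioned above, the following plain-Python sketch shows the idea: every row is padded to the same number of stored entries, which gives the regular memory layout that suits GPUs. This is not OpenNL or CUSP code. The function names are hypothetical, and conjugate gradient appears only as a familiar iterative solver for symmetric positive-definite systems, not as a statement about which solvers OpenNL provides.

```python
def to_ell(dense):
    """Convert a dense matrix (list of rows) to ELL arrays.

    Returns (values, columns), where every row is padded to the same
    width with zero values pointing at column 0.
    """
    width = max(sum(1 for v in row if v != 0.0) for row in dense)
    values, columns = [], []
    for row in dense:
        nonzeros = [(j, v) for j, v in enumerate(row) if v != 0.0]
        pad = width - len(nonzeros)
        columns.append([j for j, _ in nonzeros] + [0] * pad)
        values.append([v for _, v in nonzeros] + [0.0] * pad)
    return values, columns

def ell_matvec(values, columns, x):
    """Sparse matrix-vector product y = A x in ELL format."""
    # Padded entries multiply by 0.0, so they do not change the result.
    return [
        sum(v * x[j] for v, j in zip(row_values, row_columns))
        for row_values, row_columns in zip(values, columns)
    ]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def conjugate_gradient(values, columns, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A stored in ELL."""
    x = [0.0] * len(b)
    r = b[:]            # residual b - A x, with x = 0 initially
    p = r[:]            # search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = ell_matvec(values, columns, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

A small usage example: with `values, columns = to_ell([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])`, calling `conjugate_gradient(values, columns, [6.0, 10.0, 8.0])` solves the tridiagonal system using only ELL matrix-vector products.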

CUDAEASY – a GPU Accelerated Cosmological Lattice Program

December 8th, 2009

Abstract:

This paper presents, to the author’s knowledge, the first graphics processing unit (GPU) accelerated program that solves the evolution of interacting scalar fields in an expanding universe. We present the implementation in NVIDIA’s Compute Unified Device Architecture (CUDA) and compare the performance to other similar programs in chaotic inflation models. We report speedups between one and two orders of magnitude depending on the used hardware and software while achieving small errors in single precision. Simulations that used to last roughly one day to compute can now be done in hours and this difference is expected to increase in the future. The program has been written in the spirit of LATTICEEASY and users of the aforementioned program should find it relatively easy to start using CUDAEASY in lattice simulations. The program is available under the GNU General Public License.

The program is freely available at http://www.physics.utu.fi/theory/particlecosmology/cudaeasy/

(Jani Sainio, “CUDAEASY – a GPU Accelerated Cosmological Lattice Program”, submitted to Computer Physics Communications (under review), November 2009.)
