Yellow Dog Enterprise Linux for CUDA

March 9th, 2010

Yellow Dog Enterprise Linux for CUDA (YDEL for CUDA) is an open-source Linux operating system built for faster, easier, and more reliable GPU computing. YDEL for CUDA, released and supported by Fixstars, goes beyond the basic Linux OS by integrating support for GPUs, NVIDIA CUDA, and GPU development tools.

From the YDEL for CUDA website:

Key benefits of Yellow Dog Enterprise Linux for CUDA:

  • YDEL for CUDA users can experience up to a 9% performance improvement in some applications.
  • Comprehensive support is offered to paid subscribers, with our skilled team able to assist you with both Linux and CUDA.
  • YDEL’s unparalleled integration means everything you need to write and run CUDA applications is included and configured.
  • YDEL includes multiple versions of CUDA, and you can easily switch between them via a setting in a configuration file or an environment variable.
  • Never worry about updates affecting your system: Fixstars offers YDEL users greater reliability with our strenuous test procedures that validate GPU computing functionality and performance.

For more information, visit the YDEL for CUDA website.

CLyther = Python + OpenCL

March 9th, 2010

CLyther is a Python tool for OpenCL, currently under development, that is similar to Cython for C. It is a Python language extension intended to make writing OpenCL code as easy as writing Python itself. CLyther currently supports only a subset of the Python language definition but adds many new features for OpenCL.

CLyther exposes both the OpenCL C library and language to Python. Its features include:

  • Fast prototyping of OpenCL code.
  • OpenCL kernel function creation using the Python language definition.
  • Strong object-oriented programming in OpenCL code.
  • Passing functions as arguments to kernel functions.
  • Python emulation mode for OpenCL code.
  • Fancy indexing of arrays.
  • Dynamic compilation at runtime.


Easy GPU programming with GMAC

March 1st, 2010

GMAC (Global Memory for ACcelerators) is a user-level library that implements an Asymmetric Distributed Shared Memory (ADSM) model for use by CUDA programs. An ADSM model allows CPU code to access data hosted in accelerator (GPU) memory. In this model, a single pointer is used for data structures accessed on both the CPU and the GPU, and the coherency of the data is handled transparently by the library. Moreover, data allocated with GMAC can be accessed by all the host threads of the program, which makes your code simpler and cleaner. GMAC currently supports programs written in CUDA, but OpenCL support is planned.
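The single-handle idea can be illustrated with a toy model in plain Python. This is only a sketch of the coherence bookkeeping under simplifying assumptions, not GMAC's API: the `SharedBuffer` class, its `cpu_view` and `launch` methods, and the `scale_by_two` "kernel" are hypothetical names, and the "device" copy is an ordinary Python list standing in for GPU memory.

```python
class SharedBuffer:
    """Toy model of one ADSM-style allocation: one handle, two physical copies."""

    def __init__(self, size):
        self.host = [0.0] * size      # copy in CPU memory
        self.device = [0.0] * size    # stand-in for the copy in GPU memory
        self.owner = "host"           # which copy holds the latest data
        self.transfers = 0            # number of implicit copies performed

    def _acquire(self, side):
        # Move the data only when the other side holds the latest version.
        if self.owner != side:
            if side == "device":
                self.device[:] = self.host
            else:
                self.host[:] = self.device
            self.owner = side
            self.transfers += 1

    def cpu_view(self):
        # CPU access: make the host copy current, then hand it out.
        self._acquire("host")
        return self.host

    def launch(self, kernel):
        # "Kernel launch": the accelerator works on its own copy.
        self._acquire("device")
        kernel(self.device)


def scale_by_two(data):
    # Hypothetical kernel body, executed sequentially here.
    for i in range(len(data)):
        data[i] *= 2.0


buf = SharedBuffer(4)
cpu = buf.cpu_view()
for i in range(4):
    cpu[i] = float(i)            # CPU writes through the shared handle
buf.launch(scale_by_two)         # implicit host -> device copy, then the "kernel"
result = list(buf.cpu_view())    # implicit device -> host copy on CPU access
print(result, buf.transfers)
```

The caller never issues an explicit copy; the two transfers happen as a side effect of who touches the buffer next, which is the property the ADSM description above emphasizes.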

A paper describing the Asymmetric Distributed Shared Memory model and its implementation in GMAC has been accepted at the ASPLOS XV conference. GMAC is being developed by the Operating System Group at the Universitat Politecnica de Catalunya and the IMPACT Research Group at the University of Illinois. Pre-compiled binary packages, source code, documentation, and examples are available at the project website.

(Isaac Gelado, Javier Cabezas, John Stone, Sanjay Patel, Nacho Navarro and Wen-mei Hwu, “An Asymmetric Distributed Shared Memory Model for Heterogeneous Parallel Systems”, accepted in: Fifteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2010), March 2010.)

WaveTomography v1.0: 2D waveform tomography reconstruction

February 21st, 2010

WaveTomography is a 2D time-domain waveform tomography reconstruction algorithm that can be run on graphics processing units. It features:

  • Wave propagation using leapfrog and ONADM schemes.
  • First order absorbing boundary conditions.
  • CPU only and CPU/GPU implementations.
  • Flexible reconstruction strategy (choice of emitters and receivers at each iteration).
  • Flexible imaging setup (choice of transducers’ positions).
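To give a feel for the first two features, here is a minimal 1D sketch of a leapfrog wave update combined with a first-order absorbing boundary. It is not taken from the WaveTomography source: the package works in 2D, and the choice of a Mur-type boundary condition, the function name `simulate`, and all parameter values are assumptions made for illustration.

```python
import math


def simulate(n=200, steps=600, courant=0.5, sigma=8.0):
    """Propagate a Gaussian pulse on a 1D grid; return the final field."""
    c2 = courant * courant
    mur = (courant - 1.0) / (courant + 1.0)   # first-order Mur coefficient
    mid = n // 2
    u = [math.exp(-((i - mid) / sigma) ** 2) for i in range(n)]
    # Second-order start for zero initial velocity: u(-dt) equals u(+dt).
    u_prev = u[:]
    for i in range(1, n - 1):
        u_prev[i] = u[i] + 0.5 * c2 * (u[i + 1] - 2.0 * u[i] + u[i - 1])

    for _ in range(steps):
        u_next = [0.0] * n
        # Leapfrog update (central differences in time and space).
        for i in range(1, n - 1):
            u_next[i] = (2.0 * u[i] - u_prev[i]
                         + c2 * (u[i + 1] - 2.0 * u[i] + u[i - 1]))
        # Absorbing boundaries: let outgoing waves leave the grid.
        u_next[0] = u[1] + mur * (u_next[1] - u[0])
        u_next[-1] = u[-2] + mur * (u_next[-2] - u[-1])
        u_prev, u = u, u_next
    return u


initial_peak = 1.0
final = simulate()
residual = max(abs(v) for v in final)
print(residual)
```

The initial pulse splits into two halves that travel outward; with the absorbing boundaries in place, only a small fraction of the initial amplitude should remain on the grid once both halves have left.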

The WaveTomography package also includes a standalone simulator for wave propagation. The source code can be freely downloaded.

(Roy, O., Jovanovic, I., Hormati, A., Parhizkar, R., and Vetterli, M., “Sound speed estimation using wave-based ultrasound tomography: Theory and GPU implementation”, in Proc. SPIE Medical Imaging, 2010.)

OpenNL 3.0: CUDA sparse linear solvers

February 14th, 2010

OpenNL (Open Numerical Library) is a library for solving sparse linear systems, designed especially for the Computer Graphics community. The goal of OpenNL is to be as small as possible while offering the subset of functionality required by this application field. The Makefiles of OpenNL can generate a single .c file and a single .h file, which makes the library very easy to integrate into other projects. The distribution includes an implementation of a Least Squares Conformal Maps parameterization method. The new version 3.0 of OpenNL includes support for CUDA (with Concurrent Number Cruncher and CUSP ELL formats).
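For readers unfamiliar with the ELL (ELLPACK) layout mentioned above, the following sketch shows the general idea in plain Python: every row is padded to the same number of stored entries, so each row performs an identical amount of work, which suits GPU threads. The helper names `to_ell` and `ell_matvec` are hypothetical and do not reflect the data structures used by OpenNL or CUSP.

```python
def to_ell(dense):
    """Convert a dense row-major matrix (list of lists) to ELL arrays."""
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    width = max(len(r) for r in rows)          # max nonzeros in any row
    # Pad short rows with column 0 and value 0.0 so all rows have equal length.
    cols = [[j for j, _ in r] + [0] * (width - len(r)) for r in rows]
    vals = [[v for _, v in r] + [0.0] * (width - len(r)) for r in rows]
    return cols, vals


def ell_matvec(cols, vals, x):
    """y = A x; every row does the same number of multiply-adds."""
    return [sum(v * x[j] for j, v in zip(c, r)) for c, r in zip(cols, vals)]


A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
cols, vals = to_ell(A)
y = ell_matvec(cols, vals, [1.0, 2.0, 3.0])
print(y)  # [2.0, 4.0, 10.0]
```

The padding wastes storage when row lengths vary widely, so the format works best for matrices whose rows hold a similar number of nonzeros.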

CUDAEASY – a GPU Accelerated Cosmological Lattice Program

December 8th, 2009

Abstract:

This paper presents, to the author’s knowledge, the first graphics processing unit (GPU) accelerated program that solves the evolution of interacting scalar fields in an expanding universe. We present the implementation in NVIDIA’s Compute Unified Device Architecture (CUDA) and compare the performance to other similar programs in chaotic inflation models. We report speedups between one and two orders of magnitude depending on the used hardware and software while achieving small errors in single precision. Simulations that used to last roughly one day to compute can now be done in hours and this difference is expected to increase in the future. The program has been written in the spirit of LATTICEEASY and users of the aforementioned program should find it relatively easy to start using CUDAEASY in lattice simulations. The program is available under the GNU General Public License.

The program is freely available at http://www.physics.utu.fi/theory/particlecosmology/cudaeasy/

(Jani Sainio. “CUDAEASY – a GPU Accelerated Cosmological Lattice Program”. Submitted to Computer Physics Communications (under review). November 2009.)

HPMC open-source GPU volumetric iso-surface extraction library

November 30th, 2009

HPMC is a small OpenGL/C/C++ library that extracts iso-surfaces of volumetric data directly on the GPU.

The library analyzes a lattice of scalar values describing a scalar field that is either stored in a Texture3D or can be accessed through an application-provided snippet of shader code. The output is a sequence of vertex positions and normals that form a triangulation of the iso-surface. HPMC provides traversal code to be included in an application vertex shader, which allows direct extraction in the vertex shader. Using the OpenGL transform feedback mechanism, the triangulation can be stored directly into a buffer object.
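The library's name and the paper cited for it refer to histogram pyramids, the structure that the traversal code walks. As a rough 1D illustration of that idea (not HPMC's shader code, which operates on textures), the sketch below builds a pyramid of partial sums over per-cell output counts and then maps each output index back to the cell that produces it. The function names and the sample counts are hypothetical.

```python
def build_pyramid(counts):
    """Return levels from base (per-cell counts) to top (total count)."""
    assert len(counts) & (len(counts) - 1) == 0, "length must be a power of two"
    levels = [list(counts)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[i] + prev[i + 1] for i in range(0, len(prev), 2)])
    return levels


def locate(levels, k):
    """Map output index k to (cell index, index within that cell)."""
    idx = 0
    for level in reversed(levels[:-1]):     # descend from just below the top
        idx *= 2
        if k >= level[idx]:                 # skip the left child's outputs
            k -= level[idx]
            idx += 1
    return idx, k


counts = [0, 3, 0, 0, 2, 0, 1, 0]           # outputs produced by each cell
levels = build_pyramid(counts)
total = levels[-1][0]
mapping = [locate(levels, k) for k in range(total)]
print(total, mapping)
```

Each output element can be located independently of the others, which is what makes this kind of traversal a natural fit for one-vertex-per-invocation execution.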

(C. Dyken, G. Ziegler, C. Theobalt, H.-P. Seidel, High-speed Marching Cubes using Histogram Pyramids, Computer Graphics Forum 27 (8), 2008.)

Mersenne Twister for Graphic Processors (MTGP)

November 30th, 2009

MTGP is a new variant of the Mersenne Twister (MT) pseudorandom number generator, introduced by Mutsuo Saito and Makoto Matsumoto in 2009. MTGP is designed to take advantage of some features of GPUs, such as parallel execution and high-speed constant reference. It can output 32-bit and 64-bit integers, as well as single- and double-precision floating-point numbers.

MTGP v1.0 is available now.

OpenMM 1.0 beta Release

November 23rd, 2009

The 1.0 Beta version of OpenMM has just been released. OpenMM is a freely downloadable, high-performance, extensible library that allows molecular dynamics (MD) simulations to run on high-performance computer architectures, such as graphics processing units (GPUs). It currently supports NVIDIA GPUs and provides preliminary support for the new cross-platform parallel programming standard OpenCL, which will enable it to be used on ATI GPUs.

The new release includes support for Particle Mesh Ewald and custom non-bonded interactions. In conjunction with this release, a new version of the code needed for accelerating the GROMACS molecular dynamics software using OpenMM is also available.

OpenMM is a collaborative project between Vijay Pande’s lab at Stanford University and Simbios, the National Center for Physics-based Simulation of Biological Structures at Stanford, which is supported by the National Institutes of Health. For more information on OpenMM, visit http://simtk.org/home/openmm.

Monte Carlo Simulation of Photon Migration in 3D Turbid Media Accelerated by Graphics Processing Units

November 23rd, 2009

Abstract:

We report a parallel Monte Carlo algorithm accelerated by graphics processing units (GPU) for modeling time-resolved photon migration in arbitrary 3D turbid media. By taking advantage of the massively parallel threads and low-memory latency, this algorithm allows many photons to be simulated simultaneously in a GPU. To further improve the computational efficiency, we explored two parallel random number generators (RNG), including a floating-point-only RNG based on a chaotic lattice. An efficient scheme for boundary reflection was implemented, along with the functions for time-resolved imaging. For a homogeneous semi-infinite medium, good agreement was observed between the simulation output and the analytical solution from the diffusion theory. The code was implemented with CUDA programming language, and benchmarked under various parameters, such as thread number, selection of RNG and memory access pattern. With a low-cost graphics card, this algorithm has demonstrated an acceleration ratio above 300 when using 1792 parallel threads over conventional CPU computation. The acceleration ratio drops to 75 when using atomic operations. These results render the GPU-based Monte Carlo simulation a practical solution for data analysis in a wide range of diffuse optical imaging applications, such as human brain or small-animal imaging.

(Qianqian Fang and David A. Boas, “Monte Carlo Simulation of Photon Migration in 3D Turbid Media Accelerated by Graphics Processing Units,” Opt. Express, vol. 17, issue 22, pp. 20178-20190 (2009), doi:10.1364/OE.17.020178.)

The free software Monte Carlo eXtreme (MCX) is also available at http://mcx.sourceforge.net.
