Swarm-NG: integration of an ensemble of N-body systems

July 29th, 2010

The Swarm-NG package helps scientists and engineers harness the power of GPUs. In the early releases, Swarm-NG will focus on the integration of an ensemble of N-body systems evolving under Newtonian gravity. Swarm-NG does not replicate existing libraries that calculate forces for large-N systems on GPUs, but rather focuses on integrating an ensemble of many systems where N is small. This is of particular interest for astronomers who study the chaotic evolution of planetary systems. In the long term, we hope Swarm-NG will allow for the efficient parallel integration of user-defined systems of ordinary differential equations.
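As a rough illustration of what "an ensemble of many systems where N is small" means in practice, the Python sketch below advances several independent few-body systems with a kick-drift-kick leapfrog step. It is not Swarm-NG code: the function names (`accelerations`, `leapfrog_step`, `integrate_ensemble`, `make_two_body`, `total_energy`), the code units with G = 1, and the choice of integrator are all assumptions made for this example. The outer loop over systems in `integrate_ensemble` is the part a GPU implementation would spread across parallel threads, since the systems never interact.

```python
import math

G = 1.0  # gravitational constant in assumed code units


def accelerations(pos, mass):
    """Direct-sum Newtonian accelerations for one small system."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = d[0] * d[0] + d[1] * d[1] + d[2] * d[2]
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * mass[j] * d[k] * inv_r3
    return acc


def leapfrog_step(pos, vel, mass, dt):
    """Advance one system in place by one kick-drift-kick step."""
    acc = accelerations(pos, mass)
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += 0.5 * dt * acc[i][k]  # first half kick
            pos[i][k] += dt * vel[i][k]        # full drift
    acc = accelerations(pos, mass)
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += 0.5 * dt * acc[i][k]  # second half kick


def integrate_ensemble(systems, dt, n_steps):
    """Integrate every (pos, vel, mass) system independently."""
    for pos, vel, mass in systems:  # the loop a GPU would parallelize
        for _ in range(n_steps):
            leapfrog_step(pos, vel, mass, dt)


def total_energy(pos, vel, mass):
    """Kinetic plus potential energy, used here only as a sanity check."""
    n = len(pos)
    kinetic = sum(0.5 * mass[i] * sum(v * v for v in vel[i]) for i in range(n))
    potential = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r = math.sqrt(sum((pos[j][k] - pos[i][k]) ** 2 for k in range(3)))
            potential -= G * mass[i] * mass[j] / r
    return kinetic + potential


def make_two_body(separation):
    """Hypothetical test system: a star and a light planet on a circular orbit."""
    m_star, m_planet = 1.0, 1e-3
    speed = math.sqrt(G * (m_star + m_planet) / separation)
    pos = [[0.0, 0.0, 0.0], [separation, 0.0, 0.0]]
    vel = [[0.0, 0.0, 0.0], [0.0, speed, 0.0]]
    return pos, vel, [m_star, m_planet]


if __name__ == "__main__":
    ensemble = [make_two_body(a) for a in (1.0, 1.5, 2.0)]
    before = [total_energy(*s) for s in ensemble]
    integrate_ensemble(ensemble, dt=0.01, n_steps=200)
    after = [total_energy(*s) for s in ensemble]
    for e0, e1 in zip(before, after):
        print("relative energy drift:", abs(e1 - e0) / abs(e0))
```

Because each system carries its own positions, velocities, and masses, no synchronization between systems is needed, which is what makes this workload a natural fit for data-parallel hardware.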

CfP: Game Engine Gems 2

April 5th, 2010

After a very successful launch of the first volume of the Game Engine Gems series at GDC 2010, Jones and Bartlett Publishers is now accepting proposals for the second volume. The paper submission period for Game Engine Gems 2 is now open through June 15, 2010. To submit a proposal, please visit the official website.

As with the first volume, the theme of the book includes everything having to do with game engine design and implementation. Specific topics of interest include rendering techniques, shaders, OpenGL / DirectX, physics / collision detection, mathematics, programming techniques, engine architecture, visibility determination, audio, user interface, input devices, memory management, artificial intelligence, resource organization, and cross-platform considerations. This list is not exhaustive, and the editors are happy to evaluate any idea that pertains to making game engines.

Direct N-body Kernels for Multicore Platforms

January 24th, 2010

From the abstract:

We present an inter-architectural comparison of single- and double-precision direct n-body implementations on modern multicore platforms, including those based on the Intel Nehalem and AMD Barcelona systems, the Sony-Toshiba-IBM PowerXCell/8i processor, and NVIDIA Tesla C870 and C1060 GPU systems. We compare our implementations across platforms on a variety of proxy measures, including performance, coding complexity, and energy efficiency.

Nitin Arora, Aashay Shringarpure, and Richard Vuduc. “Direct n-body kernels for multicore platforms.” In Proc. Int’l. Conf. Parallel Processing (ICPP), Vienna, Austria, September 2009 (direct link to PDF).
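To give a feel for the single- versus double-precision distinction the abstract mentions, here is a small Python sketch of a direct-sum acceleration evaluated both ways. It is not the authors' kernel: the helper names (`f32`, `accel_on_first`, `relative_difference`), the softening term `eps2`, the random particle layout, and the equal masses are all assumptions made for this example. Single precision is emulated by rounding every intermediate result to an IEEE 32-bit float with the standard `struct` module.

```python
import math
import random
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accel_on_first(pos, mass, eps2, rnd):
    """Direct-sum acceleration on particle 0 (G = 1, assumed units).

    rnd is applied to every intermediate result: pass float to keep
    double precision, or f32 to emulate single precision.
    """
    ax = ay = az = 0.0
    xi, yi, zi = (rnd(c) for c in pos[0])
    for (x, y, z), m in zip(pos[1:], mass[1:]):
        dx = rnd(rnd(x) - xi)
        dy = rnd(rnd(y) - yi)
        dz = rnd(rnd(z) - zi)
        r2 = rnd(rnd(dx * dx) + rnd(dy * dy))
        r2 = rnd(r2 + rnd(dz * dz))
        r2 = rnd(r2 + rnd(eps2))  # softening (an assumption) avoids division by ~0
        inv_r3 = rnd(1.0 / rnd(r2 * rnd(math.sqrt(r2))))
        s = rnd(rnd(m) * inv_r3)
        ax = rnd(ax + rnd(s * dx))
        ay = rnd(ay + rnd(s * dy))
        az = rnd(az + rnd(s * dz))
    return ax, ay, az


def relative_difference(n_bodies=64, seed=0):
    """Relative gap between the emulated single and double precision results."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(3)] for _ in range(n_bodies)]
    mass = [1.0 / n_bodies] * n_bodies
    a64 = accel_on_first(pos, mass, 1e-2, float)
    a32 = accel_on_first(pos, mass, 1e-2, f32)
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(a64, a32)))
    return diff / math.sqrt(sum(a * a for a in a64))


if __name__ == "__main__":
    print("relative difference, single vs double:", relative_difference())
```

The inner loop is the O(N) work per particle that a direct method repeats for all N particles, so the precision chosen for it affects both accuracy and, on real hardware, throughput.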

DirectCompute fluid simulation, waves and fractals demos

January 3rd, 2010

This web site, maintained by Jan Vlietinck, provides sample programs with full source code written for DirectCompute Shaders. Examples include interactive 3D Navier-Stokes and Laplace wave equation solvers and fractal renderers. The Laplace simulator runs at interactive rates for a 400x400x400 volume, and the Navier-Stokes solver at 200x200x200, including visualization.

OpenCurrent v1.0 released: CUDA-accelerated PDE solver

September 28th, 2009

OpenCurrent is an open source C++ library for solving Partial Differential Equations (PDEs) over regular grids using the CUDA platform from NVIDIA. It breaks down a PDE into three basic objects: “Grids,” “Solvers,” and “Equations.” “Grid” data structures efficiently implement regular 1D, 2D, and 3D arrays in both double and single precision. Grids support operations like computing linear combinations, managing host-device memory transfers, interpolating values at non-grid points, and performing array-wide reductions. “Solvers” use these data structures to calculate terms arising from discretizations of PDEs, such as finite-difference based advection and diffusion schemes, and a multigrid solver for Poisson equations. These computational building blocks can be assembled into complete “Equation” objects that solve time-dependent PDEs. One such Equation solver is an incompressible Navier-Stokes solver that uses a second-order Boussinesq model. This equation solver is fully validated, and has been used to study Rayleigh-Benard convection under a variety of different regimes. Benchmarks show it to perform about 8 times faster than an equivalent Fortran code running on an 8-core Xeon.


SAPPORO: A way to turn your graphics cards into a GRAPE-6

March 11th, 2009

Abstract:

In this paper, the authors present a library, named Sapporo, which closely emulates the GRAPE-6 API. The library is written in CUDA and implements the most common functions that are used in N-body codes supporting GRAPE-6. As a result, such codes will be able to use Sapporo without modification to their source code. The library also supports use of multiple GPUs per host. The authors carried out a series of systematic tests of the performance, accuracy and ability of the library to handle a realistic N-body problem. They found the performance of the library with a single G80/G92 GPU is a factor of two higher than that of GRAPE-6A(BLX) PCI(X)-cards, and the sustained performance with 2x GeForce 9800GX2 cards is on par with a 32-chip GRAPE-6 system (about 800 GFlop/s). The accuracy of the library is comparable to that of GRAPE-6 hardware, and its ability to correctly solve a realistic N-body problem provides an alternative for GRAPE-6 special purpose hardware.

(Evghenii Gaburov, Stefan Harfst and Simon Portegies Zwart, SAPPORO: A way to turn your graphics cards into a GRAPE-6, Submitted to New Astronomy)

Toward efficient GPU-accelerated N-body simulations

January 18th, 2008

Abstract: “N-body algorithms are applicable to a number of common problems in computational physics including gravitation, electrostatics, and fluid dynamics. Fast algorithms (those with better than O(N^2) performance) exist, but have not been successfully implemented on GPU hardware for practical problems. In the present work, we introduce not only best-in-class performance for a multipole-accelerated treecode method, but a series of improvements that support implementation of this solver on highly-data-parallel graphics processing units (GPUs). The greatly reduced computation times suggest that this problem is ideally suited for the current and next generations of single and cluster CPU-GPU architectures. We believe that this is an ideal method for practical computation of large-scale turbulent flows on future supercomputing hardware using parallel vortex particle methods.” (Mark J. Stock and Adrin Gharakhani, “Toward efficient GPU-accelerated N-body simulations,” in 46th AIAA Aerospace Sciences Meeting and Exhibit, AIAA 2008-608, January 2008, Reno, Nevada.)

Acceleration of a 3D Euler Solver Using Commodity Graphics Hardware

January 18th, 2008

Abstract:

The porting of two- and three-dimensional Euler solvers from a conventional CPU implementation to the novel target platform of the Graphics Processing Unit (GPU) is described. The motivation for such an effort is the impressive performance that GPUs offer: typically 10 times more floating point operations per second than a modern CPU, with over 100 processing cores and all at a very modest financial cost. Both codes were found to generate the same results on the GPU as the FORTRAN versions did on the CPU. The 2D solver ran up to 29 times quicker on the GPU than on the CPU; the 3D solver 16 times faster.

(Tobias Brandvik and Graham Pullan, Acceleration of a 3D Euler Solver Using Commodity Graphics Hardware. 46th AIAA Aerospace Sciences Meeting and Exhibit. January, 2008.)

Interactive Simulation of Large Scale Agent-Based Models (ABMs) on the GPU

January 16th, 2008

This article by D’Souza et al. explores large-scale Agent-Based Model (ABM) simulation on the GPU. Agent-based modeling is a technique which has become increasingly popular for simulating complex natural phenomena such as swarms and biological cell colonies. An ABM describes a dynamic system by representing it as a collection of communicating, concurrent objects. Current ABM simulation toolkits and algorithms use discrete event simulation techniques and are executed serially on a CPU. This limits the size of the models that can be handled efficiently. In this paper we present a series of efficient data-parallel algorithms for simulating ABMs. These include methods for handling environment updates, agent interactions and replication. Important techniques presented in this work include a novel stochastic allocator which enables parallel agent replication in O(1) average time and an iterative method to handle collision among agents in the spatial domain. These techniques have been implemented on a modern GPU (GeForce 8800GTX), resulting in a substantial performance increase. The authors believe that their system is the first completely GPU-based ABM simulation framework. (D’Souza R., Lysenko, M., Rahmani, K., SugarScape on steroids: simulating over a million agents at interactive rates. Proceedings of the Agent2007 conference, Chicago, IL. 2007.)

Havok and NVIDIA present Havok FX at GDC 2006

March 17th, 2006

At GDC 2006 in San Jose next week, Havok will announce Havok FX, a game physics framework for GPUs. There are two talks about Havok FX:

Havok FX: GPU-accelerated Physics for PC Games
Speaker: Andrew Bond (Havok)
This session introduces Havok’s latest innovation for game physics: Havok FX, which enables real-time processing of thousands of rigid-body objects on current and next generation GPUs. Havok’s general approach to GPU Effects Physics will be covered, as well as tool-chain requirements and trade-offs with game-critical, game-play physics processing on the CPU.

Physics Simulation on NVIDIA GPUs
Speakers: Simon Green, Mark Harris (NVIDIA)
Havok FX leverages state of the art software and hardware technology from NVIDIA to extend the capabilities of NVIDIA GPUs and SLI multi-GPU systems to include physics processing for massive real-time effects. In this presentation NVIDIA and Havok engineers will describe how Havok FX utilizes NVIDIA technology to simulate and render thousands of particles and rigid bodies in games. Live real-time demos will demonstrate the high performance available with current GPUs and provide a look into the future of physics processing on NVIDIA GPUs.
