Simulation and Visualization of the Saint-Venant System using GPUs

May 13th, 2010

Abstract:

We consider three high-resolution schemes for computing shallow-water waves as described by the Saint-Venant system and discuss how to develop highly efficient implementations using graphical processing units (GPUs). The schemes are well-balanced for lake-at-rest problems, handle dry states, and support linear friction models. The first two schemes handle dry states by switching variables in the reconstruction step, so that bilinear reconstructions are computed using physical variables for small water depths and conserved variables elsewhere. In the third scheme, reconstructed slopes are modified in cells containing dry zones to ensure non-negative values at integration points. We discuss how single and double-precision arithmetics affect accuracy and efficiency, scalability and resource utilization for our implementations, and demonstrate that all three schemes map very well to current GPU hardware. We have also implemented direct and close-to-photo-realistic visualization of simulation results on the GPU, giving visual simulations with interactive speeds for reasonably-sized grids.

(A. R. Brodtkorb, T. R. Hagen, K.-A. Lie and J. R. Natvig: “Simulation and Visualization of the Saint-Venant System using GPUs”. In review, February 2010. Link to PDF preprint, YouTube video)
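
The idea behind the third scheme, modifying reconstructed slopes so that the water depth stays non-negative at the points where it is evaluated, can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the function names, the 1D setting, and the choice of the two cell faces as evaluation points are assumptions made purely for illustration.

```python
def limit_slope(h_avg, slope, dx):
    """Clamp a reconstructed slope so both face values stay non-negative.

    h_avg : cell-averaged water depth (assumed >= 0)
    slope : reconstructed slope of the depth within the cell
    dx    : cell width
    """
    if h_avg < 0.0:
        raise ValueError("cell-averaged depth must be non-negative")
    # Face values are h_avg -/+ 0.5*dx*slope, so |slope| <= 2*h_avg/dx
    # keeps both of them at or above zero.
    max_slope = 2.0 * h_avg / dx
    return max(-max_slope, min(slope, max_slope))


def face_values(h_avg, slope, dx):
    """Return (left, right) reconstructed depths after slope limiting."""
    s = limit_slope(h_avg, slope, dx)
    return h_avg - 0.5 * dx * s, h_avg + 0.5 * dx * s


if __name__ == "__main__":
    # A nearly dry cell with a steep slope: the limiter flattens the slope
    # just enough that the left face does not go negative.
    print(face_values(0.1, 1.0, 1.0))
```

Because the limiter only rescales the slope, the cell average is unchanged, so the modification does not add or remove water.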

Palix Technologies launches ANDSolver beta program

March 23rd, 2010

Palix Technologies has introduced a new Computational Fluid Dynamics (CFD) product called ANDSolver that has been designed from the ground up to use Graphics Processing Units (GPUs) for fast and efficient aerodynamic analysis. Although developing and running applications to use multiple CPUs is a well-established practice for high performance science and engineering simulations, a newer trend towards using GPUs for computation promises faster results with lower hardware acquisition and operating costs. ANDSolver delivers on that promise with up to a 10x speedup compared to a typical quad-core CPU. This level of performance is unique in that it is achieved on unstructured meshes, which have traditionally not been considered amenable to GPUs because of their memory access patterns. However, based on an innovative algorithm design to maximize the performance of the NVIDIA CUDA architecture, the ease and flexibility of unstructured meshing can now be used on high-performance, cost-effective GPUs.

A limited number of additional registrants will be accepted prior to our first production release in Q2 2010. More information on our current beta testing program can be found at http://www.palixtech.com.

Lattice-Boltzmann Simulation of the Shallow-Water Equations with Fluid-Structure Interaction on Multi- and Manycore Processors

February 28th, 2010

Abstract:

We present an efficient method for the simulation of laminar fluid flows with free surfaces including their interaction with moving rigid bodies, based on the two-dimensional shallow water equations and the Lattice-Boltzmann method. Our implementation targets multiple fundamentally different architectures such as commodity multicore CPUs with SSE, GPUs, the Cell BE and clusters. We show that our code scales well on an MPI-based cluster; that an eightfold speedup can be achieved using modern GPUs in contrast to multithreaded CPU code and, finally, that it is possible to solve fluid-structure interaction scenarios with high resolution at interactive rates.

(Markus Geveler, Dirk Ribbrock, Dominik Göddeke and Stefan Turek: “Lattice-Boltzmann Simulation of the Shallow-Water Equations with Fluid-Structure Interaction on Multi- and Manycore Processors”, Accepted in: Facing the Multicore Challenge, Heidelberg, Germany, Mar. 2010. Link.)
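
For readers unfamiliar with Lattice-Boltzmann methods for shallow water, the following minimal sketch shows a D2Q9 equilibrium distribution of the form commonly used in the literature for the shallow-water equations. It is an assumption that this matches the formulation in the paper; the function names, the lattice speed `e`, and the direction ordering are illustrative choices only. The useful property to check is that the distributions sum to the water depth `h` and that their first moment gives the discharge `h*u`.

```python
G = 9.81  # gravitational acceleration in m/s^2

# D2Q9 directions in units of the lattice speed e:
# one rest direction, four axis directions, four diagonals.
DIRECTIONS = [(0, 0),
              (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)]


def equilibrium(h, ux, uy, e=1.0, g=G):
    """Equilibrium distributions for depth h and velocity (ux, uy)."""
    uu = ux * ux + uy * uy
    e2 = e * e
    feq = []
    for cx, cy in DIRECTIONS:
        eu = e * (cx * ux + cy * uy)  # lattice velocity dotted with u
        if cx == 0 and cy == 0:       # rest particle
            f = h - 5.0 * g * h * h / (6.0 * e2) - 2.0 * h * uu / (3.0 * e2)
        elif cx == 0 or cy == 0:      # axis-aligned directions
            f = (g * h * h / (6.0 * e2) + h * eu / (3.0 * e2)
                 + h * eu * eu / (2.0 * e2 * e2) - h * uu / (6.0 * e2))
        else:                         # diagonal directions
            f = (g * h * h / (24.0 * e2) + h * eu / (12.0 * e2)
                 + h * eu * eu / (8.0 * e2 * e2) - h * uu / (24.0 * e2))
        feq.append(f)
    return feq


def moments(f, e=1.0):
    """Recover depth and discharge (h, h*ux, h*uy) from distributions."""
    h = sum(f)
    qx = sum(e * cx * fi for (cx, _), fi in zip(DIRECTIONS, f))
    qy = sum(e * cy * fi for (_, cy), fi in zip(DIRECTIONS, f))
    return h, qx, qy


if __name__ == "__main__":
    feq = equilibrium(2.0, 0.3, -0.1, e=10.0)
    print(moments(feq, e=10.0))
```

Each lattice site is updated from its own distributions and those of its immediate neighbours, which is one reason the method is generally regarded as a good fit for data-parallel hardware such as GPUs.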

HONEI: A collection of libraries for numerical computations targeting multiple processor architectures

February 2nd, 2010

Abstract:

We present HONEI, an open-source collection of libraries offering a hardware oriented approach to numerical calculations. HONEI abstracts the hardware, and applications written on top of HONEI can be executed on a wide range of computer architectures such as CPUs, GPUs and the Cell processor. We demonstrate the flexibility and performance of our approach with two test applications, a Finite Element multigrid solver for the Poisson problem and a robust and fast simulation of shallow water waves. By linking against HONEI’s libraries, we achieve a two-fold speedup over straightforward C++ code using HONEI’s SSE backend, and a further 3–4 and 4–16 times faster execution on the Cell and a GPU, respectively. A second important aspect of our approach is that the full performance capabilities of the hardware under consideration can be exploited by adding optimised application-specific operations to the HONEI libraries. HONEI provides all necessary infrastructure for development and evaluation of such kernels, significantly simplifying their development.

(Danny van Dyk, Markus Geveler, Sven Mallach, Dirk Ribbrock, Dominik Göddeke and Carsten Gutwenger: HONEI: A collection of libraries for numerical computations targeting multiple processor architectures. Computer Physics Communications 180(12), pp. 2534-2543, December 2009. DOI 10.1016/j.cpc.2009.04.018)

A fast two-dimensional floodplain inundation model

November 25th, 2009

This paper in the Proceedings of the Institution of Civil Engineers describes an application of GPGPU for flood risk modelling by a team based at JBA Consulting in the UK. The model described here has since been used to produce flood risk maps for several countries in Europe.

Abstract:

“Two-dimensional (2D) flood inundation modelling is now an important part of flood risk management practice. Research in the fields of computational hydraulics and numerical methods, allied with advances in computer technology and software design, have brought 2D models into mainstream use. Even so, the models are computationally demanding and can take a long time to run, especially for large areas and at high spatial resolutions (for instance 2 × 2 m or smaller grid cells). There is thus strong motivation to accelerate 2D model codes. This paper demonstrates the use of technology from the computer graphics industry to accelerate a 2D diffusion wave (non-inertial) floodplain model. Over the past decade the market for computer games has driven the development of very fast, relatively low-cost ‘graphical processing units’. In recent years there has been a growing interest in this high-performance graphics hardware for scientific and engineering applications. This work adapted a flood model algorithm to run on a commodity personal computer graphics card. The results of a benchmark urban flood simulation were reproduced and the model run time reduced from 18 h to 9·5 min.”

(Lamb, R., Crossley, A. and Waller, S. 2009. A fast two-dimensional floodplain inundation model. Proceedings of the Institution of Civil Engineers – Water Management, Volume 162, Issue 6, pages 363–370. DOI: 10.1680/wama.2009.162.6.363)

OpenCurrent v1.0 released: CUDA-accelerated PDE solver

September 28th, 2009

OpenCurrent is an open source C++ library for solving Partial Differential Equations (PDEs) over regular grids using the CUDA platform from NVIDIA. It breaks down a PDE into 3 basic objects, “Grids”, “Solvers,” and “Equations.” “Grid” data structures efficiently implement regular 1D, 2D, and 3D arrays in both double and single precision. Grids support operations like computing linear combinations, managing host-device memory transfers, interpolating values at non-grid points, and performing array-wide reductions. “Solvers” use these data structures to calculate terms arising from discretizations of PDEs, such as finite-difference based advection and diffusion schemes, and a multigrid solver for Poisson equations. These computational building blocks can be assembled into complete “Equation” objects that solve time-dependent PDEs. One such Equation solver is an incompressible Navier-Stokes solver that uses a second-order Boussinesq model. This equation solver is fully validated, and has been used to study Rayleigh-Bénard convection under a variety of different regimes. Benchmarks show it to perform about 8 times faster than an equivalent Fortran code running on an 8-core Xeon.
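
To make the kind of building block described above concrete, here is a minimal sketch of an explicit finite-difference diffusion step on a regular 1D grid, followed by an array-wide reduction. It is written in plain Python for readability and does not use or mirror the OpenCurrent API; the function names, the periodic boundary treatment, and the forward-Euler time stepping are all assumptions chosen for illustration.

```python
def diffusion_step(u, alpha, dx, dt):
    """Advance u by one forward-Euler step of u_t = alpha * u_xx.

    Uses a second-order central difference with periodic boundaries.
    """
    r = alpha * dt / (dx * dx)
    if r > 0.5:
        # Standard stability limit for this explicit scheme in 1D.
        raise ValueError("time step too large for explicit diffusion")
    n = len(u)
    return [u[i] + r * (u[(i - 1) % n] - 2.0 * u[i] + u[(i + 1) % n])
            for i in range(n)]


def grid_sum(u):
    """Array-wide reduction: total of all grid values."""
    return sum(u)


if __name__ == "__main__":
    u0 = [0.0, 0.0, 1.0, 0.0, 0.0]
    u1 = diffusion_step(u0, alpha=1.0, dx=1.0, dt=0.25)
    print(u1, grid_sum(u1))
```

Every grid point is updated independently from the previous time level, which is what makes stencil computations of this kind straightforward to parallelize. The reduction is a quick sanity check: with periodic boundaries the total should be conserved from step to step.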

ISC 2009 CUDA/OpenCL Tutorial Slides Posted

June 25th, 2009

A tutorial on High Performance Computing with CUDA was held at the International Conference on Supercomputing in Hamburg on Monday, June 22nd, 2009. The tutorial included an introduction to the CUDA programming model and C for CUDA, along with details on the CUDA Toolkit, Libraries, and optimization. The tutorial also provided an introduction to OpenCL, and finished with a case study on Computational Fluid Dynamics by Dr. Graham Pullan from Cambridge University. Slides from the tutorial are now posted here on GPGPU.org.

(Massimiliano Fatica, Timo Stich, and Graham Pullan. High Performance Computing with CUDA. Tutorial. International Conference on Supercomputing 2009. Hamburg, Germany.)

Real-Time Particle Level Sets with Application to Flow Visualization

May 24th, 2007

This technical report by N. Cuntz, R. Strzodka and A. Kolb describes a particle level set (PLS) system for fast and accurate surface tracking on the GPU. The technique demonstrates the coupling of grid and particle information by using vertex/fragment buffer objects, shaders and blending functionality in an innovative way. Improvements over the original PLS technique include a sub-voxel interface representation and a more accurate level set correction using more precise particle radii. As a concrete application the authors demonstrate that their fast and accurate PLS is well suited to the visualization of dynamic flows. An accurate evolution of time surfaces and representation of path volumes offer a more reliable basis for data interpretation. (Real-Time Particle Level Sets with Application to Flow Visualization. Technical report, 2007)
