Accelerate OpenFOAM® with Culises

April 13th, 2015

Culises significantly accelerates your OpenFOAM® application by using GPUs for the most computationally intensive tasks.

Its main features are:

  • Library for GPU-based acceleration of OpenFOAM®
  • Multi-GPU support, significantly reduced computing times
  • Highly efficient state-of-the-art iterative solvers like AMG
  • Quick and easy installation, no validation necessary
  • Flexible interfaces to customer-specific software/engineering applications available

Culises accelerates the linear solver by more than 2x. The overall speedup depends on the type of application and on the share of runtime spent in the linear solver. Culises may be tested on FluiDyna’s purpose-built workstation to determine the acceleration potential for your individual OpenFOAM® application. Find out more on:
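
The dependence of the overall speedup on the time spent in the linear solver can be estimated with Amdahl's law. The sketch below is only an illustration: the function name `overall_speedup` and the sample runtime fractions are assumptions, not figures from FluiDyna, and the 2x solver speedup is the lower bound quoted above.

```python
def overall_speedup(solver_fraction: float, solver_speedup: float) -> float:
    """Amdahl's law: whole-run speedup when only the linear solver is accelerated.

    solver_fraction: share of the original runtime spent in the linear solver (0..1).
    solver_speedup:  factor by which the linear solver alone gets faster.
    """
    if not 0.0 <= solver_fraction <= 1.0:
        raise ValueError("solver_fraction must be between 0 and 1")
    if solver_speedup <= 0.0:
        raise ValueError("solver_speedup must be positive")
    # The unaccelerated part keeps its runtime; the solver part shrinks.
    return 1.0 / ((1.0 - solver_fraction) + solver_fraction / solver_speedup)


if __name__ == "__main__":
    # Hypothetical runtime shares, chosen only to show the trend.
    for fraction in (0.3, 0.5, 0.8):
        speedup = overall_speedup(fraction, 2.0)
        print(f"{fraction:.0%} of runtime in solver: {speedup:.2f}x overall")
```

With a 2x faster solver, a case that spends half its runtime in the linear solver gains about 1.33x overall, which is why the achievable acceleration differs from one application to the next.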

On Demand Webinar: Essential CUDA Optimization Techniques

November 3rd, 2014

This webinar provides an overview of the improved performance analysis tools available in CUDA 6.0 and key optimization strategies for compute-, latency- and memory-bound problems. The webinar covers techniques for ensuring peak utilization of CUDA cores and improving branching efficiency, as well as intrinsic functions and loop unrolling. Optimal access patterns for global and shared memory are presented, including a comparison between the Fermi and Kepler architectures. To view the webinar go to:

CUDA finance course Dec 2-5, 2014, New York

October 22nd, 2014

Developed in partnership with NVIDIA, this hands-on four-day course will teach you how to write and optimize applications that fully leverage the multi-core processing capabilities of the GPU. The course has a finance focus: commonly used algorithms such as random number generation and Monte Carlo simulations will be used and profiled in the examples. A background in finance is not necessary. For more information please visit:

GPU Analytic SQL Database

September 12th, 2014

From a recent product announcement:

DeepCloud Whirlwind is an analytics only SQL database using modern GPUs for accelerated SQL processing. We see over 700x performance increase over a “well known” database on the same machine. Features include:

  • column based storage
  • vector processing
  • SSD optimized
  • smart compression – Ultra fast compression and decompression on the GPU
  • MySQL like API – works with many MySQL client tools
  • Oracle subset dialect
  • data skipping
  • zone maps
  • fast schema-light data loading
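
Zone maps and data skipping, two of the features listed above, are general column-store techniques. The sketch below shows the idea in plain Python; it is not Whirlwind's implementation, and the names `Zone`, `build_zones` and `count_in_range` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Zone:
    """A block of column values plus its zone map entry (min and max)."""
    values: List[int]
    lo: int
    hi: int


def build_zones(column: List[int], zone_size: int) -> List[Zone]:
    """Split a column into fixed-size zones and record min/max per zone."""
    zones = []
    for start in range(0, len(column), zone_size):
        chunk = column[start:start + zone_size]
        zones.append(Zone(chunk, min(chunk), max(chunk)))
    return zones


def count_in_range(zones: List[Zone], lo: int, hi: int) -> Tuple[int, int]:
    """Count values v with lo <= v <= hi; also report how many zones were scanned."""
    matches = 0
    scanned = 0
    for zone in zones:
        # Data skipping: the zone map proves this zone holds no match.
        if zone.hi < lo or zone.lo > hi:
            continue
        scanned += 1
        matches += sum(1 for v in zone.values if lo <= v <= hi)
    return matches, scanned


if __name__ == "__main__":
    column = list(range(1000))          # sorted data makes zone maps most selective
    zones = build_zones(column, 100)    # 10 zones of 100 values each
    matches, scanned = count_in_range(zones, 250, 260)
    print(f"{matches} matches, scanned {scanned} of {len(zones)} zones")
```

In this toy case the range predicate touches one zone out of ten; the other nine are skipped on the strength of their min/max entries alone.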

Use Whirlwind database technology to get maximum database performance from significantly cheaper hardware or go all out with a state-of-the-art system built from modern components. Beta available now under the GPL at:

Webinar Sep. 17: An Introduction to OpenCL using AMD GPUs

September 12th, 2014

This tutorial will begin with a brief overview of OpenCL and data-parallelism before focusing on the GPU programming model. We will explore the fundamentals of GPU kernels, host and device responsibilities, OpenCL syntax and work-item hierarchy. For more information and to register visit:

SpeedIT FLOW: RANS single-phase fluid flow solver on GPU

September 4th, 2014

SpeedIT FLOW is a RANS single-phase fluid flow solver that runs fully on GPU. Benchmark results on external aero flow and other industry-relevant OpenFOAM cases on a GPU card indicate approximately 3x faster time to solution vs. Intel Xeon E5649 running 12 cores. This is about two times faster than competing solutions that offer only partial acceleration on GPU. More details are available on this blog.

CUDA Course Sept 23 – 26, 2014, Frankfurt

August 20th, 2014

This hands-on four-day course teaches how to write and optimize applications that fully leverage the multi-core processing capabilities of the GPU. More details and registration:

Performance Portable Parallel Programming – Target CUDA and OpenMP in a Unified Codebase

August 14th, 2014

Hybrid Fortran is an open-source, directive-based extension for the Fortran language. It lets HPC programmers keep writing Fortran code the way they are used to, only now with GPGPU support. It achieves performance portability by allowing different storage orders and loop structures for the CPU and GPU versions. All computational code stays the same as in the respective CPU version, e.g. it can be kept in a low dimensionality even when the GPU version needs to be privatised in more dimensions in order to achieve a speedup. Hybrid Fortran takes care of the necessary transformations at compile time, so there is no runtime overhead. A Python-based preprocessor parses the directives together with the Fortran user code structure, declarations, accessors and procedure calls, and then writes two separate versions of the code: one for CPU with OpenMP parallelization and one for GPU with CUDA Fortran. More details:

State of GPU Virtualization for CUDA Applications 2014

August 14th, 2014

This blog entry provides an introduction to GPU virtualization, reviewing the five major technology vendors and their virtualization support for CUDA.

CUDA Course Sept 2-5, 2014, Houston

July 22nd, 2014

Developed in partnership with NVIDIA, this four-day CUDA training course, held in Houston, is designed for programmers in the oil and gas industry who are looking to develop comprehensive skills in writing and optimizing applications that fully leverage the many-core processing capabilities of the GPU. Commonly used algorithms such as filtering and FFTs will be used and profiled in the examples. The case study on day 4 focuses on the efficient implementation of a finite difference algorithm that is highly applicable to reverse time migration. However, a background in oil and gas is not necessary. For more information and to view a copy of the course outline please visit:
