This webinar provides an overview of the improved performance analysis tools available in CUDA 6.0 and key optimization strategies for compute-, latency- and memory-bound problems. The webinar covers techniques for ensuring peak utilization of CUDA cores, improving branching efficiency, using intrinsic functions and unrolling loops. Optimal access patterns for global and shared memory are presented, including a comparison between the Fermi and Kepler architectures. To view the webinar go to: http://acceleware.com/blog/webinar-essential-cuda-optimization-techniques
Developed in partnership with NVIDIA, this hands-on four-day course will teach you how to write and optimize applications that fully leverage the multi-core processing capabilities of the GPU. This course has a finance focus: commonly used algorithms such as random number generation and Monte Carlo simulations will be used and profiled in the examples. A background in finance is not necessary. For more information please visit: http://acceleware.com/training/988
From a recent product announcement:
DeepCloud Whirlwind is an analytics-only SQL database using modern GPUs for accelerated SQL processing. We see a performance increase of over 700x over a “well known” database on the same machine. Features include:
- column based storage
- vector processing
- SSD optimized
- smart compression – Ultra fast compression and decompression on the GPU
- MySQL like API – works with many MySQL client tools
- Oracle subset dialect
- data skipping
- zone maps
- fast schema-light data loading
Use Whirlwind database technology to get maximum database performance from significantly cheaper hardware, or go all out with a state-of-the-art system built from modern components. Beta available now under the GPL at: http://deepcloud.co
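The "zone maps" and "data skipping" items in the feature list refer to a general column-store technique: keep the minimum and maximum value of each block of a column, and skip any block whose range cannot match the query predicate. The following is a minimal sketch of that general idea only. It says nothing about how Whirlwind implements it, and the names `Zone`, `build_zone_map` and `scan_range` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Zone:
    """One block of a column plus its min/max summary (hypothetical layout)."""
    start: int          # row index of the first value in this block
    lo: int             # smallest value in the block
    hi: int             # largest value in the block
    values: List[int]   # the block's values


def build_zone_map(column: List[int], zone_size: int) -> List[Zone]:
    """Split a column into fixed-size blocks and record min/max for each."""
    zones = []
    for start in range(0, len(column), zone_size):
        chunk = column[start:start + zone_size]
        zones.append(Zone(start, min(chunk), max(chunk), chunk))
    return zones


def scan_range(zones: List[Zone], lo: int, hi: int) -> Tuple[List[int], int]:
    """Return row indices with lo <= value <= hi and the number of blocks read."""
    matches = []
    scanned = 0
    for zone in zones:
        # Data skipping: the block's range does not overlap the predicate.
        if zone.hi < lo or zone.lo > hi:
            continue
        scanned += 1
        for offset, value in enumerate(zone.values):
            if lo <= value <= hi:
                matches.append(zone.start + offset)
    return matches, scanned


if __name__ == "__main__":
    column = list(range(100))            # sorted data skips best
    zones = build_zone_map(column, 10)
    rows, blocks_read = scan_range(zones, 25, 34)
    print(rows, blocks_read)
```

Skipping is most effective when the data is sorted or clustered on the filtered column, because each block then covers a narrow value range.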
This tutorial will begin with a brief overview of OpenCL and data-parallelism before focusing on the GPU programming model. We will explore the fundamentals of GPU kernels, host and device responsibilities, OpenCL syntax and work-item hierarchy. For more information and to register visit: http://acceleware.com/event/introduction-opencl-using-amd-gpus
SpeedIT FLOW is a RANS single-phase fluid flow solver that runs fully on the GPU. Benchmark results for external aero flow and other industry-relevant OpenFOAM cases on a single GPU card indicate an approximately 3x faster time to solution than an Intel Xeon E5649 running 12 cores. This is about two times faster than competing solutions that offer only partial acceleration on the GPU. More details are available on this blog.
This hands-on four-day course teaches how to write and optimize applications that fully leverage the multi-core processing capabilities of the GPU. More details and registration: http://acceleware.com/training/986
Hybrid Fortran is an open-source, directive-based extension for the Fortran language. It lets HPC programmers keep writing Fortran code the way they are used to, only now with GPGPU support. It achieves performance portability by allowing different storage orders and loop structures for the CPU and GPU versions. All computational code stays the same as in the respective CPU version; for example, it can be kept in a low dimensionality even when the GPU version needs to be privatised in more dimensions in order to achieve a speedup. Hybrid Fortran takes care of the necessary transformations at compile time, so there is no runtime overhead. A Python-based preprocessor parses the directives together with the Fortran user code structure, declarations, accessors and procedure calls, and then writes two separate versions of the code: one for the CPU with OpenMP parallelization and one for the GPU with CUDA Fortran. More details: http://typhooncomputing.com/?p=416
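The idea of keeping one low-dimensional kernel while changing the storage order around it can be illustrated with a small sketch. This is not Hybrid Fortran syntax and not its implementation: Hybrid Fortran generates the variants at compile time, whereas this Python illustration picks the layout at run time. All names (`make_index`, `column_kernel`, `run`, `pack`, `unpack`) and the two layouts labelled "cpu" and "gpu" are assumptions made for the example.

```python
def make_index(order, ni, nj, nk):
    """Map logical (i, j, k) to a flat offset for a given storage order."""
    if order == "cpu":   # assumed CPU layout: k varies fastest
        return lambda i, j, k: (i * nj + j) * nk + k
    if order == "gpu":   # assumed GPU layout: i varies fastest
        return lambda i, j, k: (k * nj + j) * ni + i
    raise ValueError("unknown storage order: " + order)


def column_kernel(read, write, nk):
    """User code: written once, sees only the vertical dimension k."""
    for k in range(nk):
        write(k, 2.0 * read(k) + k)


def run(order, data, ni, nj, nk):
    """Apply the same kernel to every (i, j) column under the chosen layout."""
    idx = make_index(order, ni, nj, nk)
    out = [0.0] * len(data)
    for i in range(ni):
        for j in range(nj):
            def read(k):
                return data[idx(i, j, k)]

            def write(k, value):
                out[idx(i, j, k)] = value

            column_kernel(read, write, nk)
    return out


def pack(order, field, ni, nj, nk):
    """Store a logical field f(i, j, k) in a flat list using the given order."""
    idx = make_index(order, ni, nj, nk)
    flat = [0.0] * (ni * nj * nk)
    for i in range(ni):
        for j in range(nj):
            for k in range(nk):
                flat[idx(i, j, k)] = field(i, j, k)
    return flat


def unpack(order, flat, ni, nj, nk):
    """Recover a {(i, j, k): value} view from a flat list."""
    idx = make_index(order, ni, nj, nk)
    return {(i, j, k): flat[idx(i, j, k)]
            for i in range(ni) for j in range(nj) for k in range(nk)}


if __name__ == "__main__":
    ni, nj, nk = 2, 3, 4
    field = lambda i, j, k: float(i * 100 + j * 10 + k)
    results = {}
    for order in ("cpu", "gpu"):
        flat = pack(order, field, ni, nj, nk)
        results[order] = unpack(order, run(order, flat, ni, nj, nk), ni, nj, nk)
    print(results["cpu"] == results["gpu"])
```

The kernel body never mentions `i`, `j` or the memory layout, so the same user code produces the same logical result under either storage order; only the index mapping around it changes.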
This blog entry provides an introduction to GPU virtualization, reviewing the five major technology vendors and their virtualization support for CUDA.
Offered in partnership with NVIDIA, this four-day CUDA training course, held in Houston, is designed for programmers in the oil and gas industry who are looking to develop comprehensive skills in writing and optimizing applications that fully leverage the many-core processing capabilities of the GPU. Commonly used algorithms such as filtering and FFTs will be used and profiled in the examples. The case study on day 4 focuses on the efficient implementation of a finite difference algorithm, which is highly applicable to reverse time migration. However, a background in oil and gas is not necessary. For more information and to view a copy of the course outline please visit: http://acceleware.com/training/987
Join the free webinar on May 20th devoted to accelerating orthorectification, atmospheric correction, and transformations for big data with GPUs. Learn how GPU capabilities can make processing of large imagery 50-100 times faster. Amanda O’Connor, a Senior Solutions Engineer at Exelis, will walk you through the implementation of GPU processing for large imagery datasets and the operational use of GPU processing for orthorectification, and will share benchmarks against desktop algorithms. To register follow this link: https://www2.gotomeeting.com/register/665929994.