Webinar: Accelerating Full Waveform Inversion via OpenCL on AMD GPUs

February 26th, 2014

On March 5 at 11:00am (PST), Acceleware will host a webinar on accelerating a seismic algorithm on a cluster of AMD GPU compute nodes. The presentation will begin with an outline of the full waveform inversion (FWI) algorithm, followed by an introduction to OpenCL, including its programming model and memory spaces. It will then cover strategies for formulating the problem to take advantage of the massively parallel GPU architecture, along with key optimization techniques, including coalescing and an iterative approach to handling the slices. Finally, performance results for the GPU will be compared with CPU run times. Click here to register.

Allinea DDT with support for NVIDIA CUDA 5.5 and CUDA on ARM

November 13th, 2013

Allinea DDT is part of Allinea Software’s unified tools platform, which provides a single powerful and intuitive environment for debugging and profiling of parallel and multithreaded applications. It is widely used by computational scientists and scientific programmers to fix software defects of parallel applications running on hybrid GPU clusters and supercomputers. DDT 4.1.1 supports CUDA 5.5, C++11 and the GNU 4.8 compilers. Also introduced with Allinea DDT 4.1.1 is CUDA toolkit debugging support for ARMv7 architectures. More information: http://www.allinea.com

Libra 3.0 – GPGPU SDK on Mobiles and Tablets

November 13th, 2013

The Libra 3.0 Heterogeneous Cloud Computing SDK has recently been released by GPU Systems. It supports PC, Tablet and Mobile Devices and includes a new virtualizing function for cloud compute services of local and remote CPUs and GPUs. C/C++, Java, C# and Matlab are supported. Read the full press release here.

Fast JPEG codec from Fastvideo

September 22nd, 2013

Fastvideo have released their JPEG codec for NVIDIA GPUs. Peak performance of the codec reaches 6 GBytes per second and higher for images loaded from host RAM. For instance, a full-color 4K image with resolution 3840 x 2160 can be compressed by a factor of 10 in just 6 milliseconds on an NVIDIA GeForce GTX Titan. More information: http://www.fastcompression.com
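As a rough sanity check on the 4K figure above, the sketch below works out the raw-data throughput implied by one frame in 6 milliseconds. It assumes that "full-color" means 24-bit RGB (3 bytes per pixel) and that one GByte is 10^9 bytes; neither is stated in the announcement, and the function name is purely illustrative.

```python
# Back-of-the-envelope throughput for the quoted 4K example.
WIDTH, HEIGHT = 3840, 2160   # resolution quoted in the announcement
BYTES_PER_PIXEL = 3          # assumption: 24-bit RGB, 8 bits per channel
ELAPSED_S = 6e-3             # 6 milliseconds, as quoted


def raw_throughput_gb_per_s(width, height, bytes_per_pixel, elapsed_s):
    """Uncompressed bytes processed per second, in units of 10^9 bytes."""
    raw_bytes = width * height * bytes_per_pixel
    return raw_bytes / elapsed_s / 1e9


raw_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
rate = raw_throughput_gb_per_s(WIDTH, HEIGHT, BYTES_PER_PIXEL, ELAPSED_S)
print(f"raw frame size: {raw_bytes / 1e6:.1f} MB")
print(f"implied throughput: {rate:.2f} GB/s")
```

Under these assumptions the single-frame example corresponds to roughly 4.1 GB/s of uncompressed input, which sits below the quoted peak of 6 GBytes per second, as one would expect of a peak figure.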

Back Testing of HFT Strategies with Xcelerit and GPUs

July 26th, 2013

Algorithmic trading has become ever more popular in recent years – accounting for approximately half of all European and American stock trades placed in 2012. The trading strategies need to be back-tested regularly using historical market data for calibration and to check the expected return and risk. This is a computationally demanding process that can take hours to complete. However, back-testing the strategies frequently intra-day can significantly increase the profits for the trading institution.


Acceleration of iterative Navier-Stokes solvers on graphics processing units

July 14th, 2013


While new power-efficient computer architectures exhibit spectacular theoretical peak performance, they require specific conditions to operate efficiently, which makes porting complex algorithms a challenge. Here, we report results of the semi-implicit method for pressure linked equations (SIMPLE) and the pressure implicit with operator splitting (PISO) methods implemented on the graphics processing unit (GPU). We examine the advantages and disadvantages of the full porting over a partial acceleration of these algorithms run on unstructured meshes. We found that the full-port strategy requires adjusting the internal data structures to the new hardware and proposed a convenient format for storing internal data structures on GPUs. Our implementation is validated on standard steady and unsteady problems and its computational efficiency is checked by comparing its results and run times with those of some standard software (OpenFOAM) run on central processing unit (CPU). The results show that a server-class GPU outperforms a server-class dual-socket multi-core CPU system running essentially the same algorithm by up to a factor of 4.

See also the supplementary materials and the follow-up at http://vratis.com/blog/?p=7.

(Tadeusz Tomczak, Katarzyna Zadarnowska, Zbigniew Koza, Maciej Matyka and Łukasz Mirosław: “Acceleration of iterative Navier-Stokes solvers on graphics processing units”, International Journal of Computational Fluid Dynamics, accepted, July 2013. [DOI])

Acceleware 4 Day CUDA Course – San Jose

May 5th, 2013

Developed in partnership with NVIDIA, this hands-on four-day course will teach students how to write and optimize applications that fully leverage the multi-core processing capabilities of the GPU. The course is taught by Acceleware developers who bring real-world experience to the classroom, and students will benefit from:

  • Hands-on exercises and progressive lectures
  • Individual laptops equipped with NVIDIA GPUs for student use
  • Small class sizes to maximize learning

July 29 – August 1, 2013, San Jose, CA, USA. More information: http://www.acceleware.com/training/913

Fast GPU Debayer Software

March 13th, 2013

The GPU Debayer software developed by Fastvideo can be used for demosaicing of raw 8-bit Bayer images to full-color 24-bit RGB format. The application employs the HQLI and DFPD algorithms and is tuned for NVIDIA GPUs, which results in very fast conversion, e.g., only 1.25 ms for Full HD image demosaicing on GeForce GTX 580. The software is freely available.
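The quoted timing can be turned into a frame rate and a pixel rate with a few lines of arithmetic. The sketch below assumes that "Full HD" means 1920 x 1080 pixels and that the 1.25 ms covers the demosaicing step only; both are assumptions, and the helper name is hypothetical.

```python
# Frame and pixel rates implied by the quoted Full HD demosaicing time.
FULL_HD = (1920, 1080)   # assumption: Full HD taken as 1920 x 1080
ELAPSED_S = 1.25e-3      # 1.25 ms per frame, as quoted


def demosaic_rates(width, height, elapsed_s):
    """Return (frames per second, gigapixels per second) for one frame time."""
    frames_per_s = 1.0 / elapsed_s
    gpixels_per_s = width * height * frames_per_s / 1e9
    return frames_per_s, gpixels_per_s


fps, gpix = demosaic_rates(FULL_HD[0], FULL_HD[1], ELAPSED_S)
print(f"{fps:.0f} frames/s, {gpix:.2f} Gpixel/s")
```

With these assumptions the figure works out to about 800 frames per second, or roughly 1.66 gigapixels per second, for the demosaicing step alone.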

Amdahl Software announces the general availability of OpenCL CodeBench

February 7th, 2013

From a recent press release:

Amdahl Software, a leading supplier of development tools for multi-core software, after extensive beta testing by evaluators in over a dozen countries and numerous end-user application markets, today announced the production release of OpenCL CodeBench. OpenCL CodeBench is an OpenCL Code Creation tool. It simplifies parallel software development, enabling developers to rapidly generate and optimize OpenCL applications. Engineering productivity is increased through the automation of overhead tasks. The tools suite enables engineers to work at higher levels of abstraction, accelerating the code development process. OpenCL CodeBench benefits both expert and novice engineers through a choice of command line or guided, wizard-driven development methodologies. Close cooperation with IP, SOC and platform vendors will enable future releases of OpenCL CodeBench to more tightly optimize software for specific end user platforms and development environments.

OpenCL CodeBench is available for trial or purchase. For additional information, please visit www.amdahlsoftware.com.

Parallel Computing Training Dates from AccelerEyes

January 29th, 2013

AccelerEyes has released dates for their upcoming CUDA and OpenCL training courses.



More information can be found on the courses’ webpages.
