February 2nd, 2014
January 15th, 2014
OpenCLIPP is a library providing processing primitives (image processing primitives in the first version) implemented with OpenCL for fast execution on dedicated computing devices like GPUs. Two interfaces are provided: C (similar to the Intel IPP and NVIDIA NPP libraries) and C++. OpenCLIPP is free for personal and commercial use. It can be downloaded from GitHub.
M. Akhloufi, A. Campagna, “OpenCLIPP: OpenCL Integrated Performance Primitives library for computer vision applications”, Proc. SPIE Electronic Imaging 2014, Intelligent Robots and Computer Vision XXXI: Algorithms and Techniques, P. 9025-31, February 2014.
January 15th, 2014
Developed in partnership with NVIDIA, this hands-on four day course will teach how to write and optimize applications that fully leverage the multi-core processing capabilities of the GPU. Benefits include:
- Hands-on exercises and progressive lectures
- Individual laptops equipped with NVIDIA GPUs for student use
- Small class sizes to maximize learning
- 90 days post training support – NEW!
February 25-28, 2014, Baltimore, MD, USA, details and registration.
December 30th, 2013
This webinar will demonstrate how real-world computational research in soft matter physics can be accelerated on a GPU-equipped desktop computer with the HOOMD-blue molecular dynamics software. A presentation of how to set up a simulation of a dense polymer liquid, and how to analyze and visualize the results is provided. There will be a demonstration of how self-assembled ordered structures of block copolymers emerge out of an initially disordered configuration. With external potentials, an artificially ordered phase can be produced as well. HOOMD-blue’s easy-to-use scripting interface and plug-ins are used to create a productive work-flow and extend its capabilities. As an advanced topic, there will be a discussion of how the upcoming version of HOOMD-blue can be used on compute clusters running on ten to hundreds of GPUs in parallel, to boost simulations of long polymer chains or large-scale systems.
January 21, 2014, 11:00 a.m. EST, Registration required.
December 30th, 2013
December 23rd, 2013
This webinar recording provides an overview of the profiling techniques and the tools available to help you optimize your code. It examines NVIDIA’s Visual Profiler and cuobjdump and highlight the various methods available for understanding the performance of CUDA program. The second part of the session focuses on debugging techniques and the tools available to help identify issues in kernels. The debugging tools provided in CUDA 5.5 including NSight and cuda-memcheck are discussed. The webinar recording can be accessed here.
December 20th, 2013
The latest release 1.5.0 of the free open source linear algebra library ViennaCL is now available for download. The library provides a high-level C++ API similar to Boost.ublas and aims at providing the performance of accelerators at a high level of convenience without having to deal with hardware details. Some of the highlights from the ChangeLog are as follows: Vectors and matrices of integers are now supported, multiple OpenCL contexts can be used in a fully multi-threaded manner, products of sparse and dense matrices are now available, and certain BLAS functionality is also provided through a shared library for use with programming languages other than C++, e.g. C, Fortran, or Python.
November 28th, 2013
This webinar explores the memory model of the GPU and the memory enhancements available in the Kepler architecture, and how these will affect performance optimization. The webinar begins with an essential overview of GPU architecture and thread cooperation before focusing on the different memory types available on the GPU. We define shared, constant and global memory and discuss the best locations to store your application data for optimized performance. The shuffle instruction, new shared memory configurations and Read-Only Data Cache of the Kepler architecture are introduced and optimization techniques discussed. Click here to view the webinar recording.
November 20th, 2013
G-BLASTN is a GPU-accelerated nucleotide alignment tool based on the widely used NCBI-BLAST. G-BLASTN can produce exactly the same results as NCBI-BLAST, and it also has very similar user commands. It also supports a pipeline mode, which can fully use the GPU and CPU resources when handling a batch of medium to large sized queries. Currently, G_BLASTN supports the blastn and megablast modes of NCBI-BLAST. The discontiguous megablast mode is not supported yet. More information: http://www.comp.hkbu.edu.hk/~chxw/software/G-BLASTN.html
November 13th, 2013
VexCL is a modern C++ library created for ease of GPGPU development with C++. VexCL strives to reduce the amount of boilerplate code needed to develop GPGPU applications. The library provides a convenient and intuitive notation for vector arithmetic, reduction, sparse matrix-vector multiplication, etc. The source code is available under the permissive MIT license. As of v1.0.0, VexCL provides two backends: OpenCL and CUDA. Users may choose either of those at compile time with a preprocessor macro definition. More information is available at the GitHub project page and release notes page.
AMD CodeXL is a free set of tools for GPU debugging, GPU profiling, static analysis of OpenCL kernels, and CPU profiling, including support for remote servers. For more information and download links, see: http://developer.amd.com/community/blog/2013/11/08/codexl-1-3-released/
Bolt is an STL compatible C++ template library for creating data-parallel applications using C++ (no C++ AMP / OpenCL code required). For more information about the Bolt template library and download links, see: http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-accelerated-parallel-processing-app-sdk/bolt-c-template-library/
AMD APP SDK has everything needed to get started with OpenCL and parallel programming. It includes OpenCL samples that are very easy to compile, as well as the Bolt and other libraries. For more information about AMD APP SDK and download links, see: http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-accelerated-parallel-processing-app-sdk/