Boost.Compute v0.2 has been released! Boost.Compute is a header-only C++ library for GPGPU and parallel computing based on OpenCL. It is available on GitHub, and instructions for getting started can be found in the documentation. Since version 0.1 (released almost two months ago), new algorithms including unique(), search() and find_end() have been added, along with several bug fixes. See the project page on GitHub for more information: https://github.com/kylelutz/compute
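For readers unfamiliar with the three new algorithms, here is a minimal sequential sketch in plain Python. It assumes the Boost.Compute versions follow the semantics of their C++ standard library namesakes; it does not use or reproduce the Boost.Compute API, and the function signatures are illustrative only.

```python
# Illustrative sequential semantics only (assumed to mirror the C++
# standard library algorithms of the same names); not Boost.Compute code.

def unique(values):
    """Collapse each run of consecutive equal elements to one element."""
    result = []
    for v in values:
        if not result or result[-1] != v:
            result.append(v)
    return result


def search(text, pattern):
    """Index of the first occurrence of pattern in text, or len(text) if absent."""
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):
        if text[i:i + m] == pattern:
            return i
    return n


def find_end(text, pattern):
    """Index of the last occurrence of pattern in text, or len(text) if absent."""
    n, m = len(text), len(pattern)
    for i in range(n - m, -1, -1):
        if text[i:i + m] == pattern:
            return i
    return n
```

Note that unique() only removes adjacent duplicates, so equal values separated by other elements are kept.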
A new version of the GPU profiler for the CUDA software stack is available at www.lab4241.com. The profiler reports performance information from inside kernels, attributed to the C++ source code, in a simple, intuitive way similar to well-known CPU profilers such as Quantify or Valgrind. The new version, GPUPROF 0.3 (beta), includes improved stability, refined memory tracing, temporal memory analysis, and CUDA driver API call tracing.
This webinar covers how Geoweb3d uses the GPU for real-time geospatial 3D visualization, modeling, and analytics. Geoweb3d will demonstrate how native, high-resolution datasets including GIS, CAD, 3D models, LIDAR, and FMV are fused together in real time with game-quality graphics and pixel-accurate analysis. The 3D engine uses a GPU-resident mesh that adapts to data of any resolution on the fly, eliminating the need to preprocess data before real-time use. The demonstration will include Geoweb3d Mobile, which now uses HTML5 so that it can run on any device in the cloud, including phones and tablets.
To register, follow this link: https://www2.gotomeeting.com/register/226039466
A new version of the rCUDA middleware has been released (version 4.1). In addition to fixing some bugs related to asynchronous memory transfers, the new release adds support for:
- CUDA 5.5 Runtime API
- Mellanox Connect-IB network adapters
- Dynamic Parallelism
- cuFFT and cuBLAS libraries
The rCUDA middleware lets you seamlessly use, within your cluster, GPUs installed in computing nodes other than the one executing the CUDA application, without modifying or recompiling your program. Please visit www.rcuda.net for more details about the rCUDA technology.
This blog post explains GPU Boost, a new user-controllable feature available on Tesla GPUs. Case studies and benchmarks for reverse time migration and an electromagnetic solver are discussed.
This hands-on four-day course will teach you how to write applications in OpenCL that fully leverage the multi-core processing capabilities of the GPU. The course is taught by Acceleware developers who bring real-world experience to the classroom. Students will benefit from:
- Hands-on exercises and progressive lectures
- Individual laptops with AMD Fusion APU for student use
- Small class sizes to maximize learning
- 90 days of post-training support
For more information please visit: http://acceleware.com/training/1028
PARALUTION is a library of sparse iterative methods that can run on various parallel devices, including multi-core CPUs, GPUs (CUDA and OpenCL) and Intel Xeon Phi. Version 0.6.0 adds the following features:
- Windows support (OpenMP backend)
- FGMRES (Flexible GMRES)
- (R)CMK (Cuthill–McKee) ordering
- Thread-core affinity (for Host OpenMP)
- Asynchronous transfers (CUDA backend)
- Pinned memory allocation on the host when using CUDA backend
- Verbose output for debugging
- Easy-to-use timing function in the examples
PARALUTION 0.6.0 is available at http://www.paralution.com.
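As background on the (R)CMK ordering mentioned in the feature list, the following is a minimal sketch of the textbook reverse Cuthill-McKee algorithm in plain Python. It is not PARALUTION code; the function names and the adjacency-list input format are assumptions made for illustration. The idea is to renumber the rows of a sparse symmetric matrix by a breadth-first traversal that visits low-degree nodes first, then reverse the result, which tends to reduce the matrix bandwidth.

```python
from collections import deque


def reverse_cuthill_mckee(adjacency):
    """Return an RCM ordering for a graph given as a list of neighbor lists.

    adjacency[v] lists the neighbors of node v (the off-diagonal nonzeros
    in row v of a symmetric sparse matrix).
    """
    n = len(adjacency)
    degree = [len(neighbors) for neighbors in adjacency]
    visited = [False] * n
    order = []
    # Start each connected component from an unvisited node of minimal degree.
    for start in sorted(range(n), key=lambda v: degree[v]):
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Enqueue unvisited neighbors in order of increasing degree.
            for w in sorted(adjacency[v], key=lambda u: degree[u]):
                if not visited[w]:
                    visited[w] = True
                    queue.append(w)
    # Reversing the Cuthill-McKee order gives the RCM order.
    return order[::-1]


def bandwidth(adjacency, order):
    """Largest distance between two connected nodes under the given ordering."""
    position = {v: i for i, v in enumerate(order)}
    return max(
        (abs(position[v] - position[w])
         for v in range(len(adjacency)) for w in adjacency[v]),
        default=0,
    )
```

For example, a path graph whose nodes are labeled out of order (0-3-1-4-2) has bandwidth 3 in its natural numbering and bandwidth 1 after reordering.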
The new free, open-source PyViennaCL 1.0.0 release provides Python bindings for ViennaCL, a linear algebra and numerical computation library for GPGPU and heterogeneous systems. ViennaCL itself is a header-only C++ library, so these bindings make ViennaCL’s fast OpenCL and CUDA algorithms available to Python programmers in a way that is idiomatic and compatible with the Python community’s most popular scientific packages, NumPy and SciPy. Support through the Google Summer of Code 2013 for the primary developer, Toby St Clere Smithe, is greatly appreciated.
More information and download: PyViennaCL Home
This tutorial by Dan Cyca outlines the shared memory configurations for NVIDIA Fermi and Kepler architectures, and demonstrates how to rewrite kernels to take advantage of the changes in Kepler’s shared memory architecture.
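To illustrate why shared memory layout matters, here is a deliberately simplified bank-conflict model in plain Python. It is not taken from the tutorial. It assumes 32 banks, each one 4-byte word wide, with consecutive words mapped to consecutive banks, and it treats multiple threads reading the same word as a conflict-free broadcast. Real hardware has additional rules, and Kepler's configurable 8-byte bank mode is not modeled here. The function names are hypothetical.

```python
from collections import defaultdict


def bank_conflict_degree(word_indices, num_banks=32):
    """Worst-case number of distinct words requested from any single bank.

    A result of 1 means the access is conflict-free in this simplified
    model; a result of k means the access is serialized k ways.
    """
    words_per_bank = defaultdict(set)
    for word in word_indices:
        # Consecutive 4-byte words are assumed to map to consecutive banks.
        words_per_bank[word % num_banks].add(word)
    return max(len(words) for words in words_per_bank.values())


def strided_access(stride, warp_size=32):
    """Word indices touched by a warp in which thread t reads word t * stride."""
    return [thread * stride for thread in range(warp_size)]
```

In this model a stride of 32 words sends every thread to the same bank, while padding the stride to 33 words spreads the warp across all banks again, which is the usual motivation for padding shared memory arrays.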
OpenCLIPP is a library providing processing primitives (image processing primitives in the first version) implemented with OpenCL for fast execution on dedicated computing devices like GPUs. Two interfaces are provided: C (similar to the Intel IPP and NVIDIA NPP libraries) and C++. OpenCLIPP is free for personal and commercial use. It can be downloaded from GitHub.
M. Akhloufi, A. Campagna, “OpenCLIPP: OpenCL Integrated Performance Primitives library for computer vision applications”, Proc. SPIE Electronic Imaging 2014, Intelligent Robots and Computer Vision XXXI: Algorithms and Techniques, P. 9025-31, February 2014.