AMD offers an OpenCL Programming Webinar Series to help software developers become experts in the latest technologies, standards and best practices. The series of three OpenCL webinars will be presented by Rob Farber.
1. April 10th, 10AM PDT: Introducing Portable Parallelism
- C and C++ APIs
- OpenCL Memory Spaces
- The OpenCL Execution Model
2. April 24th, 10AM PDT: Coordinating OpenCL Computations on One or More Heterogeneous Devices
- How to Concisely Utilize Multiple Command Queues and Coordinate Tasks Across Multiple Heterogeneous Devices such as Two GPUs Plus a CPU
- Code Sample Discussion: Massively Parallel Random Number Test Framework
3. May 1st, 10AM PDT: Accelerate Rendering by an Order of Magnitude with OpenCL, Plus a View to the Multi-core and Web-enabled Future
- How to use OpenCL to Provide High-Quality, Fast Rendering in Combination with Primitive Restart
- Device Fission, Partitioning Hardware Capabilities for Optimal Resource Usage
- Looking to the Future – WebCL
Registration is limited. More Information: http://developer.amd.com/zones/OpenCLZone/Events/pages/OpenCLWebinars.aspx
PORTLAND, Ore., March 5 — The Portland Group, a wholly-owned subsidiary of STMicroelectronics, today announced availability of the 2012 release of the PGI line of high-performance parallelizing compilers and development tools for Linux, OS X and Windows. PGI 2012 is the first general release to include support for the OpenACC directive-based programming model for NVIDIA CUDA-enabled Graphics Processing Units (GPUs). This release is also the first to include the fully feature-enabled PGI CUDA C/C++ compiler for multi-core x64 CPUs from Intel and AMD. In addition, PGI 2012 includes a number of performance and feature enhancements for multi-core x64 processor-based HPC systems.
This Dr. Dobb’s Article by Rob Farber provides a tutorial on creating application plugins to accelerate Windows and Linux application performance using CUDA in dynamically loaded libraries.
Adding GPU capabilities to existing Windows and Linux apps can be done simply using plugins and the built-in support found in CUDA. This easy form of dynamic loading enables CUDA to be used selectively to hugely accelerate individual tasks within a larger application.
CUDA is maturing into a natural extension of the emerging CPU/GPU paradigm of high-speed computing, which makes it, and GPU computing in general, a candidate for all application development. A recent article in this tutorial series, Running CUDA Code Natively on x86 Processors, noted recent developments that allow CUDA programs to transparently compile and run on x86 processors. This article focuses on incorporating CUDA into Windows and Linux workflows by exploiting the capabilities of the NVIDIA compiler driver, nvcc, to create native runtime-loadable plugins. Source code is provided to create and utilize CUDA plugins and even to dynamically compile and link a CUDA source file into a running application (as OpenCL does).
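The host side of such a plugin can be sketched roughly as follows. This is a minimal illustration and not the article's source code: it assumes a POSIX system (the `dlopen` family; Windows would use `LoadLibrary` and `GetProcAddress` instead), and the `Plugin` class, the `AccelerateFn` signature and the exported symbol name are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <dlfcn.h>    // POSIX dynamic loading: dlopen, dlsym, dlclose, dlerror
#include <stdexcept>
#include <string>

// Hypothetical entry-point signature that the plugin is assumed to export
// with C linkage (extern "C"), so that its symbol name is not mangled.
using AccelerateFn = int (*)(float* data, int count);

// Small RAII wrapper: loads a shared library, resolves one entry point,
// and unloads the library when the wrapper goes out of scope.
class Plugin {
public:
    Plugin(const std::string& path, const std::string& symbol) {
        handle_ = dlopen(path.c_str(), RTLD_NOW);
        if (handle_ == nullptr) {
            throw std::runtime_error(last_error("dlopen failed"));
        }
        void* address = dlsym(handle_, symbol.c_str());
        if (address == nullptr) {
            const std::string message = last_error("dlsym failed");
            dlclose(handle_);
            throw std::runtime_error(message);
        }
        entry_ = reinterpret_cast<AccelerateFn>(address);
    }

    ~Plugin() { dlclose(handle_); }

    Plugin(const Plugin&) = delete;
    Plugin& operator=(const Plugin&) = delete;

    // Forwards the call to the code inside the loaded library.
    int run(float* data, int count) const {
        assert(entry_ != nullptr);
        return entry_(data, count);
    }

private:
    // Returns the loader's error text, or a fallback if none is available.
    static std::string last_error(const char* fallback) {
        const char* detail = dlerror();
        return detail != nullptr ? detail : fallback;
    }

    void* handle_ = nullptr;
    AccelerateFn entry_ = nullptr;
};
```

Under these assumptions, a CUDA source file that defines an `extern "C"` function matching `AccelerateFn` could be built into a shared library with something like `nvcc -shared -Xcompiler -fPIC -o libplugin.so plugin.cu`, and the application would then construct `Plugin("./libplugin.so", "accelerate")` only when GPU acceleration is wanted. On older glibc versions the host program may need to link with `-ldl`.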
Developed in partnership with AMD, this four-day course is designed for GPU Programmers who are looking to develop comprehensive skills in writing and optimizing applications that fully leverage the multi-core processing capabilities of the GPU.
Delivered by Acceleware’s Developers, who provide real-world experience and examples, the training comprises classroom lectures and hands-on tutorials. Each student will be supplied with a laptop equipped with an AMD Fusion APU for the duration of the course. Small class sizes maximize learning and ensure a personal educational experience.
SpeedIT 2.0 and the SpeedIT plugin to OpenFOAM have been released. New features include:
- One of the fastest Sparse Matrix-Vector Multiplication implementations worldwide.
- Faster Conjugate Gradient and BiConjugate Gradient solvers.
- State-of-the-art CMRS format for storing sparse matrices. The format requires less memory than CRS or HYB (from CUSPARSE and CUSP).
- Faster acceleration in OpenFOAM (Computational Fluid Dynamics).
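For readers unfamiliar with the operations named in this list, the following is a minimal CPU reference sketch of sparse matrix-vector multiplication in the standard CRS (compressed row storage) layout, plus an unpreconditioned conjugate gradient solver built on it. It is not SpeedIT code and it does not implement the CMRS format, whose details are not given here; the names `CrsMatrix`, `spmv` and `conjugate_gradient` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Standard CRS layout: the nonzeros of row r are stored at positions
// row_ptr[r] .. row_ptr[r + 1] - 1 of col_idx and values.
struct CrsMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> row_ptr;  // rows + 1 entries
    std::vector<std::size_t> col_idx;  // column of each stored nonzero
    std::vector<double> values;        // value of each stored nonzero
};

// y = A * x, visiting only the stored nonzeros.
std::vector<double> spmv(const CrsMatrix& a, const std::vector<double>& x) {
    assert(a.row_ptr.size() == a.rows + 1);
    assert(x.size() == a.cols);
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t r = 0; r < a.rows; ++r) {
        double sum = 0.0;
        for (std::size_t k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k) {
            sum += a.values[k] * x[a.col_idx[k]];
        }
        y[r] = sum;
    }
    return y;
}

// Dot product of two equally sized vectors.
static double dot(const std::vector<double>& u, const std::vector<double>& v) {
    double sum = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i) sum += u[i] * v[i];
    return sum;
}

// Unpreconditioned conjugate gradient for A x = b, starting from x = 0.
// Assumes A is symmetric positive definite; stops when ||r|| <= tol
// or after max_iter iterations.
std::vector<double> conjugate_gradient(const CrsMatrix& a,
                                       const std::vector<double>& b,
                                       double tol = 1e-10,
                                       std::size_t max_iter = 1000) {
    assert(a.rows == a.cols && b.size() == a.rows);
    std::vector<double> x(b.size(), 0.0);
    std::vector<double> r = b;  // residual b - A*x with x = 0
    std::vector<double> p = r;  // initial search direction
    double rr = dot(r, r);
    for (std::size_t it = 0; it < max_iter && std::sqrt(rr) > tol; ++it) {
        const std::vector<double> ap = spmv(a, p);
        const double alpha = rr / dot(p, ap);
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * ap[i];
        }
        const double rr_new = dot(r, r);
        const double beta = rr_new / rr;
        for (std::size_t i = 0; i < p.size(); ++i) p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }
    return x;
}
```

Each solver iteration performs one `spmv` call, so the multiplication routine and the storage format it reads largely determine solver throughput. That is the usual motivation for tuning both on the GPU.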
More information is available at http://speed-it.vratis.com.
Offered in partnership with NVIDIA and Microsoft, this four-day CUDA training course is designed for GPU Programmers in the oil-and-gas industry who are looking to develop comprehensive skills in writing and optimizing applications that fully leverage the many-core processing capabilities of the GPU.
Offered in partnership with NVIDIA and Microsoft, this four-day CUDA training course is designed for GPU Programmers who are looking to develop comprehensive skills in writing and optimizing applications that fully leverage the many-core processing capabilities of the GPU.
Chai is a new managed platform for GPGPU. It is a free, open-source, clean-room workalike of the PeakStream platform. While not production-ready, the just-released alpha version is able to compile and run non-trivial PeakStream demo code (e.g. conjugate gradient) on AMD and NVIDIA GPUs.
Chai combines an application virtual machine, garbage collection, an auto-tuning JIT compiler, and a high-level array programming language implemented as an embedded domain-specific language in C++. The JIT back-end uses expectation-maximization to auto-tune and generate vectorized OpenCL. The JIT includes auto-tuned model families for GEMM and GEMV. Although originally developed for AMD GPUs, these parameterized kernel families also generalize to NVIDIA GPUs.
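To illustrate the general idea of an array language embedded in C++, here is a minimal sketch in which overloaded operators record an expression graph instead of computing immediately, and evaluation happens only on request. It does not use Chai's or PeakStream's actual API: the `Array` and `Node` types are hypothetical, and a simple CPU interpreter stands in for the JIT compiler and OpenCL back-end described above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// One node of the deferred expression graph: either a leaf holding data
// or an elementwise binary operation on two child expressions.
struct Node {
    enum class Op { Leaf, Add, Mul };
    Op op;
    std::vector<double> data;   // used only by leaves
    std::shared_ptr<Node> lhs;  // used only by Add and Mul
    std::shared_ptr<Node> rhs;
};

// User-facing array handle. Arithmetic builds graph nodes; nothing is
// computed until eval() is called, which is the point where a real
// system could hand the whole graph to a compiler.
class Array {
public:
    explicit Array(std::vector<double> values)
        : node_(std::make_shared<Node>(
              Node{Node::Op::Leaf, std::move(values), nullptr, nullptr})) {}

    friend Array operator+(const Array& a, const Array& b) {
        return Array(Node::Op::Add, a, b);
    }
    friend Array operator*(const Array& a, const Array& b) {
        return Array(Node::Op::Mul, a, b);
    }

    // Walks the recorded graph and computes the result on the CPU.
    std::vector<double> eval() const { return eval_node(*node_); }

private:
    Array(Node::Op op, const Array& a, const Array& b)
        : node_(std::make_shared<Node>(Node{op, {}, a.node_, b.node_})) {}

    static std::vector<double> eval_node(const Node& n) {
        if (n.op == Node::Op::Leaf) return n.data;
        std::vector<double> left = eval_node(*n.lhs);
        const std::vector<double> right = eval_node(*n.rhs);
        assert(left.size() == right.size());
        for (std::size_t i = 0; i < left.size(); ++i) {
            left[i] = (n.op == Node::Op::Add) ? left[i] + right[i]
                                              : left[i] * right[i];
        }
        return left;
    }

    std::shared_ptr<Node> node_;
};
```

With this sketch, an expression such as `(a + b * c).eval()` first builds a three-node graph and only then computes it. Deferring evaluation in this way gives a back-end the chance to see the whole expression before generating code, which is the design choice that makes auto-tuned kernel generation possible.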
OpenCL Studio integrates OpenCL and OpenGL into a single development environment for high-performance computing. The feature-rich editor, interactive scripting language and extensible plug-in architecture support the rapid development of complex parallel algorithms and accompanying visualizations. Version 2.0 now conforms to the Lua plug-in architecture and closely integrates the open-source libCL parallel algorithm library. A complete version of OpenCL Studio is freely available for download at www.opencldev.com, including instructional videos and technology showcases.
VMD is a popular molecular visualization and analysis program used by thousands of researchers worldwide. VMD accelerates many of the most computationally demanding visualization and analysis features using GPU computing techniques, resulting in improved performance and new capabilities beyond what is possible using only conventional multi-core CPUs. VMD 1.9.1 advances these capabilities further with a CUDA implementation of the new QuickSurf molecular surface representation, enabling smooth interactive animation of moderate-sized biomolecular complexes consisting of a few hundred thousand to one million atoms, and allowing interactive display of molecular surfaces for static structures of very large complexes containing tens of millions of atoms, e.g. large virus capsids.
More information: http://www.ks.uiuc.edu/Research/vmd/vmd-1.9.1/