Acceleware parallel programming courses

January 25th, 2013

Acceleware has recently announced four courses on parallel programming:

More information is available on the courses’ webpages.

AMD CodeXL: comprehensive developer tool suite for heterogeneous compute

October 9th, 2012

AMD CodeXL is a new unified developer tool suite that enables developers to harness the benefits of CPUs, GPUs and APUs. It includes powerful GPU debugging, comprehensive GPU and CPU profiling, and static OpenCL™ kernel analysis capabilities, enhancing accessibility for software developers to enter the era of heterogeneous computing. AMD CodeXL is available for free, both as a Visual Studio® extension and a standalone user interface application for Windows® and Linux®.

AMD CodeXL increases developer productivity by helping developers identify programming errors and performance issues in their applications quickly and easily. Developers can now debug, profile and analyze their applications with a full system-wide view on AMD APUs, GPUs and CPUs.

The AMD CodeXL user group (requires registration) allows users to interact with the CodeXL team, provide feedback, get support and participate in beta surveys.

Implementing a code generator for fast matrix multiplication in OpenCL on the GPU

July 11th, 2012


This paper presents results of an implementation of a code generator for fast general matrix multiply (GEMM) kernels. Given a set of parameters, the code generator produces the corresponding GEMM kernel written in OpenCL. The produced kernels are optimized for high-performance implementation on GPUs from AMD. Access latency to GPU global memory is the main obstacle to high performance. This study shows that storing matrix data in a block-major layout increases the performance and stability of GEMM kernels. On the Tahiti GPU (Radeon HD 7970), our DGEMM (double-precision GEMM) and SGEMM (single-precision GEMM) kernels achieve up to 848 GFlop/s (90% of the peak) and 2646 GFlop/s (70%), respectively.

(K. Matsumoto, N. Nakasato, S. G. Sedukhin: “Implementing a code generator for fast matrix multiplication in OpenCL on the GPU”, accepted for Special Session: Auto-Tuning for Multicore and GPU (ATMG), IEEE 6th International Symposium on Embedded Multicore SoCs (MCSoC-12), Sep. 2012. [PDF])
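To give a rough idea of what a block-major layout means, here is a minimal Python sketch. It assumes one common definition (blocks stored one after another, elements stored row by row inside each block); the abstract above does not spell out the exact layout used in the paper, and the function names `to_block_major` and `block_major_index` are made up for this illustration.

```python
def to_block_major(matrix, block_rows, block_cols):
    """Flatten a 2-D list into block-major order.

    Assumption for this sketch: blocks are visited left to right, top to
    bottom, and elements inside each block are stored row by row. The
    matrix dimensions must be multiples of the block size.
    """
    rows, cols = len(matrix), len(matrix[0])
    if rows % block_rows or cols % block_cols:
        raise ValueError("matrix dimensions must be multiples of the block size")

    flat = []
    for block_i in range(0, rows, block_rows):        # top edge of each block
        for block_j in range(0, cols, block_cols):    # left edge of each block
            for i in range(block_i, block_i + block_rows):
                # Copy one row of the current block.
                flat.extend(matrix[i][block_j:block_j + block_cols])
    return flat


def block_major_index(i, j, cols, block_rows, block_cols):
    """Return the offset of element (i, j) in the block-major array."""
    blocks_per_row = cols // block_cols
    block_id = (i // block_rows) * blocks_per_row + (j // block_cols)
    within_block = (i % block_rows) * block_cols + (j % block_cols)
    return block_id * block_rows * block_cols + within_block


# A 4x4 matrix holding 0..15 in row-major order, split into 2x2 blocks.
example = [[r * 4 + c for c in range(4)] for r in range(4)]
layout = to_block_major(example, 2, 2)
print(layout)  # [0, 1, 4, 5, 2, 3, 6, 7, 8, 9, 12, 13, 10, 11, 14, 15]
```

With this layout, all elements of one block sit next to each other in memory, so a kernel working on a block reads one contiguous range instead of several strided rows.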

Acceleware OpenCL and CUDA Training

June 27th, 2012

Acceleware has announced two training courses:

Developed in partnership with AMD, this four-day course (August 21-24, 2012) is designed for GPU programmers who are looking to develop comprehensive skills in writing and optimizing applications that fully leverage the multi-core processing capabilities of the GPU. Register before July 31 and receive $200 off your course fee! Enter promotional code AXTEB2012.

Developed in partnership with NVIDIA, this four-day course (July 17-20, 2012) is designed for programmers who are looking to develop comprehensive skills in writing and optimizing applications that fully leverage the multi-core processing capabilities of the GPU.

Beyond3D C++ AMP contest

June 25th, 2012

Beyond3D’s first C++ AMP focused contest accepts submissions until August 31, 2012. The contest’s goal is to use parallel programming to speed up solving the Traveling Salesman Problem. All relevant details are provided on the contest’s dedicated page.

OpenCL Programming Webinar Series

March 30th, 2012

AMD offers an OpenCL Programming Webinar Series to help software developers become experts in the latest technologies, standards and best practices. The series of three OpenCL webinars will be presented by Rob Farber.

1. April 10th, 10AM PDT: Introducing Portable Parallelism

  • C and C++ APIs
  • OpenCL Memory Spaces
  • The OpenCL Execution Model

2. April 24th, 10AM PDT: Coordinating OpenCL Computations on One or More Heterogeneous Devices

  • How to Concisely Utilize Multiple Command Queues and Coordinate Tasks Across Multiple Heterogeneous Devices such as Two GPUs + CPU
  • Code Sample Discussion: Massively Parallel Random Number Test Framework

3. May 1st, 10AM PDT: Accelerate Rendering by an Order of Magnitude with OpenCL, Plus a View to the Multi-core and Web-enabled Future

  • How to use OpenCL to Provide High-Quality, Fast Rendering in Combination with Primitive Restart
  • Device Fission, Partitioning Hardware Capabilities for Optimal Resource Usage
  • Looking to the Future – WebCL

Registration is limited. More Information:

Call for Presentations: AMD Fusion12 Developer Summit

January 26th, 2012

AMD Fusion ’12 will be held June 11-14, 2012 in Bellevue, Washington at the Meydenbauer Center and the Hyatt Regency. AMD invites pioneers of next-generation software and the rapidly growing field of heterogeneous computing to share their latest work and research findings in the form of presentations. Presenters will have an opportunity to advocate new methodologies and paradigms, garner support for industry standards, and network with developers, innovators and academics who will help define the course of this technology. Presentation proposals are invited on the following topics:

  • Web Technologies
  • Cloud Computing – Servers and Data Center
  • Gaming and Consumer Graphics
  • Heterogeneous Computing
  • Innovative Client Experiences
  • Multimedia Processing
  • Professional Graphics and Visual Computing
  • Programming Languages and Models
  • Programming Tools
  • Security

Aparapi – Parallel programming with Java and OpenCL

September 15th, 2011

AMD has just open-sourced a project called Aparapi, which started in its JavaLabs team. Aparapi is an API for expressing data parallel workloads in Java and a runtime component capable of converting the Java bytecode of compatible workloads into OpenCL™ so that it can be executed on a variety of GPU devices. More information can be found in this blog entry.

AMD OpenCL Coding Contest

June 26th, 2011

AMD announced a GPGPU coding competition called the AMD OpenCL Coding Competition. The first phase of the competition is an open innovation challenge that requires the use of the AMD APP SDK and OpenCL. The competition is heating up with the highest registration for a TopCoder innovation challenge to date. It’s not too late to sign up and show off your ideas! If you submit your abstract before June 30th, you will get feedback from AMD; otherwise, you will have until the deadline to submit your OpenCL innovation challenge entry.

Phase two of the competition will be an OpenCL algorithm optimization match that will start later in September. Read more about it in this AMD blog.

AMD Fusion Developer Summit

March 29th, 2011

Heterogeneous computing is moving into the mainstream, and a broader range of applications are already on the way. As the provider of world-class CPUs, GPUs, and APUs, AMD offers unique insight into these technologies and how they interoperate. We’ve been working with industry and academia partners to help advance real-world use of these technologies, and to understand the opportunities that lie ahead. It’s time to share what we’ve learned so far.

With tutorials, hands-on labs, and sessions that span a range of topics from HPC to multimedia, you’ll have the opportunity to expand your view of what heterogeneous computing currently offers and where it is going. You’ll hear from industry innovators and academic pioneers who are exploring different ways of approaching problems, and utilizing new paradigms in computing to help identify solutions. You’ll meet AMD experts with deep knowledge of hardware architectures and the software techniques that best leverage those platforms. And you’ll connect with other software professionals who share your passion for the future of technology.

Learn more at
