ATI Stream Profiler v1.3 Released

May 20th, 2010

Advanced Micro Devices (AMD) recently released ATI Stream Profiler version 1.3. ATI Stream Profiler is a runtime profiler, integrated into Microsoft® Visual Studio®, that gathers performance data from the GPU as your OpenCL™ application runs. Developers can use this information to find the bottlenecks in their OpenCL™ application and optimize its performance.

Features of the tool include:

  • Measure the execution time of an OpenCL kernel
  • Query the hardware performance counters on ATI Radeon graphics cards
  • Display the memory traffic to and from the GPU
  • Compare multiple runs (sessions) of the same or different programs
  • Store the profile data for each run in a CSV file
  • Display the IL and ISA (hardware disassembly) code of the OpenCL kernel
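Because each run is stored as a CSV file, sessions can also be compared outside the tool. The following is a minimal sketch of that idea, not the profiler's own comparison feature. The column names `Method` and `Time` and the sample rows are assumptions made up for illustration; the actual layout of the profiler's CSV output is not described here.

```python
import csv
import io

def kernel_times(csv_text, name_col="Method", time_col="Time"):
    """Sum the time per kernel name from one session's CSV text.

    The column names are hypothetical placeholders, not the profiler's schema.
    """
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row[name_col]
        totals[name] = totals.get(name, 0.0) + float(row[time_col])
    return totals

def compare_sessions(before, after):
    """Return {kernel: after / before} for kernels present in both sessions."""
    return {k: after[k] / before[k]
            for k in before
            if k in after and before[k] > 0}

# Made-up sample data standing in for two exported sessions.
session_a = "Method,Time\nvector_add,2.0\nreduce,4.0\nvector_add,2.0\n"
session_b = "Method,Time\nvector_add,1.0\nreduce,4.0\nvector_add,1.0\n"

ratios = compare_sessions(kernel_times(session_a), kernel_times(session_b))
for kernel, ratio in sorted(ratios.items()):
    print(f"{kernel}: {ratio:.2f}x of the first session's time")
```

A ratio below 1.0 means the kernel took less total time in the second session.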

Scalable HeterOgeneous Computing (SHOC) Benchmark Suite

May 4th, 2010

The Scalable Heterogeneous Computing Benchmark Suite (SHOC) is a collection of benchmark programs testing the performance and stability of systems using computing devices with non-traditional architectures for general-purpose computing, and the software used to program them. Its initial focus is on systems containing Graphics Processing Units (GPUs) and multi-core processors, and on the OpenCL programming standard. It can be used on clusters as well as individual hosts.

(Danalis, A., Marin, G., McCurdy, C., Meredith, J., Roth, P., Spafford, K., Tipparaju, V., Vetter, J. (2010). The Scalable HeterOgeneous Computing (SHOC) Benchmark Suite. Proceedings of the Third Workshop on General-Purpose Computation on Graphics Processors (GPGPU 2010). Mar 2010.)

CLyther 0.1 Beta Released

April 25th, 2010

GeoSpin has released the first version of CLyther for beta testing. Please visit the CLyther SourceForge website for more information. CLyther enables developers to write GPGPU code entirely in Python with no additional syntax. CLyther’s core driver contains a Python compiler that converts Python functions and types to OpenCL at runtime.

CLyther currently only supports a subset of the Python language definition but adds many new features to OpenCL such as:

  • OpenCL interface similar to PyOpenCL
  • Dynamic compilation of OpenCL code at runtime
  • Fast prototyping of OpenCL code
  • Create OpenCL code using the Python language definition
  • Passing functions as arguments to OpenCL kernels
  • Pure Python emulation mode of kernel functions
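The idea behind a pure Python emulation mode can be sketched without any OpenCL at all: the kernel body is an ordinary Python function that is called once per work-item index on the host. This is an illustration of the concept only and does not use CLyther's API; `emulate_kernel` and `saxpy` are hypothetical names introduced for this example.

```python
def emulate_kernel(kernel, global_size, *args):
    """Run `kernel` once per work-item index, sequentially, on the host.

    Hypothetical helper for illustration; not part of CLyther.
    """
    for gid in range(global_size):
        kernel(gid, *args)

def saxpy(gid, alpha, x, y, out):
    # Body of one work-item: out[gid] = alpha * x[gid] + y[gid]
    out[gid] = alpha * x[gid] + y[gid]

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)

# Each index plays the role of one OpenCL global work-item ID.
emulate_kernel(saxpy, len(x), 2.0, x, y, out)
print(out)
```

Running the kernel body sequentially like this makes it possible to step through it with a normal Python debugger before compiling it for a device.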

OpenCL Studio 1.0 beta released

April 5th, 2010

Geist Software Labs has released the first version of OpenCL Studio for beta testing. OpenCL Studio combines OpenCL and OpenGL into a single integrated development environment that allows you to visualize OpenCL computation using powerful 3D rendering techniques. The editor hides much of the complexity of the underlying APIs while still providing flexibility via the Lua scripting language. Integrated source code editors and debugging capabilities for OpenCL, GLSL, and Lua, as well as a toolbox of 2D user interface widgets, provide a framework for a wide range of parallel programming solutions.

Accelerating MATLAB Image Processing Toolbox Functions on GPUs

March 23rd, 2010

Abstract:

We present our effort in developing an open-source GPU (graphics processing unit) code library for the MATLAB Image Processing Toolbox (IPT). We ported a dozen representative functions from IPT and, based on their inherent characteristics, grouped these functions into four categories: data independent, data sharing, algorithm dependent and data dependent. For each category, we present a detailed case study, which reveals interesting insights on how to efficiently optimize the code for GPUs and highlights performance-critical hardware features, some of which have not been well explored in existing literature. Our results show drastic speedups for the functions in the data-independent or data-sharing category by leveraging hardware support judiciously; and moderate speedups for those in the algorithm-dependent category by careful algorithm selection and parallelization. For the functions in the last category, fine-grain synchronization and data-dependency requirements are the main obstacles to an efficient implementation on GPUs.

(J. Kong et al., “Accelerating MATLAB Image Processing Toolbox Functions on GPUs”, Proceedings of the Third Workshop on General-Purpose Computation on Graphics Processing Units (GPGPU-3), Pittsburgh, PA. Apr. 2010. Source code is available here.)
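The difference between the first two categories in the abstract can be sketched in a few lines. The functions below are hypothetical stand-ins chosen for illustration (the abstract does not name the ported functions), and they run on the CPU in plain Python rather than on a GPU. In the data-independent case each output pixel reads exactly one input pixel; in the data-sharing case neighbouring outputs read overlapping inputs, which is what makes on-chip data reuse worthwhile on a GPU.

```python
def invert(image, max_value=255):
    """Data independent: each output pixel depends on one input pixel."""
    return [[max_value - p for p in row] for row in image]

def box_blur3(image):
    """Data sharing: each output pixel averages its 3x3 neighbourhood,
    so adjacent outputs reuse many of the same input pixels."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total, count = 0, 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    # Skip neighbours that fall outside the image.
                    if 0 <= rr < h and 0 <= cc < w:
                        total += image[rr][cc]
                        count += 1
            out[r][c] = total / count
    return out

image = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]

inverted = invert(image)
blurred = box_blur3(image)
print(inverted[1][1], blurred[1][1], blurred[0][0])
```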

CUDA 3.0 toolkit released

March 20th, 2010

NVIDIA has released version 3.0 of the CUDA Toolkit, providing developers with tools to prepare for the upcoming Fermi-based GPUs. Highlights of this release include:

  • Support for the new Fermi architecture, with:
    • Native 64-bit GPU support
    • Multiple Copy Engine support
    • ECC reporting
    • Concurrent Kernel Execution
    • Fermi HW debugging support in cuda-gdb
    • Fermi HW profiling support for CUDA C and OpenCL in Visual Profiler
  • C++ Class Inheritance and Template Inheritance support for increased programmer productivity
  • A new unified interoperability API for Direct3D and OpenGL, with support for:
    • OpenGL texture interop
    • Direct3D 11 interop support
    • CUDA Driver / Runtime Buffer Interoperability, which allows applications using the CUDA Driver API to also use libraries implemented using the CUDA C Runtime such as CUFFT and CUBLAS.

CLyther = Python + OpenCL

March 9th, 2010

CLyther is a Python tool for OpenCL, currently under development, similar to what Cython is for C. It is a Python language extension intended to make writing OpenCL code as easy as writing Python itself. CLyther currently supports only a subset of the Python language definition but adds many new features for OpenCL.

CLyther exposes both the OpenCL C library and language to Python. Its features include:

  • Fast prototyping of OpenCL code.
  • OpenCL kernel function creation using the Python language definition.
  • Strong object-oriented programming support in OpenCL code.
  • Passing functions as arguments to kernel functions.
  • Python emulation mode for OpenCL code.
  • Fancy indexing of arrays.
  • Dynamic compilation at runtime.

Swan: A simple tool for porting CUDA to OpenCL

March 9th, 2010

Swan is a small tool that aids the reversible conversion of existing CUDA codebases to OpenCL. Its main features are the translation of CUDA kernel source-code to OpenCL, and a common API that abstracts both CUDA and OpenCL runtimes. Swan preserves the convenience of the CUDA <<< grid, block >>> kernel launch syntax by generating C source-code for kernel entry-point functions. Possible uses include:

  • Evaluating OpenCL performance of an existing CUDA code
  • Maintaining a dual-target OpenCL and CUDA code
  • Reducing dependence on NVCC when compiling host code
  • Supporting multiple CUDA compute capabilities in a single binary

Swan is developed by the MultiscaleLab, Barcelona, and is available under the GPL2 license.

gDEBugger for OpenCL – Beta Program

February 10th, 2010

Graphic Remedy is proud to announce the upcoming release of gDEBugger for OpenCL on Windows, Mac OS X and Linux. This new product will bring gDEBugger’s advanced Debugging, Profiling and Memory Analysis abilities to the OpenCL developer’s world, helping OpenCL developers find bugs and optimize parallel computing application performance and memory consumption.

To join the free beta program, see screenshots, or find more details, please visit http://www.gremedy.com/gDEBuggerCL.php.

gDEBugger CL enables OpenCL developers to:

Programming Massively Parallel Processors: A Hands-on Approach

February 9th, 2010

The first textbook of its kind, Programming Massively Parallel Processors: A Hands-on Approach, launches today. It is authored by Dr. David B. Kirk, NVIDIA Fellow and former chief scientist, and Dr. Wen-mei Hwu, who serves at the University of Illinois at Urbana-Champaign as Chair of Electrical and Computer Engineering in the Coordinated Science Laboratory, co-director of the Universal Parallel Computing Research Center and principal investigator of the CUDA Center of Excellence. The 256-page textbook is the first aimed at teaching advanced students and professionals the basic concepts of parallel programming and GPU architectures. Published by Morgan Kaufmann, it explores various techniques for constructing parallel programs and reviews numerous case studies.

With conventional CPU-based computing no longer scaling in performance and the world’s computational challenges increasing in complexity, the need for massively parallel processing has never been greater. GPUs have hundreds of cores capable of delivering transformative performance increases across a wide range of computational challenges. The rise of these multi-core architectures has raised the need to teach advanced programmers a new and essential skill: how to program massively parallel processors.

Among the book’s key features:

  • First and only text that teaches how to program within a massively parallel environment
  • Portions of the NVIDIA-provided content have been part of the curriculum at 300 universities worldwide
  • Drafts of sections of the book have been tested and taught by Kirk at the University of Illinois
  • The book uses OpenCL and CUDA C, the NVIDIA parallel computing language developed specifically for massively parallel environments

Programming Massively Parallel Processors: A Hands-on Approach is available to purchase from Amazon or directly from Elsevier.
