March 9th, 2010
March 1st, 2010
Swan is a small tool that aids the reversible conversion of existing CUDA codebases to OpenCL. Its main features are the translation of CUDA kernel source code to OpenCL, and a common API that abstracts both the CUDA and OpenCL runtimes. Swan preserves the convenience of the CUDA <<< grid, block >>> kernel launch syntax by generating C source code for kernel entry-point functions. Possible uses include:
- Evaluating OpenCL performance of an existing CUDA code
- Maintaining a dual-target OpenCL and CUDA code
- Reducing dependence on NVCC when compiling host code
- Supporting multiple CUDA compute capabilities in a single binary
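As a rough illustration of the entry-point approach (the kernel and the generated function name below are invented for this sketch, not Swan's actual output), consider a trivial CUDA launch:

```cuda
// Developer-written CUDA: a trivial kernel and a <<< >>> launch.
__global__ void scale(float *x, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] *= a;
}

void launch_scale(float *d_x, int n)
{
    dim3 grid((n + 255) / 256), block(256);
    scale<<<grid, block>>>(d_x, 2.0f, n);
}

// A translator in Swan's style can emit a plain C entry point so the
// host code stays runtime-agnostic; the call site then becomes an
// ordinary C function call, e.g.:
//
//     k_scale(grid, block, d_x, 2.0f, n);
//
// dispatched at build time to either the CUDA or the OpenCL runtime.
```

Because the launch is reduced to a plain C call, the host code no longer needs NVCC to compile, which is what makes the dual-target and reduced-NVCC-dependence uses above possible.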
Swan is developed by the MultiscaleLab, Barcelona, and is available under the GPL2 license.
February 21st, 2010
GMAC (Global Memory for ACcelerators) is a user-level library that implements an Asymmetric Distributed Shared Memory (ADSM) model for use by CUDA programs. An ADSM model allows CPU code to access data hosted in accelerator (GPU) memory. In this model, a single pointer is used for data structures accessed from both the CPU and the GPU, and the coherency of the data is handled transparently by the library. Moreover, data allocated with GMAC can be accessed by all host threads of the program. That makes your code simpler and cleaner. GMAC currently supports CUDA programs; OpenCL support is planned.
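To make the contrast concrete, here is a hedged sketch: the first half is ordinary CUDA with separate host and device pointers and explicit copies, the second half shows the single-pointer style the ADSM model enables. The allocation call `gmacMalloc` and the helpers `fill`, `use` and `kernel` are assumptions of this sketch; consult the GMAC documentation for the actual API.

```cuda
// Conventional CUDA: two pointers and explicit copies.
float *h_data = (float *)malloc(n * sizeof(float));
float *d_data;
cudaMalloc((void **)&d_data, n * sizeof(float));
fill(h_data, n);                                   // CPU writes host copy
cudaMemcpy(d_data, h_data, n * sizeof(float), cudaMemcpyHostToDevice);
kernel<<<grid, block>>>(d_data, n);                // GPU uses device copy
cudaMemcpy(h_data, d_data, n * sizeof(float), cudaMemcpyDeviceToHost);

// ADSM-style single pointer: one allocation, no explicit copies;
// the library keeps the CPU and GPU views of the data coherent.
float *data;
gmacMalloc((void **)&data, n * sizeof(float));
fill(data, n);                                     // CPU writes directly
kernel<<<grid, block>>>(data, n);                  // GPU sees the same pointer
use(data, n);                                      // CPU reads the results
```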
A paper describing the Asymmetric Distributed Shared Memory model and its implementation in GMAC has been accepted at the ASPLOS XV conference. GMAC is being developed by the Operating System Group at the Universitat Politecnica de Catalunya and the IMPACT Research Group at the University of Illinois. Binary pre-compiled packages, the source code, documentation and examples are available at the project website.
(Isaac Gelado, Javier Cabezas, John Stone, Sanjay Patel, Nacho Navarro and Wen-mei Hwu, “An Asymmetric Distributed Shared Memory Model for Heterogeneous Parallel Systems”, accepted in: Fifteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2010), March 2010.)
February 21st, 2010
AccelerEyes has recently launched a number of resources to assist the GPU computing community in general and MATLAB users in particular:
- In collaboration with Dr. Torben Larsen at Aalborg University in Denmark, AccelerEyes has launched Torben’s Corner, which offers a wide variety of tips and tricks for application development and performance benchmarking on GPUs.
- The entire team at AccelerEyes is contributing to a weekly blog on GPU computing with MATLAB. Some recent posts include:
- Using Parallel For Loops (parfor) with MATLAB and Jacket
- Lazy Execution in MATLAB GPU computing
Join the AccelerEyes GPU computing blog for weekly insights into maximizing productivity with GPUs.
February 14th, 2010
Graphic Remedy is proud to announce the release of gDEBugger Version 5.5 for Windows, Linux, Mac OS X and iPhone.
This version introduces powerful AMD GPU performance counter integration, displaying AMD graphics hardware and driver performance counters in gDEBugger’s Performance Graph and Performance Dashboard views and allowing developers to optimize their applications on AMD (ATI) graphics hardware.
AMD performance counters are available on Windows when using an ATI Radeon (TM) HD 2000 series card or newer with Catalyst (TM) 9.12 or newer.
This version also includes a large number of bug fixes and stability improvements.
February 11th, 2010
OpenNL (Open Numerical Library) is a library for solving sparse linear systems, designed especially for the Computer Graphics community. The goal of OpenNL is to be as small as possible while offering the subset of functionality required by this application field. The Makefiles of OpenNL can generate a single .c and .h file, which makes the library very easy to integrate into other projects. The distribution includes an implementation of a Least Squares Conformal Maps parameterization method. The new version 3.0 of OpenNL includes support for CUDA (with Concurrent Number Cruncher and CUSP ELL formats).
February 10th, 2010
The developers of the CUDPP (CUDA Data-Parallel Primitives) Library request that users (past and current) of the CUDPP Library fill out the CUDPP Survey. This survey will help the CUDPP Team prioritize new development and support for existing and new features.
February 9th, 2010
Graphic Remedy is proud to announce the upcoming release of gDEBugger for OpenCL on Windows, Mac OS X and Linux. This new product will bring gDEBugger’s advanced Debugging, Profiling and Memory Analysis abilities to the OpenCL developer’s world, helping OpenCL developers find bugs and optimize parallel computing application performance and memory consumption.
To join the Free Beta Program, see screenshots and more details, please visit http://www.gremedy.com/gDEBuggerCL.php.
gDEBugger CL enables OpenCL developers to debug, profile and analyze the memory consumption of their OpenCL-based applications.
February 2nd, 2010
The first textbook of its kind, Programming Massively Parallel Processors: A Hands-on Approach launches today, authored by Dr. David B. Kirk, NVIDIA Fellow and former chief scientist, and Dr. Wen-mei Hwu, who serves at the University of Illinois at Urbana-Champaign as Chair of Electrical and Computer Engineering in the Coordinated Science Laboratory, co-director of the Universal Parallel Computing Research Center and principal investigator of the CUDA Center of Excellence. The textbook, which is 256 pages, is the first aimed at teaching advanced students and professionals the basic concepts of parallel programming and GPU architectures. Published by Morgan Kaufmann, it explores various techniques for constructing parallel programs and reviews numerous case studies.
With conventional CPU-based computing no longer scaling in performance and the world’s computational challenges increasing in complexity, the need for massively parallel processing has never been greater. GPUs have hundreds of cores capable of delivering transformative performance increases across a wide range of computational challenges. The rise of these multi-core architectures has raised the need to teach advanced programmers a new and essential skill: how to program massively parallel processors.
Among the book’s key features:
- First and only text that teaches how to program within a massively parallel environment
- Portions of the NVIDIA-provided content have been part of the curriculum at 300 universities worldwide
- Drafts of sections of the book have been tested and taught by Kirk at the University of Illinois
- Utilizes OpenCL and CUDA C, the parallel computing language developed by NVIDIA specifically for massively parallel environments
Programming Massively Parallel Processors: A Hands-on Approach is available to purchase from Amazon or directly from Elsevier.
January 26th, 2010
We present HONEI, an open-source collection of libraries offering a hardware-oriented approach to numerical calculations. HONEI abstracts the hardware, and applications written on top of HONEI can be executed on a wide range of computer architectures such as CPUs, GPUs and the Cell processor. We demonstrate the flexibility and performance of our approach with two test applications, a Finite Element multigrid solver for the Poisson problem and a robust and fast simulation of shallow water waves. By linking against HONEI’s libraries, we achieve a two-fold speedup over straightforward C++ code using HONEI’s SSE backend, and an additional 3–4 times and 4–16 times faster execution on the Cell and on a GPU, respectively. A second important aspect of our approach is that the full performance capabilities of the hardware under consideration can be exploited by adding optimised application-specific operations to the HONEI libraries. HONEI provides all necessary infrastructure for development and evaluation of such kernels, significantly simplifying their development.
(Danny van Dyk, Markus Geveler, Sven Mallach, Dirk Ribbrock, Dominik Göddeke and Carsten Gutwenger: HONEI: A collection of libraries for numerical computations targeting multiple processor architectures. Computer Physics Communications 180(12), pp. 2534-2543, December 2009. DOI 10.1016/j.cpc.2009.04.018)
From the release notes:
ATI Stream SDK 2.0 is the first production SDK for both AMD GPUs and x86 CPUs. This release supports a wide range of ATI graphics processors, including the new ATI Radeon HD 5970, and provides support for the OpenCL ICD (Installable Client Driver), atomic functions for 32-bit integers, a Microsoft Visual Studio 2008-integrated ATI Stream Profiler performance analysis tool, and other robust features. Preview support is included for upcoming features such as OpenCL and Microsoft DirectX 10 interoperability, and basic double-precision floating-point arithmetic in OpenCL C kernels.