April 21st, 2012
The Libra Platform is a heterogeneous GPGPU compute API and runtime environment available on Windows, Mac and Linux. The Libra Compute API offers performance portability and direct compute access from standard programming environments (C/C++, Java, C# and Matlab). Math operations execute on current and future compute architectures, including the latest GPUs and x86/x64 CPUs, with broad support for compute devices compatible with the low-level APIs OpenCL, CUDA, OpenGL and standard x86/x64 compute APIs.
Read more in the full announcement.
April 18th, 2012
A two-day CUDA workshop is taking place in Berlin, Germany on May 5 and 6, 2012. Course details, outline and prices are available at http://cuda.eventbrite.com.
April 17th, 2012
The rCUDA Team is proud to announce a new version of the rCUDA framework, which will include many new functionalities as well as boosted performance. This new version, in development for over a year, will incorporate pipelined transfers, full multi-thread and multi-node capabilities, CUDA 4.1 support, global scheduler integration, support for CUDA C extensions, and native InfiniBand support. A closed beta testing program has been started. See the complete text at http://www.rcuda.net/index.php/news/19-new-revolutionary-version-of-rcuda-to-be-launched.html.
April 17th, 2012
Breadth-first search (BFS) is a core primitive for graph traversal and a basis for many higher-level graph analysis algorithms. It is also representative of a class of parallel computations whose memory accesses and work distribution are both irregular and data-dependent. Recent work has demonstrated the plausibility of GPU sparse graph traversal, but has tended to focus on asymptotically inefficient algorithms that perform poorly on graphs with non-trivial diameter.
We present a BFS parallelization focused on fine-grained task management constructed from efficient prefix sum that achieves an asymptotically optimal O(|V|+|E|) work complexity. Our implementation delivers excellent performance on diverse graphs, achieving traversal rates in excess of 3.3 billion and 8.3 billion traversed edges per second using single and quad-GPU configurations, respectively. This level of performance is several times faster than state-of-the-art implementations on both CPU and GPU platforms.
(Duane Merrill, Michael Garland and Andrew Grimshaw: “Scalable GPU graph traversal”, Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’12), pp. 117-128, February 2012. [DOI])
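To make the role of prefix sum in the abstract above concrete, here is a minimal sequential sketch of a level-synchronous BFS over a graph in CSR form. It is not the authors' GPU implementation: the function names (`exclusive_scan`, `bfs_levels`) and the array names are assumptions chosen for illustration, and the loops that a GPU would run in parallel are plain Python loops here. The point is only that an exclusive scan over the frontier's vertex degrees hands each vertex a private, non-overlapping slot range in the output buffer, so every vertex and edge is touched once and total work stays at O(|V|+|E|).

```python
from itertools import accumulate


def exclusive_scan(values):
    """Return (exclusive prefix sums of values, total of values)."""
    sums = [0] + list(accumulate(values))
    return sums[:-1], sums[-1]


def bfs_levels(row_offsets, col_indices, source):
    """Level-synchronous BFS on a CSR graph.

    Returns the BFS depth of every vertex, or -1 if it is unreachable.
    Hypothetical helper for illustration; sequential, not a GPU kernel.
    """
    num_vertices = len(row_offsets) - 1
    depth = [-1] * num_vertices
    depth[source] = 0
    frontier = [source]
    level = 0
    while frontier:
        # Each frontier vertex reports how many edges it will expand.
        degrees = [row_offsets[v + 1] - row_offsets[v] for v in frontier]
        # The scan assigns each vertex a disjoint range in the output buffer.
        offsets, total = exclusive_scan(degrees)
        gathered = [0] * total
        for v, out in zip(frontier, offsets):
            begin, end = row_offsets[v], row_offsets[v + 1]
            gathered[out:out + (end - begin)] = col_indices[begin:end]
        # Keep only neighbors that have not been visited yet.
        level += 1
        next_frontier = []
        for n in gathered:
            if depth[n] == -1:
                depth[n] = level
                next_frontier.append(n)
        frontier = next_frontier
    return depth


# Edges: 0->1, 0->2, 1->3, 2->3, 3->4; vertex 5 is isolated.
example_row_offsets = [0, 2, 3, 4, 5, 5, 5]
example_col_indices = [1, 2, 3, 3, 4]
example_depth = bfs_levels(example_row_offsets, example_col_indices, 0)
```

On a GPU the gather and filter loops would be spread across threads, and the scan is what lets them write without contending for output positions.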
April 10th, 2012
Offered in partnership with NVIDIA, this four-day course (May 8-11, 2012) is designed for programmers who want to develop comprehensive skills in writing and optimizing applications that fully leverage the multi-core processing capabilities of the GPU.
Delivered by Acceleware developers, who provide real-world experience and examples, the training consists of classroom lectures and hands-on tutorials. Each student will be supplied with a laptop equipped with NVIDIA GPUs for the duration of the course. Small class sizes maximize learning and ensure a personal educational experience.
More information: http://www.acceleware.com/may8calgary
April 1st, 2012
UKPEW is the leading UK forum for the presentation of all aspects of performance modeling and analysis of computer and telecommunication systems. Original papers are invited on all relevant topics but papers on or related to the subjects listed below are particularly welcome.
The paper submission deadline has just been extended to April 20, 2012. The conference takes place June 2 and 3, 2012, in Edinburgh, UK. More Information: http://www.ukpew.org
April 1st, 2012
Accelerate your science on the Titan Supercomputer later this year, by harnessing up to 20 petaflops of parallel processing using GPUs. Open to researchers from academia, government labs, and industry, the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program is the major means by which the scientific community gains access to some of the fastest supercomputers.
First, let INCITE know you are interested in GPU acceleration by completing a two-minute survey. Then determine if you want to submit a formal proposal by June 27, 2012.
Need help drafting your proposal? Attend a “how-to” webinar on Tuesday, April 24 to learn tips and tricks for drafting your proposal. For further questions about the call for proposals, please contact the INCITE manager at INCITE@DOEleadershipcomputing.org.
March 30th, 2012
We present a new adaptive format for storing sparse matrices on the GPU. We compare it with several other formats, including CUSPARSE, which is probably today's best choice for processing sparse matrices on the GPU in CUDA. Unlike CUSPARSE, which works with the common CSR format, our new format requires conversion. However, sparse matrix-vector multiplication is significantly faster for many matrices. We demonstrate this on a set of 1600 matrices and show for which types of matrices our format is profitable.
(Heller M., Oberhuber T., “Adaptive Row-Grouped CSR Format For Storing of Sparse Matrices on GPU“, preprint on Arxiv.org 2012, [PDF])
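For readers unfamiliar with the common CSR format mentioned above, the following is a minimal sketch of sparse matrix-vector multiplication over plain CSR. It illustrates the baseline layout only, not the adaptive row-grouped format proposed in the paper; the function name `csr_spmv` and the array names are assumptions made for this example.

```python
def csr_spmv(row_offsets, col_indices, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_offsets[i]..row_offsets[i + 1] delimits the nonzeros of row i
    inside col_indices and values. Hypothetical helper for illustration.
    """
    num_rows = len(row_offsets) - 1
    y = [0.0] * num_rows
    for row in range(num_rows):
        acc = 0.0
        # Walk only the stored nonzeros of this row.
        for k in range(row_offsets[row], row_offsets[row + 1]):
            acc += values[k] * x[col_indices[k]]
        y[row] = acc
    return y


# A = [[4, 0, 1],
#      [0, 0, 0],
#      [2, 3, 0]]
example_row_offsets = [0, 2, 2, 4]
example_col_indices = [0, 2, 0, 1]
example_values = [4.0, 1.0, 2.0, 3.0]
example_y = csr_spmv(example_row_offsets, example_col_indices,
                     example_values, [1.0, 2.0, 3.0])
```

Because rows can hold very different numbers of nonzeros, a one-thread-per-row mapping of this loop onto a GPU tends to be unbalanced, which is the kind of problem alternative storage formats aim to address.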
March 18th, 2012
AMD offers an OpenCL Programming Webinar Series to help software developers become experts in the latest technologies, standards and best practices. The series of three OpenCL webinars will be presented by Rob Farber.
1. April 10th, 10AM PDT: Introducing Portable Parallelism
- C and C++ APIs
- OpenCL Memory Spaces
- The OpenCL Execution Model
2. April 24th, 10AM PDT: Coordinating OpenCL Computations on One or More Heterogeneous Devices
- How to Concisely Utilize Multiple Command Queues and Coordinate Tasks Across Multiple Heterogeneous Devices such as Two GPUs + CPU
- Code Sample Discussion: Massively Parallel Random Number Test Framework
3. May 1st, 10AM PDT: Accelerate Rendering by an Order of Magnitude with OpenCL, Plus a View to the Multi-core and Web-enabled Future
- How to use OpenCL to Provide High-Quality, Fast Rendering in Combination with Primitive Restart
- Device Fission, Partitioning Hardware Capabilities for Optimal Resource Usage
- Looking to the Future – WebCL
Registration is limited. More Information: http://developer.amd.com/zones/OpenCLZone/Events/pages/OpenCLWebinars.aspx
You are cordially invited to attend the Accelerating Computational Science Symposium 2012 (ACSS). This symposium is designed to advance the understanding of hybrid-computing architectures and how they are accelerating progress in scientific research.
Hosted by Oak Ridge Leadership Computing Facility (OLCF), along with the National Center for Supercomputing Applications (NCSA) and the Swiss National Supercomputing Centre (CSCS), the symposium takes place March 29-30, 2012 in Washington DC.
The complete agenda and additional information about the symposium are available at http://www.olcf.ornl.gov/event/accelerating-computational-science-symposium-2012-acss-2012/.