Webinar: How to Improve Performance using the CUDA Memory Model and Features of the Kepler Architecture, December 20th, 2013
This webinar explores the GPU memory model and the memory enhancements introduced with the Kepler architecture, and how they affect performance optimization. The webinar begins with an essential overview of GPU architecture and thread cooperation before focusing on the different memory types available on the GPU. We define shared, constant and global memory and discuss the best locations to store your application data for optimal performance. The shuffle instruction, the new shared memory configurations and the Read-Only Data Cache of the Kepler architecture are introduced, and optimization techniques are discussed. Click here to view the webinar recording.
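To give a flavor of the Kepler features the webinar covers, a warp-level sum reduction using the shuffle instruction might look like the following minimal sketch (the Kepler-era `__shfl_down` intrinsic; kernel and variable names are illustrative, not from the webinar):

```cuda
// Warp-level sum reduction with Kepler's shuffle instruction: the 32
// threads of a warp exchange partial sums directly through registers,
// with no shared memory and no __syncthreads() required.
__inline__ __device__ int warpReduceSum(int val) {
    for (int offset = warpSize / 2; offset > 0; offset /= 2)
        val += __shfl_down(val, offset);  // pre-CUDA 9 intrinsic
    return val;
}

__global__ void sumKernel(const int *in, int *out) {
    int val = in[threadIdx.x];   // one value per lane
    val = warpReduceSum(val);
    if (threadIdx.x == 0)        // lane 0 ends up with the warp total
        *out = val;
}
```

On Kepler, reads of `in` can additionally be routed through the Read-Only Data Cache, either explicitly with `__ldg(&in[threadIdx.x])` or by declaring the pointer `const __restrict__`.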
PCO 2014, the fourth Workshop on Parallel Computing and Optimization, will be held in conjunction with the IEEE IPDPS symposium in Phoenix, USA, on May 23, 2014. The workshop aims to provide a forum for scientific researchers and engineers on recent advances in parallel and distributed computing for difficult combinatorial optimization problems, such as 0-1 multidimensional knapsack problems, cutting stock problems, large-scale linear programming problems, nonlinear optimization problems and global optimization problems. Emphasis will be placed on new techniques for solving these difficult problems, such as cooperative methods for integer programming and polynomial optimization methods. Aspects related to Combinatorial Scientific Computing (CSC) will also be treated, as will new approaches in parallel computing such as GPU or hybrid computing, peer-to-peer computing and cloud computing. Applications to planning, logistics, manufacturing, finance, telecommunications and computational biology will be considered.
G-BLASTN is a GPU-accelerated nucleotide alignment tool based on the widely used NCBI-BLAST. G-BLASTN produces exactly the same results as NCBI-BLAST and accepts very similar user commands. It also supports a pipeline mode, which can make full use of both GPU and CPU resources when handling a batch of medium- to large-sized queries. Currently, G-BLASTN supports the blastn and megablast modes of NCBI-BLAST; the discontiguous megablast mode is not supported yet. More information: http://www.comp.hkbu.edu.hk/~chxw/software/G-BLASTN.html
The Virtual School of Computational Science and Engineering is hosting two upcoming webinars.
- Introduction to HOOMD-blue, December 10, 2013, 11:00 EST.
- Using HOOMD-blue for Polymer Simulations and Big Systems, January 21, 2014, 11:00 EST.
More information and registration: http://www.vscse.org/
IWOCL (“eye-wok-ul”) is an annual meeting of developers, researchers and suppliers to promote the use, evolution and advancement of the OpenCL parallel programming open standard. IWOCL 2014 will take place in Bristol, England on May 12-13, 2014. For additional information visit http://www.iwocl.org
VexCL is a modern C++ library that eases GPGPU development by reducing the amount of boilerplate code needed for GPGPU applications. The library provides a convenient and intuitive notation for vector arithmetic, reduction, sparse matrix-vector multiplication, etc. The source code is available under the permissive MIT license. As of v1.0.0, VexCL provides two backends, OpenCL and CUDA; users choose between them at compile time with a preprocessor macro definition. More information is available at the GitHub project page and release notes page.
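As a sketch of the notation VexCL provides, a minimal program might look like the following (it assumes an OpenCL- or CUDA-capable device with double-precision support is present; the vector size and expressions are illustrative):

```cpp
#include <vexcl/vexcl.hpp>

int main() {
    // Initialize a context on devices that support double precision.
    vex::Context ctx(vex::Filter::DoublePrecision);

    const size_t n = 1024 * 1024;
    vex::vector<double> a(ctx, n), b(ctx, n), c(ctx, n);

    a = 1.0;
    b = 2.0;

    // Vector arithmetic in natural notation: VexCL generates and
    // compiles a single device kernel for the whole expression.
    c = 2 * a + sin(b);

    // Reduction: sum of all elements of c.
    vex::Reductor<double, vex::SUM> sum(ctx);
    double s = sum(c);

    return s > 0 ? 0 : 1;
}
```

The expression-template approach is what eliminates the boilerplate: there is no explicit kernel source, buffer management, or queue handling in user code.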
GPGPU-7 (Seventh Workshop on General Purpose Processing Using GPUs) will be held in conjunction with ASPLOS in Salt Lake City, Utah, on March 1st, 2014. The goal of this workshop is to provide a forum to discuss new and emerging general-purpose programming environments and platforms, as well as to evaluate applications that have been able to harness the horsepower provided by these platforms.
This year’s workshop is particularly interested in new heterogeneous GPU platforms.
AMD CodeXL is a free set of tools for GPU debugging, GPU profiling, static analysis of OpenCL kernels, and CPU profiling, including support for remote servers. For more information and download links, see: http://developer.amd.com/community/blog/2013/11/08/codexl-1-3-released/
Bolt is an STL compatible C++ template library for creating data-parallel applications using C++ (no C++ AMP / OpenCL code required). For more information about the Bolt template library and download links, see: http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-accelerated-parallel-processing-app-sdk/bolt-c-template-library/
AMD APP SDK has everything needed to get started with OpenCL and parallel programming. It includes OpenCL samples that are very easy to compile, as well as the Bolt and other libraries. For more information about AMD APP SDK and download links, see: http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-accelerated-parallel-processing-app-sdk/
Allinea DDT is part of Allinea Software’s unified tools platform, which provides a single powerful and intuitive environment for debugging and profiling parallel and multithreaded applications. It is widely used by computational scientists and scientific programmers to fix software defects in parallel applications running on hybrid GPU clusters and supercomputers. DDT 4.1.1 supports CUDA 5.5, C++11 and the GNU 4.8 compilers, and introduces CUDA toolkit debugging support for ARMv7 architectures. More information: http://www.allinea.com
The Libra 3.0 Heterogeneous Cloud Computing SDK has recently been released by GPU Systems. It supports PCs, tablets and mobile devices, and includes a new virtualization function for cloud compute services over local and remote CPUs and GPUs. C/C++, Java, C# and Matlab are supported. Read the full press release here.