CfP: Minisymposium on GPU Computing at PPAM (Warsaw, Sep 8-11, 2013)

January 31st, 2013

GPU programming is now a much richer environment than it was a few years ago. On top of the two major programming languages, CUDA and OpenCL, libraries (e.g., cufft) and high level interfaces (e.g., thrust) have been developed that allow fast access to the computing power of GPUs without detailed knowledge or programming of GPU hardware.

Annotation-based programming models (e.g., OpenACC), GPU plug-ins for existing mathematical software (e.g., Jacket in Matlab), GPU scripting languages (e.g., PyOpenCL), and new data-parallel languages (e.g., Copperhead) bring GPU programming to a new level.

A major criticism of programming abstractions is that they look great on small examples but fail on practical problems. Therefore, this symposium invites, in particular, submissions that deal with practical applications that have successfully employed GPU libraries or high level programming tools. The focus may lie either on the development of libraries or on the utilization of existing tools. Workshop topics include, but are not limited to:

  • GPU applications coded with high level programming tools
  • GPU library development and application
  • Comparison of different programming abstractions on the same/similar applications
  • Comparison of the same/similar programming abstractions on different applications
  • Performance and coding effort of high level tools against hand-coded approaches on the GPU
  • Performance and coding effort on multi-core CPUs against GPUs utilizing programming abstractions
  • Classification of different programming abstractions with respect to their best application area

The highest quality papers of the minisymposium will receive an invitation to a special issue of the journal “Concurrency and Computation: Practice and Experience”.

Full CFP: Minisymposium on GPU Computing at the 10th International Conference on Parallel Processing and Applied Mathematics (PPAM). Note that PPAM will also host a full-day tutorial on Advanced GPU Programming.

A scalable, numerically stable, high-performance tridiagonal solver using GPUs

January 29th, 2013

Abstract:

In this paper, we present a scalable, numerically stable, high-performance tridiagonal solver. The solver is based on the SPIKE algorithm for partitioning a large matrix into small independent matrices, which can be solved in parallel. For each small matrix, our solver applies a general 1-by-1 or 2-by-2 diagonal pivoting algorithm, which is also known to be numerically stable. Our paper makes two major contributions. First, our solver is the first numerically stable tridiagonal solver for GPUs. Our solver provides comparable quality of stable solutions to Intel MKL and Matlab, at speed comparable to the GPU tridiagonal solvers in existing packages like CUSPARSE. It is also scalable to multiple GPUs and CPUs. Second, we present and analyze two key optimization strategies for our solver: a high-throughput data layout transformation for memory efficiency, and a dynamic tiling approach for reducing the memory access footprint caused by branch divergence.

(Chang, Li-Wen and Stratton, John A. and Kim, Hee-Seok and Hwu, Wen-mei W.: “A scalable, numerically stable, high-performance tridiagonal solver using GPUs”, Supercomputing 2012. [WWW])
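For readers unfamiliar with the problem class, here is a minimal sketch of a sequential tridiagonal solve (the Thomas algorithm, without pivoting). It is not the SPIKE-based solver described in the paper; it only illustrates the kind of system being solved and why pivoting matters for stability, since plain elimination divides by the running diagonal entry and breaks down when that entry is zero (and loses accuracy when it is tiny). The function name `thomas_solve` and the argument layout are assumptions made for this illustration.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system without pivoting (Thomas algorithm).

    lower[i] is the sub-diagonal entry of row i (lower[0] is unused),
    diag[i] the diagonal entry, upper[i] the super-diagonal entry
    (upper[-1] is unused), and rhs the right-hand side.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward elimination: remove the sub-diagonal row by row.
    pivot = diag[0]
    if pivot == 0.0:
        raise ZeroDivisionError("zero pivot in row 0; pivoting is required")
    c[0] = upper[0] / pivot
    d[0] = rhs[0] / pivot
    for i in range(1, n):
        pivot = diag[i] - lower[i] * c[i - 1]
        if pivot == 0.0:
            raise ZeroDivisionError(f"zero pivot in row {i}; pivoting is required")
        c[i] = upper[i] / pivot if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / pivot

    # Back substitution: recover the unknowns from last to first.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

Calling `thomas_solve([0, 1], [0, 1], [1, 0], [1, 1])` raises `ZeroDivisionError` even though the matrix is nonsingular, which is the failure mode that diagonal pivoting is meant to avoid.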

Parallel Computing Training Dates from AccelerEyes

January 29th, 2013

AccelerEyes has released dates for their upcoming CUDA and OpenCL training courses.

CUDA

OpenCL

More information can be found on the courses’ webpages.

Acceleware parallel programming courses

January 25th, 2013

Acceleware has recently announced four courses on parallel programming:

More information is available on the courses’ webpages.

CfP: MUSEPAT 2013

January 14th, 2013

The International Conference on Multicore Software Engineering, Performance, and Tools (MUSEPAT) is a forum for researchers and practitioners who face the multicore and distributed software challenge, addressing the full software development life-cycle of concurrent systems – software specification and design, programming models and techniques, testing, analysis, and debugging. The conference welcomes original, previously unpublished regular and industrial papers, as well as tool presentations.

Abstracts are due 5 March 2013 and full papers 12 March 2013. The conference will be held 19–20 August 2013 in Saint Petersburg, Russia. More information is available at http://eventos.fct.unl.pt/musepat2013.

A Multi-GPU Programming Library for Real-Time Applications

January 11th, 2013

Abstract:

We present MGPU, a C++ programming library targeted at single-node multi-GPU systems. Such systems combine disproportionate floating point performance with high data locality and are thus well suited to implement real-time algorithms. We describe the library design, programming interface and implementation details in light of this specific problem domain. The core concepts of this work are a novel kind of container abstraction and MPI-like communication methods for intra-system communication. We further demonstrate how MGPU is used as a framework for porting existing GPU libraries to multi-device architectures. Putting our library to the test, we accelerate an iterative non-linear image reconstruction algorithm for real-time magnetic resonance imaging using multiple GPUs. We achieve a speed-up of about 1.7 using 2 GPUs and reach a final speed-up of 2.1 with 4 GPUs. These promising results lead us to conclude that multi-GPU systems are a viable solution for real-time MRI reconstruction as well as signal-processing applications in general.

(Sebastian Schaetz and Martin Uecker: “A Multi-GPU Programming Library for Real-Time Applications”, Algorithms and Architectures for Parallel Processing (2012): 114-128. [DOI] [ARXIV])
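The reported speed-ups can be converted into parallel efficiency (speed-up divided by the number of GPUs), which makes the diminishing return from 2 to 4 GPUs explicit. The helper below is a hypothetical illustration and not part of MGPU; only the speed-up figures come from the abstract.

```python
def parallel_efficiency(speedup, num_devices):
    """Return speed-up per device, i.e. the fraction of ideal linear scaling."""
    return speedup / num_devices


# Speed-up figures quoted in the abstract, keyed by number of GPUs.
reported = {2: 1.7, 4: 2.1}

for gpus, speedup in reported.items():
    eff = parallel_efficiency(speedup, gpus)
    print(f"{gpus} GPUs: speed-up {speedup}, efficiency {eff:.1%}")
```

By this measure the efficiency is 85% with 2 GPUs and about 52.5% with 4.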

High Performance Graphics 2013 Call for Participation

January 9th, 2013

High Performance Graphics is the leading international forum for performance-oriented graphics systems research including innovative algorithms, efficient implementations, and hardware architecture. The conference brings together researchers, engineers, and architects to discuss the complex interactions of parallel hardware, novel programming models, and efficient algorithms in the design of systems for current and future graphics and visual computing applications. The program features three days of paper and industry presentations, with ample time for discussions during breaks, lunches, and the conference banquet. The conference is co-located with SIGGRAPH 2013 in Anaheim, California, and will take place on July 19-21, 2013 (the weekend before SIGGRAPH). More information, calls for papers and posters and submission instructions are available at http://www.highperformancegraphics.org.

Finite Element Matrix Generation on a GPU

January 6th, 2013

Abstract:

This paper presents an efficient technique for fast generation of sparse systems of linear equations arising in computational electromagnetics in a finite element method using higher order elements. The proposed approach employs a graphics processing unit (GPU) for both numerical integration and matrix assembly. The performance results obtained on a test platform consisting of a Fermi GPU (1x Tesla C2075) and a CPU (2x twelve-core Opterons) indicate that the GPU implementation of the matrix generation allows one to achieve speedups by a factor of 81 and 19 over the optimized single- and multi-threaded CPU-only implementations, respectively.

(Adam Dziekonski et al., “Finite Element Matrix Generation on a GPU”, Progress In Electromagnetics Research 128:249-265, 2012. [PDF])

Lattice microbes: High-performance stochastic simulation method for the reaction-diffusion master equation

January 6th, 2013

Abstract:

Spatial stochastic simulation is a valuable technique for studying reactions in biological systems. With the availability of high-performance computing (HPC), the method is poised to allow integration of data from structural, single-molecule and biochemical studies into coherent computational models of cells. Here, we introduce the Lattice Microbes software package for simulating such cell models on HPC systems. The software performs either well-stirred or spatially resolved stochastic simulations with approximated cytoplasmic crowding in a fast and efficient manner. Our new algorithm efficiently samples the reaction-diffusion master equation using NVIDIA graphics processing units and is shown to be two orders of magnitude faster than exact sampling for large systems while maintaining an accuracy of ∼0.1%. Display of cell models and animation of reaction trajectories involving millions of molecules is facilitated using a plug-in to the popular VMD visualization platform. The Lattice Microbes software is open source and available for download at http://www.scs.illinois.edu/schulten/lm

(Elijah Roberts, John E. Stone and Zaida Luthey-Schulten: “Lattice Microbes: High-Performance Stochastic Simulation Method for the Reaction-Diffusion Master Equation”, Journal of Computational Chemistry, 34:245-255, 2013. [DOI])
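As background for the well-stirred case mentioned in the abstract, the following is a minimal sketch of exact stochastic sampling with Gillespie's direct method for a single reaction A -> B. It is not the Lattice Microbes algorithm or its API; the function name `gillespie_decay`, its parameters, and the choice of reaction are assumptions made for this illustration. With only one reaction channel there is no need to select which reaction fires, so only the waiting time is sampled.

```python
import random


def gillespie_decay(n_a, rate, t_end, seed=0):
    """Simulate A -> B with Gillespie's direct method in a well-stirred volume.

    Returns a list of (time, count_A, count_B) tuples: the initial state
    followed by one entry per reaction event up to t_end.
    """
    rng = random.Random(seed)
    t, n_b = 0.0, 0
    trajectory = [(t, n_a, n_b)]
    while n_a > 0:
        propensity = rate * n_a           # total rate of the single reaction channel
        t += rng.expovariate(propensity)  # exponentially distributed waiting time
        if t > t_end:
            break
        n_a -= 1                          # fire the reaction: one A becomes one B
        n_b += 1
        trajectory.append((t, n_a, n_b))
    return trajectory
```

Every event is sampled individually here, which is why exact methods become expensive for systems with millions of molecules.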

amgcl: an accelerated algebraic multigrid for C++

December 21st, 2012

amgcl is a simple and generic algebraic multigrid (AMG) hierarchy builder. Supported coarsening methods are classical Ruge-Stüben coarsening and either plain or smoothed aggregation. The constructed hierarchy is stored and used with the help of one of the supported backends, including VexCL, ViennaCL, and CUSPARSE/Thrust.

With the help of amgcl, the solution of a large sparse system of linear equations can be easily accelerated with OpenCL, CUDA, or OpenMP. The source code of the library is publicly available under the MIT license at https://github.com/ddemidov/amgcl.
