Scalable Partitioning for Parallel Position Based Dynamics

April 13th, 2015

Abstract:

We introduce a practical partitioning technique designed for parallelizing Position Based Dynamics, and exploiting the ubiquitous multi-core processors present in current commodity GPUs. The input is a set of particles whose dynamics is influenced by spatial constraints. In the initialization phase, we build a graph in which each node corresponds to a constraint and two constraints are connected by an edge if they influence at least one common particle. We introduce a novel greedy algorithm for inserting additional constraints (phantoms) in the graph such that the resulting topology is qˆ-colourable, where qˆ ≥ 2 is an arbitrary number. We color the graph, and the constraints with the same color are assigned to the same partition. Then, the set of constraints belonging to each partition is solved in parallel during the animation phase. We demonstrate this by using our partitioning technique; the performance hit caused by the GPU kernel calls is significantly decreased, leaving unaffected the visual quality, robustness and speed of serial position based dynamics.

(Fratarcangeli M and Pellacini F, Scalable Partitioning for Parallel Position Based Dynamics, Computer Graphics Forum (Special Issue of Eurographics 2015 Conference). Vol. 34(2) 2015)

HUCAA2015 CFP: Workshop on Heterogeneous and Unconventional Cluster Architectures and Applications

April 13th, 2015

4th International Workshop on Heterogeneous and Unconventional Cluster Architectures and Applications (HUCAA 2015)

http://www.hucaa-workshop.org/hucaa2015
Sept. 8-11, 2015 – Chicago, IL, US
In conjunction with IEEE CLUSTER 2015
IEEE International Conference on Cluster Computing

ABOUT THE WORKSHOP

The workshop on Heterogeneous and Unconventional Cluster Architectures
and Applications gears to gather recent work on heterogeneous and
unconventional cluster architectures and applications, which might
have an impact on future mainstream cluster architectures. This
includes any cluster architecture that is not based on the usual
commodity components and therefore makes use of some special hard- or
software elements, or that is used for special and unconventional
applications. Read the rest of this entry »

CFP: 8th Workshop on UnConventional High Performance Computing 2015

April 13th, 2015

Recent issues with the power consumption of conventional HPC hardware results in both new interest in accelerator hardware and in usage of mass-market hardware originally not designed for HPC. The most prominent examples are GPUs, but FPGAs, DSPs and embedded designs are also possible candidates to provide higher power efficiency, as they are used in energy-restriced environments, such as smartphones or tablets. The so-called “dark silicon” forecast, i.e. not all transistors may be active at the same time, may lead to even more specialized hardware in future mass-market products. Exploiting this hardware for HPC can be a worthwhile challenge.

UCHPC is held in conjunction with Euro-Par 2015, August 24/25 in Vienna, Austria. More information,including the full call for papers, submission instructions and important dates: uchpc15.lrr.in.tum.de

OpenCL Training Course in Calgary, AB – May 26, 2015

April 13th, 2015

Acceleware’s next OpenCL course takes place in Calgary. This professional four day course is designed for programmers who are looking to develop comprehensive skills in writing and optimizing applications that fully leverage data parallel processing capabilities of GPUs. Register before May 12 if you would like to reserve a spot. To find out what the course includes visit:
Learn OpenCL in Calgary      www.acceleware.com

CfP: 3rd Workshop on Parallel and Distributed Agent-Based Simulations (PADABS 2015)

April 13th, 2015

The 3rd Workshop on Parallel and Distributed Agent-Based Simulations (PADABS) is a  satellite Workshop of Euro-Par 2015(Vienna, Austria, 24-28 August 2015).

Agent-Based Simulation Models are an increasingly popular tool for research and management in many fields such as ecology, economics, sociology, etc.. In some fields, such as social sciences, these models are seen as a key instrument to the generative approach, essential for understanding complex social phenomena. But also in policy-making, biology, military simulations, control of mobile robots and economics, the relevance and effectiveness of Agent-Based Simulation Models is recently recognized. Read the rest of this entry »

PARALUTION Release 1.0

April 13th, 2015

PARALUTION is a library for sparse iterative methods which can be performed on various parallel devices, including multi-core CPU, GPU (CUDA and OpenCL) and Intel Xeon Phi.

The 1.0 version of the PARALUTION Library supports multi-node and multi-GPU configuration via MPI. All iterative solvers support global operations (i.e. distributed matrices and vectors) and all preconditioners can be used in a block-Jacobi fashion locally on each node/GPU. In addition, the software provides a global (fully distributed) Pair-Wise AMG solver. Read the rest of this entry »

ADBIS workshop on GPUs in Databases GID 2015

April 13th, 2015

The workshop will take place in Poitiers, France on September 8th, 2015,

Aims and Scope

High performance of modern Graphics Processing Units may be utilized not only for graphics related application but also for general computing. This computing power has been utilized in new variants of many algorithms from almost every computer science domain. Unfortunately, while other application domains strongly benefit from utilizing the GPUs, databases related applications seem not to get enough attention. The main goal of GPUs in Databases workshop is to fill this gap. This event is devoted to sharing the knowledge related to applying GPUs in Database environments and to discuss possible future development of this application domain. Read the rest of this entry »

A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems

February 11th, 2015

Abstract:

Recent technological advances have greatly improved the performance and features of embedded systems. With the number of just mobile devices now reaching nearly equal to the population of earth, embedded systems have truly become ubiquitous. These trends, however, have also made the task of managing their power consumption extremely challenging. In recent years, several techniques have been proposed to address this issue. In this paper, we survey the techniques for managing power consumption of embedded systems. We discuss the need of power management and provide a classification of the techniques on several important parameters to highlight their similarities and differences. This paper also reviews those techniques which use GPU and FPGA to improve energy efficiency of embedded systems. This paper is intended to help the researchers and application-developers in gaining insights into the working of power management techniques and designing even more efficient high-performance embedded systems of tomorrow.

Sparsh Mittal, “A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems”, International Journal of Computer Aided Engineering and Technology (IJCAET), vol 6, no. 4, 2014. WWW

MAPS: Optimizing Massively Parallel Applications Using Device-Level Memory Abstraction

February 11th, 2015

Abstract:

GPUs play an increasingly important role in high-performance computing. While developing naive code is straightforward, optimizing massively parallel applications requires deep understanding of the underlying architecture. The developer must struggle with complex index calculations and manual memory transfers. This article classifies memory access patterns used in most parallel algorithms, based on Berkeley’s Parallel “Dwarfs.” It then proposes the MAPS framework, a device-level memory abstraction that facilitates memory access on GPUs, alleviating complex indexing using on-device containers and iterators. This article presents an implementation of MAPS and shows that its performance is comparable to carefully optimized implementations of real-world applications.

Rubin, Eri, et al. ["MAPS: Optimizing Massively Parallel Applications Using Device-Level Memory Abstraction."](http://dl.acm.org/citation.cfm?id=2680544) ACM Transactions on Architecture and Code Optimization (TACO) 11.4 (2014): 44.

[Library website](http://www.cs.huji.ac.il/~talbn/maps/)

C Framework for OpenCL v2.0.0 Now Available

February 11th, 2015

After four pre-releases, the stable 2.0.0 version of cf4ocl, the C Framework for OpenCL, is now available.

Since the last beta release, a number of tests were added, and a few bug fixes have been fixed. Support for device fission and native kernels has also been implemented. A complete list of features and fixes is available at https://github.com/FakenMC/cf4ocl/releases.

Cf4ocl has been tested on Linux, OS X and Windows, and offers a pure C object-oriented framework for developing and benchmarking OpenCL projects in C. It aims to:

1. Promote the rapid development of OpenCL host programs in C (with support for C++) and avoid the tedious and error-prone boilerplate code usually required. Read the rest of this entry »

Page 2 of 11212345...102030...Last »