A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems

February 11th, 2015


Recent technological advances have greatly improved the performance and features of embedded systems. With the number of mobile devices alone now nearly equal to the population of Earth, embedded systems have truly become ubiquitous. These trends, however, have also made the task of managing their power consumption extremely challenging. In recent years, several techniques have been proposed to address this issue. In this paper, we survey the techniques for managing the power consumption of embedded systems. We discuss the need for power management and classify the techniques along several important parameters to highlight their similarities and differences. This paper also reviews techniques that use GPUs and FPGAs to improve the energy efficiency of embedded systems. It is intended to help researchers and application developers gain insight into the workings of power management techniques and design even more efficient high-performance embedded systems.

Sparsh Mittal, “A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems”, International Journal of Computer Aided Engineering and Technology (IJCAET), vol. 6, no. 4, 2014.

MAPS: Optimizing Massively Parallel Applications Using Device-Level Memory Abstraction

February 11th, 2015


GPUs play an increasingly important role in high-performance computing. While developing naive code is straightforward, optimizing massively parallel applications requires deep understanding of the underlying architecture. The developer must struggle with complex index calculations and manual memory transfers. This article classifies memory access patterns used in most parallel algorithms, based on Berkeley’s Parallel “Dwarfs.” It then proposes the MAPS framework, a device-level memory abstraction that facilitates memory access on GPUs, alleviating complex indexing using on-device containers and iterators. This article presents an implementation of MAPS and shows that its performance is comparable to carefully optimized implementations of real-world applications.

Rubin, Eri, et al. ["MAPS: Optimizing Massively Parallel Applications Using Device-Level Memory Abstraction."](http://dl.acm.org/citation.cfm?id=2680544) ACM Transactions on Architecture and Code Optimization (TACO) 11.4 (2014): 44.

[Library website](http://www.cs.huji.ac.il/~talbn/maps/)

C Framework for OpenCL v2.0.0 Now Available

February 11th, 2015

After four pre-releases, the stable 2.0.0 version of cf4ocl, the C Framework for OpenCL, is now available.

Since the last beta release, a number of tests have been added and a few bugs have been fixed. Support for device fission and native kernels has also been implemented. A complete list of features and fixes is available at https://github.com/FakenMC/cf4ocl/releases.

Cf4ocl has been tested on Linux, OS X and Windows, and offers a pure C object-oriented framework for developing and benchmarking OpenCL projects in C. It aims to:

1. Promote the rapid development of OpenCL host programs in C (with support for C++) and avoid the tedious and error-prone boilerplate code usually required.

Visualization of Energy Conversion Processes in a Light Harvesting Organelle at Atomic Detail

February 11th, 2015


The cellular process responsible for providing energy for most life on Earth, namely, photosynthetic light-harvesting, requires the cooperation of hundreds of proteins across an organelle, involving length and time scales spanning several orders of magnitude over quantum and classical regimes. Simulation and visualization of this fundamental energy conversion process pose many unique methodological and computational challenges. We present, in an accompanying movie, light-harvesting in the photosynthetic apparatus found in purple bacteria, the so-called chromatophore. The movie is the culmination of three decades of modeling efforts, featuring the collaboration of theoretical, experimental, and computational scientists. We describe the techniques that were used to build, simulate, analyze, and visualize the structures shown in the movie, and we highlight cases where scientific needs spurred the development of new parallel algorithms that efficiently harness GPU accelerators and petascale computers.

Visualization of Energy Conversion Processes in a Light Harvesting Organelle at Atomic Detail. M. Sener, J. E. Stone, A. Barragan, A. Singharoy, I. Teo, K. L. Vandivort, B. Isralewitz, B. Liu, B. Goh, J. C. Phillips, L. F. Kourkoutis, C. N. Hunter, and K. Schulten. SC’14 Visualization and Data Analytics Showcase, 2014.

CfP: High-Performance Graphics 2015: August 7–9

February 10th, 2015

High Performance Graphics is the leading international forum for performance-oriented graphics and imaging systems research, including innovative algorithms, efficient implementations, languages, parallelism, compilers, hardware and architectures for high-performance graphics. The conference brings together researchers, engineers, and architects to discuss the complex interactions of parallel hardware, novel programming models, and efficient algorithms in the design of systems for current and future graphics and visual computing applications.

High Performance Graphics is co-located with SIGGRAPH 2015 in Los Angeles, United States, and will take place on August 7–9, 2015.

A Survey Of Techniques for Managing and Leveraging Caches in GPUs

February 10th, 2015


Initially introduced as special-purpose accelerators for graphics applications, graphics processing units (GPUs) have now emerged as general-purpose computing platforms for a wide range of applications. To address the requirements of these applications, modern GPUs include sizable hardware-managed caches. However, several factors, such as the unique architecture of GPUs and the rise of CPU-GPU heterogeneous computing, demand effective cache management to achieve high performance and energy efficiency. Recently, several techniques have been proposed for this purpose. In this paper, we survey several architectural and system-level techniques proposed for managing and leveraging GPU caches. We also discuss the importance and challenges of cache management in GPUs. The aim of this paper is to give readers insight into cache management techniques for GPUs and to motivate them to propose even better techniques for leveraging the full potential of caches in future GPUs.

Sparsh Mittal, “A Survey Of Techniques for Managing and Leveraging Caches in GPUs”, Journal of Circuits, Systems, and Computers (JCSC), vol. 23, no. 8, 2014.

A Survey of Methods for Analyzing and Improving GPU Energy Efficiency

February 10th, 2015


Recent years have witnessed phenomenal growth in the computational capabilities and applications of GPUs. However, this trend has also led to a dramatic increase in their power consumption. This paper surveys research on analyzing and improving the energy efficiency of GPUs. It also classifies these techniques on the basis of their main research idea. Further, it attempts to synthesize research that compares the energy efficiency of GPUs with that of other computing systems, e.g. FPGAs and CPUs. The aim of this survey is to provide researchers with knowledge of the state of the art in GPU power management and to motivate them to architect highly energy-efficient GPUs.

Sparsh Mittal, Jeffrey S Vetter, “A Survey of Methods for Analyzing and Improving GPU Energy Efficiency”, in ACM Computing Surveys, vol. 47, no. 2, pp. 19:1-19:23, 2014.

Boost.Compute v0.4 Released

December 27th, 2014

Boost.Compute is an open-source, header-only C++ library for GPGPU and parallel computing based on OpenCL. It provides a low-level C++ wrapper over OpenCL and a high-level STL-like API with containers and algorithms for the GPU. Boost.Compute is available on GitHub. See the full announcement: http://kylelutz.blogspot.com/2014/12/boost-compute-0.4-released.html

Real-time Deblocked GPU rendering of Compressed Volume Data

December 2nd, 2014


The vast majority of current state-of-the-art compressed GPU volume renderers are based on block-transform coding, which is susceptible to blocking artifacts, particularly at low bit-rates. In this paper the authors address the problem for the first time by introducing a specialized deferred filtering architecture that works on block-compressed data and includes a novel deblocking algorithm. The architecture efficiently performs high-quality shading of massive datasets by closely coordinating visibility- and resolution-aware adaptive data loading with GPU-accelerated per-frame data decompression, deblocking, and rendering. A thorough evaluation including quantitative and qualitative measures demonstrates the performance of their approach on large static and dynamic datasets, including a massive 512^4 turbulence simulation (256 GB), which is aggressively compressed to less than 2 GB so that it can be fully uploaded to the graphics board and explored in real time during animation.

(Fabio Marton, José Antonio Iglesias Guitián, Jose Díaz and Enrico Gobbetti: “Real-time deblocked GPU rendering of compressed volumes”. Proc. 19th International Workshop on Vision, Modeling and Visualization (VMV), pp. 167-174, Oct. 2014.)
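
For a sense of scale, the sizes quoted in the abstract can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes that "256 GB" and "2 GB" are binary gigabytes and that 512^4 counts scalar samples. The per-sample size and the compression ratio it prints are derived from those assumptions, not taken from the paper.

```python
# Back-of-the-envelope check of the dataset sizes quoted in the abstract.
# Assumptions (not stated in the abstract): GB means 2**30 bytes, and
# 512^4 is the total number of scalar samples in the simulation.
GIB = 2**30

samples = 512**4                 # total samples, equal to 2**36
raw_bytes = 256 * GIB            # uncompressed size quoted in the abstract
compressed_bytes = 2 * GIB       # upper bound quoted in the abstract

bytes_per_sample = raw_bytes / samples        # implied storage per raw sample
min_ratio = raw_bytes / compressed_bytes      # compression ratio is at least this
max_bits_per_sample = compressed_bytes * 8 / samples  # compressed budget per sample

print(f"samples:               {samples:,}")
print(f"raw bytes per sample:  {bytes_per_sample}")
print(f"compression ratio:     at least {min_ratio:.0f}:1")
print(f"compressed bits/sample: at most {max_bits_per_sample}")
```

Under these assumptions the raw data works out to 4 bytes per sample (consistent with one 32-bit value each), and fitting it in under 2 GB means a compression ratio of more than 128:1, or less than a quarter of a bit per sample.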

CfP: 23rd High Performance Computing Symposium (HPC’15)

November 14th, 2014

The 23rd High Performance Computing Symposium (HPC’15) is held in conjunction with the SCS Spring Simulation Multiconference (SpringSim’15), April 12-15, 2015, in Alexandria, VA, USA.

Topics of interest include:

  • High performance/large scale application case studies
  • GPU for general purpose computations (GPGPU)
  • Multicore and many-core computing
  • Power aware computing
  • Cloud, distributed, and grid computing
  • Asynchronous numerical methods and programming
  • Hybrid system modeling and simulation
  • Large scale visualization and data management
  • Tools and environments for coupling parallel codes
  • Parallel algorithms and architectures
  • High performance software tools
  • Resilience at the simulation level
  • Component technologies for high performance computing

More information: http://hosting.cs.vt.edu/hpc2015.
