WCCM Minisymposium: Applications and methods of GPU

December 7th, 2011

Since the last WCCM (Sydney 2009), where we organized a similarly themed minisymposium, the scientific and engineering communities have gained considerable experience in using GPU hardware for their applications. The number of publications addressing GPU applications has skyrocketed, and researchers have developed a common understanding of how to implement numerical methods on this architecture. Moreover, three of the five fastest computers in the world, as measured for the Top500 list, are now GPU-based systems, and there is much discussion about GPUs playing a leading role in exascale computing. In summary, this topic is of wide interest; frankly, it is all the rage. This minisymposium will bring together presentations from the top researchers in the world using GPU hardware for applications in all branches of computational mechanics. We encourage contributions that address innovative methods to use GPUs efficiently, studies of how numerical methods can be adapted to the hardware, and perspectives on the future of GPUs as we advance toward exascale.

WCCM will be held in São Paulo, Brazil, 8–13 July 2012. The abstract submission deadline is December 31, 2011. More information: http://www.wccm2012.com, http://barbagroup.bu.edu/Barba_group/Events.html.

CUDA 4.1 RC2 Released

December 6th, 2011

The NVIDIA CUDA Toolkit 4.1 RC2 is now available for anyone to download. The key features of this release are:

  • A new LLVM-based compiler
  • Over 1000 additional image processing functions in the NPP library
  • A Visual profiler

There is also a new release of Parallel Nsight, 2.1 RC2, with support for CUDA 4.1. To download it and find out more, follow: http://bit.ly/sRpQvr

Introduction to Generic Accelerated Computing with Libra SDK

November 30th, 2011

Libra SDK is a runtime, including an API, sample programs and documentation, for massively accelerating software computations. This introductory tutorial provides an overview and usage examples of the Libra API and math libraries executing on x86/x64, OpenCL, OpenGL and CUDA technology. The Libra API enables generic and portable CPU/GPU computing in software development without the need to create multiple, specific and optimized code paths to support x86, OpenCL, OpenGL or CUDA devices. Link to PDF: www.gpusystems.com/doc/LibraGenericComputing.pdf

KOAP: Kentucky OpenCL Application Preprocessor

November 29th, 2011

KOAP, pronounced “cope,” is a tool for developing OpenCL applications. Its purpose is to allow the programmer to aggregate and simplify calls to the OpenCL API. KOAP accepts as input a file containing (or including) both the OpenCL program and the host C program. KOAP understands several directives, each of which is prefixed with a $ character. When KOAP is run, these directives are replaced with the requisite OpenCL API calls. Programs preprocessed by KOAP can run on any target supported by OpenCL, including both NVIDIA and AMD GPUs.

KOAP is now freely available as a source code tar file from http://aggregate.org/KOAP/.

Alenka – A GPU database engine including compression

November 28th, 2011

Support for several types of compression has been added to the GPU-based database engine ålenkå. Supported algorithms include FOR (frame of reference), FOR-DELTA and dictionary compression. All compression algorithms run on the GPU, achieving compression and decompression speeds of gigabytes per second. The use of compression makes it possible to significantly reduce or eliminate I/O bottlenecks in analytical queries, as shown by ålenkå’s results in the Star Schema and TPC-H benchmarks.
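
For readers unfamiliar with the three schemes named above, the following is a minimal CPU-side sketch in Python of the general ideas behind FOR, FOR-DELTA and dictionary encoding. It is not ålenkå's GPU implementation; all function names are hypothetical, and the bit packing a real engine would perform is reduced here to reporting how many bits each stored offset would need.

```python
def for_encode(values):
    """Frame of reference: store the minimum once, plus small non-negative offsets."""
    if not values:
        return 0, 0, []
    base = min(values)
    offsets = [v - base for v in values]
    bits = max(offsets).bit_length()  # bits needed per packed offset
    return base, bits, offsets


def for_decode(base, offsets):
    return [base + o for o in offsets]


def for_delta_encode(values):
    """FOR-DELTA: take successive differences, then apply FOR to the differences.

    Assumes a non-empty input; works best on sorted or slowly changing columns.
    """
    first = values[0]
    deltas = [b - a for a, b in zip(values, values[1:])]
    base, bits, offsets = for_encode(deltas)
    return first, base, bits, offsets


def for_delta_decode(first, base, offsets):
    out = [first]
    for o in offsets:
        out.append(out[-1] + base + o)
    return out


def dict_encode(values):
    """Dictionary compression: replace each value with its index in a sorted dictionary."""
    dictionary = sorted(set(values))
    index = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [index[v] for v in values]


def dict_decode(dictionary, codes):
    return [dictionary[c] for c in codes]


if __name__ == "__main__":
    keys = [1000, 1003, 1005, 1010]
    print(for_encode(keys))        # offsets relative to the minimum key
    print(for_delta_encode(keys))  # offsets relative to the minimum difference
    print(dict_encode(["US", "DE", "US", "FR"]))
```

In this sketch the column of four keys needs 4 bits per value under FOR and 2 bits per value under FOR-DELTA, which illustrates why such schemes can cut the volume of data read from disk for analytical queries.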

CfP: 4th Workshop on using Emerging Parallel Architectures (WEPA)

November 20th, 2011

The 4th Workshop on using Emerging Parallel Architectures (WEPA 2012) will be held in conjunction with the International Conference on Computational Science (ICCS 2012), Omaha, Nebraska, June 2-4, 2012.

The computing landscape has undergone significant transformation with the emergence of more powerful processing elements such as GPUs, FPGAs, multi-cores, etc. On the multi-core front, Moore’s Law has moved beyond the single-processor boundary with the prediction that the number of cores will double every 18 months. Going forward, the primary method of gaining processor performance will be through parallelism. Multi-core technology has visibly penetrated the global market. According to the latest Top500 lists, the HPC landscape has evolved from supercomputer systems into large clusters of dual- or quad-core processors. Furthermore, GPUs, FPGAs and multi-cores have been shown to be formidable computing alternatives, where certain classes of applications see more than one order of magnitude improvement over their general-purpose processor (GPP) counterparts. Therefore, future computational science centers will employ resources such as FPGA and GPU architectures to serve as co-processors to offload appropriate compute-intensive portions of applications from the servers.

GPU Virtualization for Dynamic GPU Provisioning

November 18th, 2011

From a recent press release:

Taipei, November 18, 2011: Zillians, a leading cloud solution provider specializing in high performance computing, GPU virtualization middleware and massive multi-player online game (MMOG) platforms today announced the availability of vGPU – the world’s first commercial virtualization solution for decoupling GPU hardware from software. Traditionally, physical GPUs must reside on the same machine running GPU code. This severely hampers GPU cloud deployment due to the difficulty of dynamic GPU provisioning. With vGPU technology, bulky hardware is no longer a limiting factor. vGPU introduces a thin, transparent RPC layer between local application and remote GPU, enabling existing GPU software to run without any modification on a remote GPU resource.

Integrating CUDA and GNU Autotools

November 17th, 2011

ClusterChimps.org has released a step-by-step guide to integrating CUDA with GNU Autotools. The guide covers building standalone CUDA binaries, static CUDA libraries and shared CUDA libraries, and comes with an example tarball. For more information, go to http://www.clusterchimps.org/autotools.php

Parallel Accelerating for Star Catalogue Retrieval Algorithm using GPUs

November 16th, 2011


A GPU-based parallel star retrieval method is proposed to improve the efficiency of searching for stars in a star catalogue during computer simulation, especially when the field of view (FOV) is large. In the proposed algorithm, the stars in the catalogue are first classified and stored in different zones using a latitude and longitude zoning method. Based on this easily accessible star catalogue, the star zones that the FOV covers can be computed exactly by constructing a spherical triangle around the FOV. As a result, the search scope is reduced effectively. Finally, CUDA is used to run the retrieval of stars from those zones in parallel on the GPU. Experimental results show that, in comparison with a CPU-oriented implementation, the proposed algorithm achieves a speedup of up to tens of times, and the processing time stays at the millisecond level for a large FOV and a wide star magnitude span, which meets the requirement of real-time simulation.

(Chao Li, Liqiang Zhang, Jiaze Wu, and Changwen Zheng, “Parallel Accelerating for Star Catalogue Retrieval Algorithm using GPUs”, Journal of Astronautics, 2012)
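
The zone-based retrieval described in the abstract above can be illustrated with a small serial Python sketch. This is not the authors' method or code: the 10-degree zone size, the circular FOV, the star tuple layout and every function name are assumptions made for illustration, and the covered zones are found with a standard bounding formula for a circular cap rather than the paper's spherical triangle construction. The inner filtering loop is the part a CUDA version would presumably spread across GPU threads.

```python
import math
from collections import defaultdict

ZONE_DEG = 10                 # hypothetical zone size in degrees
N_RA = 360 // ZONE_DEG        # zones along longitude (right ascension)
N_DEC = 180 // ZONE_DEG       # zones along latitude (declination)


def zone_of(ra_deg, dec_deg):
    """Map a position to its (longitude zone, latitude zone) index pair."""
    ra_idx = int((ra_deg % 360) // ZONE_DEG) % N_RA
    dec_idx = min(int((dec_deg + 90) // ZONE_DEG), N_DEC - 1)
    return ra_idx, dec_idx


def build_zones(stars):
    """Group stars, given as (name, ra_deg, dec_deg) tuples, by zone."""
    zones = defaultdict(list)
    for star in stars:
        zones[zone_of(star[1], star[2])].append(star)
    return zones


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation between two directions, in degrees."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    c = math.sin(d1) * math.sin(d2) + math.cos(d1) * math.cos(d2) * math.cos(r1 - r2)
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


def candidate_zones(ra_c, dec_c, radius):
    """Zones that may overlap a circular FOV (radius assumed below 90 degrees)."""
    dec_lo = max(dec_c - radius, -90.0)
    dec_hi = min(dec_c + radius, 90.0)
    dec_idxs = range(zone_of(0.0, dec_lo)[1], zone_of(0.0, dec_hi)[1] + 1)
    if dec_lo <= -90.0 or dec_hi >= 90.0:
        ra_idxs = set(range(N_RA))  # FOV touches a pole: every longitude zone
    else:
        # Longitude half-width of a circular cap: asin(sin r / cos dec)
        ratio = math.sin(math.radians(radius)) / math.cos(math.radians(dec_c))
        half = math.degrees(math.asin(min(1.0, ratio)))
        lo = math.floor((ra_c - half) / ZONE_DEG)
        hi = math.floor((ra_c + half) / ZONE_DEG)
        ra_idxs = {i % N_RA for i in range(lo, hi + 1)}  # handles wrap at 0/360
    return [(r, d) for r in ra_idxs for d in dec_idxs]


def stars_in_fov(zones, ra_c, dec_c, radius):
    """Check only stars in candidate zones instead of scanning the whole catalogue."""
    hits = []
    for key in candidate_zones(ra_c, dec_c, radius):
        for star in zones.get(key, []):
            if angular_sep_deg(ra_c, dec_c, star[1], star[2]) <= radius:
                hits.append(star)
    return hits


if __name__ == "__main__":
    catalogue = [("A", 10.0, 5.0), ("B", 359.0, 0.0), ("C", 180.0, 0.0), ("D", 2.0, -3.0)]
    zones = build_zones(catalogue)
    print(sorted(s[0] for s in stars_in_fov(zones, 0.0, 0.0, 6.0)))
```

The benefit of the zoning step is that only the handful of zones overlapping the FOV are examined, so the per-star distance test runs on a small fraction of the catalogue.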

A fast algorithm of simulating star map for star sensor

November 16th, 2011


In order to test the function and performance of a star sensor on the ground, a fast method for simulating star maps is presented. The algorithm adopts the instantaneous coordinates of stars and improves star searching efficiency by optimizing the zone partitioning method for the star catalogue. We overcome the low accuracy of the latitude and longitude span that the FOV covers by proposing a new spherical right-angled triangle method, which greatly reduces the search scope; meanwhile, a simulation model for star brightness is built based on the adopted star catalogue. A simulation study is conducted to demonstrate the algorithm. The proposed approach meets the requirements of a wide magnitude range and a short simulation period.

(Chao Li, Changwen Zheng, Jiaze Wu, and Liqiang Zhang, “A fast algorithm of simulating star map for star sensor”, Proceedings of the 3rd IEEE International Conference on Computer and Network Technology (IEEE ICCNT), 2011)
