January 22nd, 2009
What do GPUs, FPGAs, vector processors and other special-purpose chips have in common? They are examples of advanced processor architectures that the scientific community is using to accelerate computationally demanding applications. While high-performance computing systems that use application accelerators are still rare, they will be the norm rather than the exception in the near future. The 2009 Symposium on Application Accelerators in High-Performance Computing aims to bring together developers of computing accelerators and end-users of the technology to exchange ideas and learn about the latest developments in the field. The Symposium will focus on the use of application accelerators in high-performance and scientific computing and issues that surround it. Topics of interest include:
- novel accelerator processors, systems, and architectures
- integration of accelerators with high-performance computing systems
- programming models for accelerator-based computing
- languages and compilers for accelerator-based computing
- run-time environments, profiling and debugging tools for accelerator-based computing
- scientific and engineering applications that use application accelerators
Presentations from technology developers and the academic user community are invited. Researchers interested in presenting at the Symposium should submit extended abstracts of 2-3 pages to firstname.lastname@example.org by April 20, 2009. All submissions will be reviewed by the Technical Program Committee and accepted submissions will be presented as either oral presentations or posters. Presentation materials will be made available online at www.saahpc.org.
(2009 Symposium on Application Accelerators in High Performance Computing (SAAHPC’09). July 27-31, 2009, University of Illinois, Urbana, IL)
January 22nd, 2009
Graphic Remedy is proud to announce the upcoming release of gDEBugger for Mac OS X. This new product brings all of gDEBugger’s Debugging and Profiling abilities to the Mac OpenGL developer’s world. Using gDEBugger Mac will help OS X OpenGL developers optimize their application performance: find graphics pipeline bottlenecks, improve application graphics memory consumption, locate and remove redundant OpenGL calls and graphics memory leaks, and much more. Visit the gDebuggerMac home page to join the Beta Program, see screenshots and get more details.
gDEBugger, an OpenGL and OpenGL ES debugger and profiler, traces application activity on top of the OpenGL API, and lets programmers see what is happening within the graphics system implementation to find bugs and optimize OpenGL application performance. gDEBugger runs on Windows, Linux and Mac OS X operating systems.
January 11th, 2009
This workshop, to be held at TU Delft on Friday January 30, 2009, presents state-of-the-art performance results for engineering applications on parallel machines based on either the Cell Processor or GPUs. In addition to iterative solvers, finite element applications, tomography and visualization applications, some background information on computation on these platforms and coupling of processors will be presented. Attendance is free, but registration is required. (Workshop: Experience with the GPU and the Cell Processor)
December 23rd, 2008
This workshop will focus on compilation techniques for exploiting parallelism in emerging massively multi-threaded and multi-core architectures, with a particular focus on the use of general-purpose GPU computing techniques to overcome traditional barriers to parallelization. Recently, GPUs have evolved to address programming of general-purpose computations, especially those exemplified by data-parallel models. This change will have long-term implications for languages, compilers, and programming models. Development of higher-level programming languages, models and compilers that exploit such processors will be important. Clearly, the economics and performance of applications are affected by a transition to general-purpose GPU computing. This will require new ideas and directions as well as recasting some older techniques to the new paradigm.
EPHAM 2009 invites papers in this emerging discipline which include, but are not limited to, the following areas of interest:
- Static and dynamic parallelization for hybrid CPU/GPU systems
- Compiler optimizations for GPU computing
- Language constructs and extensions to enable parallel programming with GPUs
- Run-time techniques to off-load computation to the GPU
- Language, programming model, or compiler techniques for mapping irregular computations to GPUs
- Debugging support for GPU programs
- Performance analysis tools related to GPU computing
- Other hardware-assisted methods for extracting and exploiting parallelism
Please find more information at the EPHAM 2009 workshop website.
December 23rd, 2008
The complete course notes from the “Parallel Computing for Graphics: Beyond Programmable Shading” SIGGRAPH Asia 2008 course are available online. The course gives an introduction to parallel programming architectures and environments for interactive graphics and explores case studies of combining traditional rendering API usage with advanced parallel computation from game developers, researchers, and graphics hardware vendors. There are strong indications that the future of interactive graphics involves a programming model more flexible than today’s OpenGL and Direct3D pipelines. As such, graphics developers need a basic understanding of how to combine emerging parallel programming techniques with the traditional interactive rendering pipeline. This course gives an introduction to several parallel graphics architectures and programming environments, and introduces the new types of graphics algorithms that will be possible. The case studies in the class discuss the mix of parallel programming constructs used, details of the graphics algorithms, and how the rendering pipeline and computation interact to achieve the technical goals. The course speakers are Jason Yang and Justin Hensley (AMD), Tim Foley (Intel), Mark Harris (NVIDIA), Kun Zhou (Zhejiang University), Anjul Patney (UC Davis), Pedro Sander (HKUST), and Christopher Oat (AMD). (Complete course notes.)
December 11th, 2008
DECEMBER 19, 2008 - NVIDIA has announced the availability of version 2.1 beta of its CUDA toolkit and SDK. This is the latest version of the C compiler and software development tools for accessing the massively parallel CUDA compute architecture of NVIDIA GPUs. In response to overwhelming demand from the developer community, this latest version of the CUDA software suite includes support for NVIDIA® Tesla™ GPUs on Windows Vista and 32-bit debugger support for CUDA on RedHat Enterprise Linux 5.x (separate download).
The CUDA Toolkit and SDK 2.1 beta includes support for Visual Studio 2008 on Windows XP and Vista and Just-In-Time (JIT) compilation for applications that dynamically generate CUDA kernels. Several new interoperability APIs have been added for Direct3D 9 and Direct3D 10 that accelerate communication with DirectX applications, along with a series of improvements to OpenGL interoperability.
CUDA Toolkit and SDK 2.1 beta also features support for using a GPU that is not driving a display on Vista, a beta of Linux Profiler 1.1 (separate download), and support for recent Linux releases including Fedora 9, OpenSUSE 11 and Ubuntu 8.04.
CUDA Toolkit and SDK 2.1 beta is available today for free download from www.nvidia.com/object/cuda_get.
December 11th, 2008
Equalizer Graphics have announced the release of Equalizer 0.6, a major advance in parallel OpenGL rendering. Equalizer is middleware for creating parallel OpenGL-based applications, including GPGPU applications. It enables applications to benefit from multiple graphics cards, processors and computers to scale rendering performance, visual quality and display size. Equalizer 0.6 adds support for automatic load-balancing for 2D and DB decompositions, DPlex (time-multiplex) compounds, and a Paracomp compositing backend. See the release notes on the Equalizer website for a comprehensive list of new features, enhancements, optimizations and bug fixes.
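As a rough illustration of what load-balancing a 2D (screen-space) decomposition involves, the sketch below splits a viewport's pixel columns among rendering channels in proportion to how fast each one rendered its last frame. This is not Equalizer's API or its actual algorithm: `splitViewport` and the frame-time inputs are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper, not part of Equalizer: divide `width` pixel columns
// among channels so that faster channels (smaller last frame time) receive
// proportionally wider strips.
std::vector<int> splitViewport(int width, const std::vector<double>& frameTimesMs)
{
    assert(width >= 0 && !frameTimesMs.empty());

    // Weight each channel by its speed, i.e. the inverse of its frame time.
    std::vector<double> speed;
    for (double t : frameTimesMs) {
        assert(t > 0.0);
        speed.push_back(1.0 / t);
    }
    const double total = std::accumulate(speed.begin(), speed.end(), 0.0);

    std::vector<int> columns(speed.size());
    int assigned = 0;
    for (std::size_t i = 0; i < speed.size(); ++i) {
        // The cast truncates, so a few columns may be left over.
        columns[i] = static_cast<int>(width * speed[i] / total);
        assigned += columns[i];
    }
    // Hand the leftover columns to the last channel so the strips
    // cover the viewport exactly.
    columns.back() += width - assigned;
    return columns;
}
```

Calling `splitViewport(1000, {10.0, 20.0})` gives the channel that took 10 ms roughly twice as many columns as the one that took 20 ms. A real load balancer would typically also smooth the measurements over several frames to avoid oscillation.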
December 11th, 2008
This paper aims at bridging the gap between the lack of synchronization mechanisms in recent graphics processor (GPU) architectures and the need for synchronization mechanisms in parallel applications. Based on the intrinsic features of recent GPU architectures, the authors construct strong synchronization objects like wait-free and t-resilient read-modify-write objects for a general model of recent GPU architectures without strong hardware synchronization primitives like test-and-set and compare-and-swap. Accesses to the new wait-free objects have time complexity O(N), where N is the number of concurrent processes. The wait-free objects have space complexity O(N^2), which is optimal. The result demonstrates that it is possible to construct wait-free synchronization mechanisms for GPUs without the need for strong synchronization primitives in hardware and that wait-free programming is possible for GPUs.
(Wait-free programming for general purpose computations on graphics processors. Phuong Hoai Ha, Philippas Tsigas, and Otto J. Anshus. ACM Symposium on Principles of Distributed Computing, 2008.)
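For context, the sketch below shows the kind of read-modify-write object that is easy to build when a hardware compare-and-swap is available, which is exactly the primitive the paper assumes is missing. It is not the authors' construction: `fetchAndApply` is a hypothetical name, and the code uses standard C++ `std::atomic` on a CPU rather than any GPU API. This retry loop is lock-free but not wait-free, since an individual thread can in principle keep losing the race.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: atomically replace the value in `cell` with
// f(old value) and return the old value.
template <typename F>
int fetchAndApply(std::atomic<int>& cell, F f)
{
    int observed = cell.load();
    // On failure, compare_exchange_weak writes the current value back into
    // `observed`, so each retry recomputes f on fresh data.
    while (!cell.compare_exchange_weak(observed, f(observed))) {
    }
    return observed;
}

// Usage sketch: double the stored value and get the previous one back.
inline void example()
{
    std::atomic<int> cell{5};
    const int previous = fetchAndApply(cell, [](int v) { return v * 2; });
    assert(previous == 5 && cell.load() == 10);
}
```

The paper's contribution is achieving comparable read-modify-write semantics, with a wait-free guarantee, on a model where no such compare-and-swap instruction exists.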
December 11th, 2008
The complete course notes from the “Beyond Programmable Shading” SIGGRAPH 2008 course are available online. The course gives an introduction to parallel programming architectures and environments for interactive graphics and explores case studies of combining traditional rendering API usage with advanced parallel computation from game developers, researchers, and graphics hardware vendors. There are strong indications that the future of interactive graphics involves a programming model more flexible than today’s OpenGL and Direct3D pipelines. As such, graphics developers need a basic understanding of how to combine emerging parallel programming techniques with the traditional interactive rendering pipeline. This course gives an introduction to several parallel graphics architectures and programming environments, and introduces the new types of graphics algorithms that will be possible. The case studies in the class discuss the mix of parallel programming constructs used, details of the graphics algorithms, and how the rendering pipeline and computation interact to achieve the technical goals. The course organizers are Aaron Lefohn (Intel) and Mike Houston (AMD). Additional course speakers include Kayvon Fatahalian (Stanford), David Luebke (NVIDIA), Tom Forsyth (Intel), John Owens (UC Davis), Chas Boyd (Microsoft), Aaftab Munshi (Apple), Fabio Pellacini (Dartmouth), Jon Olick (Id Software), Matt Pharr (Intel), and Jeremy Shopf (AMD). (Complete course notes)
As the computing power of various platforms intended for games and similar applications is increasing rapidly, they attract the interest of professionals in the HPC community. As an example, modern graphics processing units (GPUs) are often used for HPC through GPGPU techniques. Another example is the Cell Broadband Engine of the PlayStation 3 (PS3), whose multicore architecture lends itself to HPC. These platforms are not conventional HPC platforms; nonetheless they are used for HPC purposes, and even clusters of such computing resources are being built with great success. Both the computing power and the low cost compared to conventional HPC resources make them very interesting. The aim of this workshop is to focus on such unconventional resources for HPC. Only imagination sets the limit for the kinds of devices that can be used for HPC and even be combined to form clusters. (UCHPC ’09 Website, Call for Papers)