Hot-Rodding Windows and Linux App Performance with CUDA-Based Plugins

February 28th, 2012

This Dr. Dobb’s article by Rob Farber is a tutorial on creating application plugins that accelerate Windows and Linux application performance using CUDA in dynamically loaded libraries.

Adding GPU capabilities to existing Windows and Linux apps can be done simply using plugins and the built-in support found in CUDA. This easy form of dynamic loading enables CUDA to be used selectively to hugely accelerate individual tasks within a larger application.

CUDA is maturing into a natural extension of the emerging CPU/GPU paradigm of high-speed computing, making it, and GPU computing generally, a candidate for all application development. A recent article in this tutorial series, Running CUDA Code Natively on x86 Processors, noted recent developments that allow CUDA programs to transparently compile and run on x86 processors. This article focuses on incorporating CUDA into Windows and Linux workflows by exploiting the capabilities of the NVIDIA compiler driver, nvcc, to create native runtime-loadable plugins. Source code is provided to create and utilize CUDA plugins and even to dynamically compile and link a CUDA source file into a running application (just as OpenCL does).

Acceleware OpenCL™ Training in NYC

February 28th, 2012

Developed in partnership with AMD, this four-day course is designed for GPU programmers who are looking to develop comprehensive skills in writing and optimizing applications that fully leverage the multi-core processing capabilities of the GPU.

Delivered by Acceleware’s developers, who provide real-world experience and examples, the training comprises classroom lectures and hands-on tutorials. Each student will be supplied with a laptop equipped with an AMD Fusion APU for the duration of the course. Small class sizes maximize learning and ensure a personal educational experience.

SpeedIT 2.0 released

February 24th, 2012

SpeedIT 2.0 and the SpeedIT plugin to OpenFOAM have been released. New features include:

  • One of the fastest sparse matrix-vector multiplication (SpMV) implementations worldwide.
  • Faster Conjugate Gradient and BiConjugate Gradient solvers.
  • State-of-the-art CMRS format for storing sparse matrices. The format requires less memory than CRS or HYB (from CUSPARSE and CUSP).
  • Faster acceleration in OpenFOAM (Computational Fluid Dynamics).
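For readers unfamiliar with the terms in the list above, here is a minimal sketch of the baseline CRS (compressed row storage) layout, a sparse matrix-vector product over it, and an unpreconditioned conjugate gradient solver built on that product. It is plain Python for illustration only. The function names (`dense_to_crs`, `spmv`, `conjugate_gradient`) are hypothetical and are not part of the SpeedIT API, and the CMRS format itself is not shown because the announcement does not describe its layout.

```python
# Illustrative sketch only: CRS storage, SpMV, and conjugate gradient in
# plain Python. All names are hypothetical and unrelated to the SpeedIT API.

def dense_to_crs(dense):
    """Convert a dense matrix (list of rows) to CRS arrays.

    Returns (values, col_idx, row_ptr): the nonzero values row by row,
    their column indices, and for each row the offset of its first nonzero.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(float(v))
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def spmv(crs, x):
    """Compute y = A * x for a matrix A stored in CRS form."""
    values, col_idx, row_ptr = crs
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def conjugate_gradient(crs, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A in CRS form."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A*x, with x = 0 initially
    p = list(r)            # search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = spmv(crs, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Small symmetric positive definite example.
A = dense_to_crs([[4, 1, 0],
                  [1, 3, 0],
                  [0, 0, 2]])
b = [1.0, 2.0, 4.0]
x = conjugate_gradient(A, b)
```

Each conjugate gradient iteration is dominated by one SpMV, which is why the storage format and the SpMV kernel largely determine solver speed on a GPU.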

More information is available at http://speed-it.vratis.com.

Acceleware CUDA™ Training in Houston, TX – Oil & Gas Focused

February 21st, 2012

Partnering with NVIDIA and Microsoft, this four-day CUDA training course is designed for GPU Programmers in the oil-and-gas industry who are looking to develop comprehensive skills in writing and optimizing applications that fully leverage the many-core processing capabilities of the GPU.

Acceleware CUDA™ Training in Mountain View, CA

February 21st, 2012

Partnering with NVIDIA and Microsoft, this four-day CUDA training course is designed for GPU Programmers who are looking to develop comprehensive skills in writing and optimizing applications that fully leverage the many-core processing capabilities of the GPU.

Chai, a new managed platform for GPGPU

February 13th, 2012

Chai is a new managed platform for GPGPU. It is a free and open-source clean-room workalike of the PeakStream platform. While not production-ready, the just-released alpha version is able to compile and run non-trivial PeakStream demo code (e.g. conjugate gradient) on AMD and NVIDIA GPUs.

Chai combines an application virtual machine, garbage collection, an auto-tuning JIT compiler, and a high-level array programming language implemented as an embedded domain-specific language in C++. The JIT back-end uses expectation-maximization to auto-tune and generate vectorized OpenCL. The JIT includes auto-tuned model families for GEMM and GEMV. Although originally developed for AMD GPUs, these parameterized kernel families also generalize to NVIDIA GPUs.

OpenCL Studio 2.0 released

February 10th, 2012

OpenCL Studio integrates OpenCL and OpenGL into a single development environment for high-performance computing. The feature-rich editor, interactive scripting language and extensible plug-in architecture support the rapid development of complex parallel algorithms and accompanying visualizations. Version 2.0 now conforms to the Lua plug-in architecture and closely integrates the open-source libCL parallel algorithm library. A complete version of OpenCL Studio is freely available for download at www.opencldev.com, including instructional videos and technology showcases.

VMD 1.9.1 released

February 9th, 2012

VMD is a popular molecular visualization and analysis program used by thousands of researchers worldwide. VMD accelerates many of the most computationally demanding visualization and analysis features using GPU computing techniques, resulting in improved performance and new capabilities beyond what is possible using only conventional multi-core CPUs. VMD 1.9.1 advances these capabilities further with a CUDA implementation of the new QuickSurf molecular surface representation, enabling smooth interactive animation of moderate-sized biomolecular complexes consisting of a few hundred thousand to one million atoms, and allowing interactive display of molecular surfaces for static structures of very large complexes containing tens of millions of atoms, e.g. large virus capsids.

More information: http://www.ks.uiuc.edu/Research/vmd/vmd-1.9.1/

New CLOGS library with sort and scan primitives for OpenCL

February 5th, 2012

CLOGS is a library for higher-level operations on top of the OpenCL C++ API. It is designed to integrate with other OpenCL code, including synchronization using OpenCL events. Currently only two operations are supported: radix sorting and exclusive scan. Radix sort supports all the unsigned integral types as keys, and all the built-in scalar and vector types suitable for storage in buffers as values. Scan supports all the integral types. It also supports vector types, which allows for limited multi-scan capabilities.
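To make the two primitives concrete, here is a minimal sequential sketch of an exclusive scan and of a least-significant-digit radix sort on key/value pairs that uses the scan to turn per-digit counts into output offsets. It is plain Python for illustration and does not use or mirror the CLOGS API. The function names and the `key_bits` and `radix_bits` parameters are hypothetical, and a GPU implementation would parallelize each step rather than loop.

```python
# Illustrative sketch only: sequential exclusive scan and LSD radix sort.
# Names and parameters are hypothetical and are not the CLOGS API.

def exclusive_scan(xs):
    """Return a list where out[i] is the sum of xs[0..i-1] (out[0] is 0)."""
    out, total = [], 0
    for x in xs:
        out.append(total)
        total += x
    return out


def radix_sort_pairs(keys, values, key_bits=32, radix_bits=4):
    """Stable sort of (key, value) pairs by unsigned integer key."""
    mask = (1 << radix_bits) - 1
    for shift in range(0, key_bits, radix_bits):
        # Histogram of the current digit across all keys.
        counts = [0] * (1 << radix_bits)
        for k in keys:
            counts[(k >> shift) & mask] += 1
        # The exclusive scan gives the first output slot for each digit.
        offsets = exclusive_scan(counts)
        new_keys = [0] * len(keys)
        new_values = [None] * len(keys)
        # Scatter in input order, which keeps equal digits stable.
        for k, v in zip(keys, values):
            d = (k >> shift) & mask
            new_keys[offsets[d]] = k
            new_values[offsets[d]] = v
            offsets[d] += 1
        keys, values = new_keys, new_values
    return keys, values


scanned = exclusive_scan([3, 1, 4, 1, 5])
sorted_keys, sorted_values = radix_sort_pairs([3, 1, 3, 0], ["a", "b", "c", "d"])
```

The scan is the step that lets every element compute its destination independently, which is what makes the pair of primitives a natural fit for a GPU library.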

Version 1.0 of the library has just been released. The home page is http://clogs.sourceforge.net/

New GPU & HPC meetup group in Pune, India

February 1st, 2012

A new GPU and high-performance computing meetup group has been formed in Pune, India. The informal special interest group will bring together GPU users from all fields and experience levels in India, including academicians, researchers, scientists, device manufacturers, system integrators, service providers and all early adopters of HPC & GPU computing. The group, hosted on Meetup.com, will provide HPC and GPU computing enthusiasts in India a comprehensive platform to track industry trends and engage with each other, discussing the latest developments in the field.

The group will have a core group of key academicians to lead and moderate discussions. The site will feature a bank of research papers, case studies and posts on the latest GPU-related technological developments. The meetup group will also encourage users to engage and interact over group chats and web conferences. You can find the group at http://www.meetup.com/HPC-and-GPU-Computing-Group-India

 
