SGC Ruby CUDA 0.1.0 Release

May 4th, 2011

SGC Ruby CUDA has been heavily updated. It is now available from the standard Ruby Gems repository. Updates include:

  • Basic CUDA Driver and Runtime API support on CUDA 4.0rc2 with unit tests.
  • Object-Oriented API.
  • Exception classes for CUDA errors.
  • Support for Linux and Mac OS X platforms.
  • Documented with YARD.

See for more details.

GPU Linear Solvers for OpenFOAM

May 4th, 2011

ofgpu is a free GPL library from Symscape that provides GPU linear solvers for OpenFOAM®. The experimental library targets NVIDIA CUDA devices on Windows, Linux, and (untested) Mac OS X. It uses the Cusp library’s Krylov solvers to produce equivalent GPU (CUDA-based) versions of the standard OpenFOAM linear solvers:

  • PCG – Preconditioned conjugate gradient solver for symmetric matrices (e.g., p)
  • PBiCG – Preconditioned biconjugate gradient solver for asymmetric matrices (e.g., Ux, k)

ofgpu also has support for the OpenFOAM preconditioners:

  • no
  • diagonal

For more details see “GPU Linear Solver Library for OpenFOAM”. OpenFOAM is a registered trademark of OpenCFD and is unaffiliated with Symscape.

A memory efficient and fast sparse matrix vector product on a GPU

May 4th, 2011


This paper proposes a new sparse matrix storage format which allows an efficient implementation of a sparse matrix vector product on a Fermi Graphics Processing Unit (GPU). Unlike previous formats, it has both a low memory footprint and good throughput. The new format, which we call Sliced ELLR-T, has been designed specifically for accelerating the iterative solution of a large sparse and complex-valued system of linear equations arising in computational electromagnetics. Numerical tests have shown that the performance of the new implementation reaches 69 GFLOPS in complex single precision arithmetic. Compared to the optimized six-core Central Processing Unit (CPU) (Intel Xeon 5680), this performance implies a speedup by a factor of six. In terms of speed, the new format is as fast as the best format published so far, and at the same time it does not introduce redundant zero elements which have to be stored to ensure fast memory access. Compared to previously published solutions, significantly larger problems can be handled using low-cost commodity GPUs with a limited amount of on-board memory.

(A. Dziekonski, A. Lamecki, and M. Mrozowski: “A memory efficient and fast sparse matrix vector product on a GPU”, Progress In Electromagnetics Research, Vol. 116, 49-63, 2011. [PDF])
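
To make the memory-footprint argument concrete, the following CPU-only Python sketch compares a generic sliced ELLPACK layout with per-row lengths against plain ELLPACK. It is only an illustration of the general idea of slicing rows to limit padding; it is not the Sliced ELLR-T format or GPU kernel from the paper, this simple version still pads within each slice, and all names (`to_sliced_ell`, `spmv`, `stored_entries`) are hypothetical.

```python
def to_sliced_ell(rows, slice_height):
    """Pack rows into slices, each padded only to its own longest row.

    rows: one list of (column, value) pairs per matrix row.
    """
    slices = []
    for start in range(0, len(rows), slice_height):
        chunk = rows[start:start + slice_height]
        width = max(len(r) for r in chunk)
        cols = [[c for c, _ in r] + [0] * (width - len(r)) for r in chunk]
        vals = [[v for _, v in r] + [0.0] * (width - len(r)) for r in chunk]
        lengths = [len(r) for r in chunk]  # true row lengths, used to skip padding
        slices.append((start, width, cols, vals, lengths))
    return slices


def spmv(slices, x, n_rows):
    """Compute y = A*x from the sliced layout."""
    y = [0.0] * n_rows
    for start, _width, cols, vals, lengths in slices:
        for i, row_len in enumerate(lengths):
            acc = 0.0
            for k in range(row_len):  # padded entries are never read
                acc += vals[i][k] * x[cols[i][k]]
            y[start + i] = acc
    return y


def stored_entries(slices):
    """Number of value slots allocated, including padding."""
    return sum(width * len(lengths) for _, width, _, _, lengths in slices)


# A 4x4 example with one long row and three short ones.
rows = [
    [(0, 2.0)],
    [(0, 1.0), (1, 3.0), (2, 1.0), (3, 1.0)],
    [(2, 4.0)],
    [(3, 5.0)],
]
x = [1.0, 2.0, 3.0, 4.0]
sliced = to_sliced_ell(rows, slice_height=2)
plain = to_sliced_ell(rows, slice_height=4)  # one slice is plain ELLPACK
y = spmv(sliced, x, n_rows=4)
```

In this example the single long row forces plain ELLPACK to pad every row to four entries, while slicing confines that padding to the slice containing the long row.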

CfP: GPU and Hybrid Computing at PDP2012

May 4th, 2011

A special session on GPU and hybrid computing will be held in conjunction with PDP2012, the 20th Euromicro International Conference on Parallel, Distributed and Network-Based Computing, in February 2012 in Garching, Germany. Submissions are cordially invited on topics including, but not limited to, the following:

  • GPU computing, multi GPU processing, hybrid computing;
  • Programming models, programming frameworks, CUDA, OpenCL, communication libraries;
  • Mechanisms for mapping codes;
  • Task allocation;
  • Fault tolerance;
  • Performance analysis;
  • Applications: image processing, signal processing, linear algebra, numerical simulation, optimization;
  • Domains: computer science, electronics, embedded systems, telecommunication, medical imaging, finance.

More information including submission and publication details are available at

KGPU: enabling GPU computing in Linux kernel

May 4th, 2011

KGPU is a GPU computing framework for the Linux kernel. It allows the Linux kernel to directly execute CUDA programs running on GPUs. The motivation is to augment systems with GPUs so that, like user-space applications, the operating system itself can benefit from GPU acceleration. It can also offload computationally intensive work from the CPU by enabling the GPU as an extra computing device.

The current KGPU release includes a demo task with GPU augmentation: a GPU AES cipher based eCryptfs, which is an encrypted file system on Linux. The read/write bandwidths are expected to be accelerated by a factor of 1.7 to 2.5 on an NVIDIA GeForce GTX 480 GPU.

The source code can be obtained from, and news and release information can be found at

CfP: Innovative Parallel Computing (INPAR 2011)

April 13th, 2011

We are pleased to announce the 2011 Innovative Parallel Computing: Foundations & Applications of GPU, Manycore, and Heterogeneous Systems (InPar’11). This new conference provides a first-tier academic venue for peer-reviewed publications in the emerging fields of parallel computing, encompassing the topics of GPU computing, manycore computing, and heterogeneous computing.

InPar has a dual focus on “Foundations” (the fundamental advances in parallel computing itself) and “Applications” (case studies and lessons learned from the application of commodity parallel computing in domains across science and engineering). The goal of InPar is to bring together researchers in the myriad fields being revolutionized by GPUs to share experiences, discover commonalities, and both inform and learn from the computer scientists working on the foundations of parallel computing.

Topics: InPar encourages papers involving current GPU/manycore architectures, new or emerging commodity parallel architectures (such as Intel “MIC” products), and hybrid or heterogeneous systems.

GID2011 Submission deadline April 22nd

April 13th, 2011

The deadline for submissions to the “GPUs in Databases” GID2011 workshop has been extended [ed: again...] to April 22nd, 2011. The “GPUs in Databases” workshop is devoted to sharing knowledge related to applying GPUs in database environments and to discussing possible future development of this application domain. See our previous post for details.

CfP: High performance computational systems biology

April 13th, 2011

The High Performance Computational Systems Biology special session of CMSB 2011 establishes a forum to link researchers in the areas of parallel computing and computational systems biology. Experts from around the world will present their current work and discuss profound challenges, new ideas, results, applications, and their experience relating to key aspects of high performance computing in biology. Topics of interest include: workload partitioning strategies, parallel stochastic simulation, biological and numerical parallel computing, parallel and distributed architectures, general-purpose computation on graphics hardware, emerging processing architectures (Cell processors, FPGA, PlayStation3, etc.), parallel model checking techniques, parallel parameter estimation, parallel sensitivity analysis, parallel algorithms for biological network analysis, application of concurrency theory to biology, parallel visualization algorithms, web services and Internet computing for e-Science, Grid/Cloud/P2P/high performance computing for biology, multicore and cluster computing for biology, and tools and applications.

The call for papers is now open, please refer to for details.

Call for submissions: EAME

April 6th, 2011

Please consider submitting your work to the 2011 Emerging Applications and Many-core Architectures workshop, colocated with ISCA. The deadline for submissions is April 15th; the workshop takes place on June 4th in San Jose, California, US. For more details refer to the workshop page:

HOOMD-blue 0.9.2 release

April 6th, 2011

HOOMD-blue performs general-purpose particle dynamics simulations on a single workstation, taking advantage of NVIDIA GPUs to attain a level of performance equivalent to many cores on a fast cluster. Flexible and configurable, HOOMD-blue is currently being used for coarse-grained molecular dynamics simulations of nano-materials, glasses, and surfactants, dissipative particle dynamics simulations (DPD) of polymers, and crystallization of metals.

HOOMD-blue 0.9.2 adds many new features. Highlights include:

  • Long-ranged electrostatics via PPPM
  • Support for CUDA 3.2 and 4.0
  • New neighbor list option to exclude by particle diameter (for pair.slj)
  • New syntax to specify multiple pair coefficients at once
  • Improved documentation
  • Significant performance boosts for small simulations
  • RPM and .deb packaging for CentOS, Fedora, and Ubuntu
  • and more

HOOMD-blue 0.9.2 is available for download under an open source license. Check out the quick start tutorial to get started, or check out the full documentation to see everything it can do.
