Rigid body constraints realized in massively-parallel molecular dynamics on graphics processing units

August 20th, 2011

Abstract:

Molecular dynamics (MD) methods compute the trajectory of a system of point particles in response to a potential function by numerically integrating Newton’s equations of motion. Extending these basic methods with rigid body constraints enables composite particles with complex shapes such as anisotropic nanoparticles, grains, molecules, and rigid proteins to be modeled. Rigid body constraints are added to the GPU-accelerated MD package, HOOMD-blue, version 0.10.0. The software can now simulate systems of particles, rigid bodies, or mixed systems in microcanonical (NVE), canonical (NVT), and isothermalisobaric (NPT) ensembles. It can also apply the FIRE energy minimization technique to these systems. In this paper, we detail the massively parallel scheme that implements these algorithms and discuss how our design is tuned for the maximum possible performance. Two different case studies are included to demonstrate the performance attained, patchy spheres and tethered nanorods. In typical cases, HOOMD-blue on a single GTX 480 executes 2.5–3.6 times faster than LAMMPS executing the same simulation on any number of CPU cores in parallel. Simulations with rigid bodies may now be run with larger systems and for longer time scales on a single workstation than was previously even possible on large clusters.

(Trung Dac Nguyen, Carolyn L. Phillips, Joshua A. Anderson, and Sharon C. Glotzer: “Rigid body constraints realized in massively-parallel molecular dynamics on graphics processing units”, Computer Physics Communications 182(11):2307–2313, November 2011. [DOI])

Solving ordinary differential equations with CUDA

August 8th, 2011

Odeint is a high level C++ library for solving ordinary differential equations. It is released under an open-source license and supports a variety of different methods for solving ODEs. As a special feature it supports different algebras which perform the basic mathematical operations. This allows the user to solve ordinary differential equations on modern graphic cards. A Thrust interface is implemented, so that the power of CUDA can easily be employed. Furthermore, arbitrary precision types can easily be supported.  Read the rest of this entry »

Intel announces a high-performance SPMD compiler for the CPU

June 26th, 2011

Intel has announced ispc, The Intel SPMD Program Compiler, now available in source and binary form from http://ispc.github.com.

ispc is a new compiler for “single program, multiple data” (SPMD) programs; the same model that is used for (GP)GPU programming, but here targeted to CPUs. ispc compiles a C-based SPMD programming language to run on the SIMD units of CPUs; it frequently provides a a 3x or more speedup on CPUs with 4-wide SSE units, without any of the difficulty of writing intrinsics code. There were a few principles and goals behind the design of ispc:

  • To build a small C-like language that would deliver excellent performance to performance-oriented programmers who want to run SPMD programs on the CPU.
  • To provide a thin abstraction layer between the programmer and the hardware—in particular, to have an execution and data model where the programmer can cleanly reason about the mapping of their source program to compiled assembly language and the underlying hardware.
  • To make it possible to harness the computational power of the SIMD vector units without the extremely low-programmer-productivity activity of directly writing intrinsics.
  • To explore opportunities from close coupling between C/C++ application code and SPMD ispc code running on the same processor—to have lightweight function calls between the two languages, to share data directly via pointers without copying or reformatting, and so forth.

ispc is an open source compiler with a BSD license. It uses the LLVM Compiler Infrastructure for back-end code generation and optimization and is hosted on github. It supports Windows, Mac, and Linux, with both x86 and x86-64 targets. It currently supports the SSE2 and SSE4 instruction sets, though support for AVX should be available soon.

GPIUTMD 0.9.6 released

June 26th, 2011

GPIUTMD stands for Graphic Processors at Isfahan University of Technology for Many-particle Dynamics. It performs general-purpose many-particle dynamic simulations on a single workstation, taking advantage of NVIDIA GPUs to attain a level of performance equivalent to thousands of cores on a fast cluster. Flexible and configurable, GPIUTMD is currently being used for all atom and coarse-grained molecular dynamics simulations of nano-materials, glasses, and surfactants; dissipative particle dynamics simulations (DPD) of polymers; and crystallization of metals using EAM potentials. GPIUTMD 0.9.6 adds many new features. Highlights include:

  • Morse bond potential
  • Adding constant acceleration to a group of particles. (useful for modeling gravity effects)
  • Computes the full virial stress tensor (useful in mechanical characterization of materials)
  • Long-ranged electrostatics via PPPM
  • Support for CUDA 3.2
  • Theory manual
  • Up to twenty percent boost in simulations
  • and more

A demo version of GPIUTMD 0.9.6 will be available soon for download under an open source license. Check out the quick start tutorial to get started, or check out the full documentation to see everything it can do.

 

OpenCL Parallel Primitives Library

June 3rd, 2011

clpp is an OpenCL library of data-parallel algorithm primitives such as parallel prefix sum (“scan”), parallel sort and parallel reduction. Primitives such as these are important building blocks for a wide variety of data-parallel algorithms, including sorting, stream compaction, and building data structures such as trees and summed-area tables. For more information, visit http://code.google.com/p/clpp.

Alenka – SQL for CUDA

May 11th, 2011

Alenka is a columnar SQL-like language for data processing on CUDA hardware. Alenka uses vector based processing to perform SQL operations like joins, groups and sorts. The program is capable of processing very large data sets that do not fit into GPU or host memory: such sets are partitioned into pieces and processed separately. Get it here: https://sourceforge.net/projects/alenka/files/

KGPU: enabling GPU computing in Linux kernel

May 4th, 2011

KGPU is a GPU computing framework for the Linux kernel. It allows the Linux kernel to directly execute CUDA programs running on GPUs. The motivation is to augment systems with GPUs so that like user-space applications, the operating system itself can benefit from the GPU acceleration. It can also offload computationally intensive work from the CPU by enabling the GPU as an extra computing device.

The current KGPU release includes a demo task with GPU augmentation: a GPU AES cipher based eCryptfs, which is an encrypted file system on Linux. The read /write bandwidths are expected to be accelerated by a factor of 1.7 ~ 2.5 on an NVIDIA GeForce GTX 480 GPU.

The source code can be obtained from https://github.com/wbsun/kgpu, and news and release information can be found at http://code.google.com/p/kgpu/.

HOOMD-blue 0.9.2 release

April 6th, 2011

HOOMD-blue performs general-purpose particle dynamics simulations on a single workstation, taking advantage of NVIDIA GPUs to attain a level of performance equivalent to many cores on a fast cluster. Flexible and configurable, HOOMD-blue is currently being used for coarse-grained molecular dynamics simulations of nano-materials, glasses, and surfactants, dissipative particle dynamics simulations (DPD) of polymers, and crystallization of metals.

HOOMD-blue 0.9.2 adds many new features. Highlights include:

  • Long-ranged electrostatics via PPPM
  • Support for CUDA 3.2 and 4.0
  • New neighbor list option to exclude by particle diameter (for pair.slj)
  • New syntax to specify multiple pair coefficients at once
  • Improved documentation
  • Significant performance boosts for small simulations
  • RPM and .deb packaging for CentOS, Fedora, and Ubuntu
  • and more

HOOMD-blue 0.9.2 is available for download under an open source license. Check out the quick start tutorial to get started, or check out the full documentation to see everything it can do.

High Throughput Parallel Molecular Dynamics for GPUs

April 6th, 2011

The North Carolina Renaissance Computing Institute (RENCI) is running Amber PMEMD on the Open Science Grid, the high throughput computing (HTC) fabric used by the Large Hadron Collider (LHC). This approach is likely to be helpful to researchers with any of these challenges:

  1. Constrained by limited computing resources including access to GPGPUs
  2. Manually executing the same simulation repeatedly with different parameters
  3. Making simulations easier to understand, share, scale and re-use across compute resources

For more information see these two blog posts: High Throughput Parallel Molecular Dynamics and CUDA/Tesla Accelerated PMEMD on OSG. Contact Steve Cox (scox@renci.org) if you’d like to discuss further and determine if your application is a fit. If it is, RENCI can provide access to the grid as well as tools for executing and managing simulations.

TRNG-A library for parallel Monte Carlo on NVIDIA graphics cards

January 23rd, 2011

Tina’s Random Number Generator Library (TRNG) version 4.11 has been released. TRNG is a state of the art open-source C++ pseudo-random number generator library for sequential and parallel Monte Carlo simulations. Its design principles are based on a proposal for an extensible random number generator facility that will be part of the forthcoming revision of the ISO C++ standard. The TRNG library features an object oriented design, is easy to use and has been speed optimized. Its implementation does not depend on any communication library or hardware architecture. TRNG is suited for shared memory as well as for distributed memory computers and may be used in various parallel programming environments, e.g. Message Passing Interface Standard or OpenMP. As an outstanding new feature of the latest TRNG release 4.11 it also supports CUDA. All generators that are implemented by TRNG have been subjected to thorough statistical tests in sequential and parallel setups. Download and further information: http://trng.berlios.de/

Page 2 of 612345...Last »