HOOMD-blue 0.10.0 release

December 19th, 2011

HOOMD-blue performs general-purpose particle dynamics simulations on a single workstation, taking advantage of NVIDIA GPUs to attain a level of performance equivalent to many cores on a fast cluster. Flexible and configurable, HOOMD-blue is currently being used for coarse-grained molecular dynamics simulations of nano-materials, glasses, and surfactants, dissipative particle dynamics simulations (DPD) of polymers, and crystallization of metals.

HOOMD-blue 0.10.0 adds many new features. Highlights include: Read the rest of this entry »

Rigid body constraints realized in massively-parallel molecular dynamics on graphics processing units

August 20th, 2011

Abstract:

Molecular dynamics (MD) methods compute the trajectory of a system of point particles in response to a potential function by numerically integrating Newton’s equations of motion. Extending these basic methods with rigid body constraints enables composite particles with complex shapes such as anisotropic nanoparticles, grains, molecules, and rigid proteins to be modeled. Rigid body constraints are added to the GPU-accelerated MD package, HOOMD-blue, version 0.10.0. The software can now simulate systems of particles, rigid bodies, or mixed systems in microcanonical (NVE), canonical (NVT), and isothermalisobaric (NPT) ensembles. It can also apply the FIRE energy minimization technique to these systems. In this paper, we detail the massively parallel scheme that implements these algorithms and discuss how our design is tuned for the maximum possible performance. Two different case studies are included to demonstrate the performance attained, patchy spheres and tethered nanorods. In typical cases, HOOMD-blue on a single GTX 480 executes 2.5–3.6 times faster than LAMMPS executing the same simulation on any number of CPU cores in parallel. Simulations with rigid bodies may now be run with larger systems and for longer time scales on a single workstation than was previously even possible on large clusters.

(Trung Dac Nguyen, Carolyn L. Phillips, Joshua A. Anderson, and Sharon C. Glotzer: “Rigid body constraints realized in massively-parallel molecular dynamics on graphics processing units”, Computer Physics Communications 182(11):2307–2313, November 2011. [DOI])

Pseudo-random number generation for Brownian Dynamics and Dissipative Particle Dynamics simulations on GPU devices

August 20th, 2011

Abstract:

Brownian Dynamics (BD), also known as Langevin Dynamics, and Dissipative Particle Dynamics (DPD) are implicit solvent methods commonly used in models of soft matter and biomolecular systems. The interaction of the numerous solvent particles with larger particles is coarse-grained as a Langevin thermostat is applied to individual particles or to particle pairs. The Langevin thermostat requires a pseudo-random number generator (PRNG) to generate the stochastic force applied to each particle or pair of neighboring particles during each time step in the integration of Newton’s equations of motion. In a Single-Instruction-Multiple-Thread (SIMT) GPU parallel computing environment, small batches of random numbers must be generated over thousands of threads and millions of kernel calls. In this communication we introduce a one-PRNG-per-kernel-call-per-thread scheme, in which a micro-stream of pseudorandom numbers is generated in each thread and kernel call. These high quality, statistically robust micro-streams require no global memory for state storage, are more computationally efficient than other PRNG schemes in memory-bound kernels, and uniquely enable the DPD simulation method without requiring communication between threads.

(Carolyn L. Phillips, Joshua A. Anderson and Sharon C. Glotzer: “Dynamics and Dissipative Particle Dynamics simulations on GPU devices”, Journal of Computational Physics 230(19):7191-7201, August 2011. [DOI])

Scalable Fast Multipole Methods for Heterogeneous CPU-GPU architectures

August 17th, 2011

Abstract:

We fundamentally reconsider implementation of the Fast Multipole Method (FMM) on a computing node with a heterogeneous CPU-GPU architecture with multicore CPU(s) and one or more GPU accelerators, as well as on an interconnected cluster of such nodes. The FMM is a divide-and-conquer algorithm that performs a fast N-body sum using a spatial decomposition and is often used in a time-stepping or iterative loop. Using the observation that the local summation and the analysis-based translation parts of the FMM are independent, we map these respectively to the GPUs and CPUs. Careful analysis of the FMM is performed to distribute work optimally between the multicore CPUs and the GPU accelerators. We first develop a single node version where the CPU part is parallelized using OpenMP and the GPU version via CUDA. New parallel algorithms for creating FMM data structures are presented together with load balancing strategies for the single node and distributed multiple-node versions. Our 8 GPU performance
is comparable with performance of a 256 GPU version of the FMM that won the 2009 Bell prize.

(Qi Hu, Nail A. Gumerov and Ramani Duraswami: “Scalable fast multipole methods on distributed heterogeneous architectures”, accepted for SC’11. [PDF])

Molecular Dynamics Simulations Using Graphics Processing Units

August 15th, 2011

Abstract

It is increasingly easy to develop software that exploits Graphics Processing Units (GPUs). The molecular dynamics simulation community has embraced this recent opportunity. Herein, we outline the current approaches that exploit this technology. In the context of biomolecular simulations, we discuss some of the algorithms that have been implemented and some of the aspects that distinguish the GPU from previous parallel environments. The ubiquity of GPUs and the ingenuity of the simulation community augur well for the scale and scope of future computational studies of biomolecules.

(Baker, J. A. and Hirst, J. D.: “Molecular Dynamics Simulations Using Graphics Processing Units”. Molecular Informatics, 30:498–504. [DOI])

GPIUTMD 0.9.6 released

June 26th, 2011

GPIUTMD stands for Graphic Processors at Isfahan University of Technology for Many-particle Dynamics. It performs general-purpose many-particle dynamic simulations on a single workstation, taking advantage of NVIDIA GPUs to attain a level of performance equivalent to thousands of cores on a fast cluster. Flexible and configurable, GPIUTMD is currently being used for all atom and coarse-grained molecular dynamics simulations of nano-materials, glasses, and surfactants; dissipative particle dynamics simulations (DPD) of polymers; and crystallization of metals using EAM potentials. GPIUTMD 0.9.6 adds many new features. Highlights include:

  • Morse bond potential
  • Adding constant acceleration to a group of particles. (useful for modeling gravity effects)
  • Computes the full virial stress tensor (useful in mechanical characterization of materials)
  • Long-ranged electrostatics via PPPM
  • Support for CUDA 3.2
  • Theory manual
  • Up to twenty percent boost in simulations
  • and more

A demo version of GPIUTMD 0.9.6 will be available soon for download under an open source license. Check out the quick start tutorial to get started, or check out the full documentation to see everything it can do.

 

Fast analysis of molecular dynamics trajectories with graphics processing units—Radial distribution function histogramming

June 14th, 2011

Abstract:

The calculation of radial distribution functions (RDFs) from molecular dynamics trajectory data is a common and computationally expensive analysis task. The rate limiting step in the calculation of the RDF is building a histogram of the distance between atom pairs in each trajectory frame. Here we present an implementation of this histogramming scheme for multiple graphics processing units (GPUs). The algorithm features a tiling scheme to maximize the reuse of data at the fastest levels of the GPU’s memory hierarchy and dynamic load balancing to allow high performance on heterogeneous configurations of GPUs. Several versions of the RDF algorithm are presented, utilizing the specific hardware features found on different generations of GPUs. We take advantage of larger shared memory and atomic memory operations available on state-of-the-art GPUs to accelerate the code significantly. The use of atomic memory operations allows the fast, limited-capacity on-chip memory to be used much more efficiently, resulting in a fivefold increase in performance compared to the version of the algorithm without atomic operations. The ultimate version of the algorithm running in parallel on four NVIDIA GeForce GTX 480 (Fermi) GPUs was found to be 92 times faster than a multithreaded implementation running on an Intel Xeon 5550 CPU. On this multi-GPU hardware, the RDF between two selections of 1,000,000 atoms each can be calculated in 26.9 s per frame. The multi-GPU RDF algorithms described here are implemented in VMD, a widely used and freely available software package for molecular dynamics visualization and analysis.

(Benjamin G. Levine, John E. Stone, and Axel Kohlmeyer: “Fast Analysis of Molecular Dynamics Trajectories with Graphics Processing Units — Radial Distribution Function Histogramming”, Journal of Computational Physics, 230(9):3556-3569, 2011. [DOI: 10.1016/j.jcp.2011.01.048])

HOOMD-blue 0.9.2 release

April 6th, 2011

HOOMD-blue performs general-purpose particle dynamics simulations on a single workstation, taking advantage of NVIDIA GPUs to attain a level of performance equivalent to many cores on a fast cluster. Flexible and configurable, HOOMD-blue is currently being used for coarse-grained molecular dynamics simulations of nano-materials, glasses, and surfactants, dissipative particle dynamics simulations (DPD) of polymers, and crystallization of metals.

HOOMD-blue 0.9.2 adds many new features. Highlights include:

  • Long-ranged electrostatics via PPPM
  • Support for CUDA 3.2 and 4.0
  • New neighbor list option to exclude by particle diameter (for pair.slj)
  • New syntax to specify multiple pair coefficients at once
  • Improved documentation
  • Significant performance boosts for small simulations
  • RPM and .deb packaging for CentOS, Fedora, and Ubuntu
  • and more

HOOMD-blue 0.9.2 is available for download under an open source license. Check out the quick start tutorial to get started, or check out the full documentation to see everything it can do.

High Throughput Parallel Molecular Dynamics for GPUs

April 6th, 2011

The North Carolina Renaissance Computing Institute (RENCI) is running Amber PMEMD on the Open Science Grid, the high throughput computing (HTC) fabric used by the Large Hadron Collider (LHC). This approach is likely to be helpful to researchers with any of these challenges:

  1. Constrained by limited computing resources including access to GPGPUs
  2. Manually executing the same simulation repeatedly with different parameters
  3. Making simulations easier to understand, share, scale and re-use across compute resources

For more information see these two blog posts: High Throughput Parallel Molecular Dynamics and CUDA/Tesla Accelerated PMEMD on OSG. Contact Steve Cox (scox@renci.org) if you’d like to discuss further and determine if your application is a fit. If it is, RENCI can provide access to the grid as well as tools for executing and managing simulations.

Real-space calculation of powder diffraction patterns on graphics processing units.

March 29th, 2011

Abstract:

Diffraction, particularly of X-rays, is a powerful technique for the investigation of structure, microstructure and dynamical properties of matter. In order to link theoretical methods, like Molecular Dynamics and other atomistic approaches, and diffraction experiments we developed a new software for calculating the powder diffraction pattern of nano-sized objects on the GPUs. The software, soon to be made available under GPL license, allows the use of GPUs on different hosts for a direct (brute-force) computation of the Debye scattering equation.

(L Geliso, C. L. Azanza Ricardo, M. Leoni and P. Scardi: “Real-space calculation of powder diffraction patterns on graphics processing units”, Journal of Applied Crystallography 43:647-653, 2010. [DOI])

Page 1 of 3123