Brownian Dynamics (BD), also known as Langevin Dynamics, and Dissipative Particle Dynamics (DPD) are implicit solvent methods commonly used in models of soft matter and biomolecular systems. The interaction of the numerous solvent particles with larger particles is coarse-grained as a Langevin thermostat applied to individual particles or to particle pairs. The Langevin thermostat requires a pseudo-random number generator (PRNG) to generate the stochastic force applied to each particle or pair of neighboring particles during each time step in the integration of Newton’s equations of motion. In a Single-Instruction-Multiple-Thread (SIMT) GPU parallel computing environment, small batches of random numbers must be generated over thousands of threads and millions of kernel calls. In this communication we introduce a one-PRNG-per-kernel-call-per-thread scheme, in which a micro-stream of pseudo-random numbers is generated in each thread and kernel call. These high-quality, statistically robust micro-streams require no global memory for state storage, are more computationally efficient than other PRNG schemes in memory-bound kernels, and uniquely enable the DPD simulation method without requiring communication between threads.
(Carolyn L. Phillips, Joshua A. Anderson and Sharon C. Glotzer: “Pseudo-random number generation for Brownian Dynamics and Dissipative Particle Dynamics simulations on GPU devices”, Journal of Computational Physics 230(19):7191-7201, August 2011. [DOI])
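The micro-stream idea from the abstract above can be sketched in a few lines. This is a minimal host-side illustration, not the paper's generator: the SplitMix64 mixer, the `MicroStream` type, and the `pair_stream` helper are all assumptions made for the example. Each stream is seeded from particle indices, the time step, and a user seed, so no state has to be kept in global memory, and sorting the two indices lets both threads that handle a DPD pair derive the same numbers without communicating.

```cpp
#include <cassert>
#include <cstdint>

// SplitMix64 output function: a bijective 64-bit mixer. It is used here
// only for illustration and is NOT the generator from the paper.
inline std::uint64_t finalize(std::uint64_t z) {
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

// Hypothetical micro-stream. The state is a single 64-bit value that would
// live in a register; nothing is loaded from or stored to global memory.
struct MicroStream {
    std::uint64_t state;

    // Seed from two particle indices, the time step, and a user seed.
    MicroStream(std::uint32_t a, std::uint32_t b,
                std::uint32_t step, std::uint32_t seed) {
        const std::uint64_t key = (std::uint64_t(a) << 32) | b;
        const std::uint64_t ctr = (std::uint64_t(step) << 32) | seed;
        state = finalize(finalize(key) ^ ctr);
    }

    // Next 64 random bits (one SplitMix64 step).
    std::uint64_t next() {
        state += 0x9E3779B97F4A7C15ULL;
        return finalize(state);
    }

    // Next uniform double in [0, 1), built from the top 53 bits.
    double uniform() {
        return double(next() >> 11) * (1.0 / 9007199254740992.0);
    }
};

// DPD-style pair stream: ordering the indices makes the stream identical
// whether it is created by the thread that owns i or the thread that owns j.
inline MicroStream pair_stream(std::uint32_t i, std::uint32_t j,
                               std::uint32_t step, std::uint32_t seed) {
    assert(i != j);  // a pair needs two distinct particles
    return (i < j) ? MicroStream(i, j, step, seed)
                   : MicroStream(j, i, step, seed);
}
```

In a CUDA kernel these functions would presumably be marked `__host__ __device__`, with the time step passed as a kernel argument so that every kernel call starts fresh micro-streams.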
This new report covers all the performance improvements in the latest CUDA Toolkit 3.2 release, and compares CUDA parallel math library performance vs. commonly used CPU libraries.
Learn about the performance advantages of using the CUDA parallel math libraries for FFT, BLAS, sparse matrix operations, and random number generation.
Tina’s Random Number Generator Library (TRNG) version 4.11 has been released. TRNG is a state-of-the-art open-source C++ pseudo-random number generator library for sequential and parallel Monte Carlo simulations. Its design principles are based on a proposal for an extensible random number generator facility that will be part of the forthcoming revision of the ISO C++ standard. The TRNG library features an object-oriented design, is easy to use, and has been optimized for speed. Its implementation does not depend on any communication library or hardware architecture. TRNG is suited for shared-memory as well as distributed-memory computers and may be used in various parallel programming environments, e.g. the Message Passing Interface (MPI) standard or OpenMP. As a notable new feature, TRNG release 4.11 also supports CUDA. All generators implemented by TRNG have been subjected to thorough statistical tests in sequential and parallel setups. Download and further information: http://trng.berlios.de/
Random numbers are extensively used on the GPU. As more computation is ported to the GPU, it can no longer be treated as rendering hardware alone. Random number generators (RNGs) are expected to cater to general-purpose and graphics applications alike. Such diversity adds to the expected requirements of an RNG. A good GPU RNG should be able to provide repeatability, random access, multiple independent streams, speed, and random numbers free from detectable statistical bias. A specific application may require some if not all of the above characteristics at one time. In particular, we hypothesize that not all algorithms need the highest-quality random numbers, so a good GPU RNG should provide a speed-quality tradeoff that can be tuned for fast low-quality or slower high-quality random numbers.
We propose that the Tiny Encryption Algorithm satisfies all of the requirements of a good GPU pseudo-random number generator. We compare our technique against previous approaches, and present an evaluation using standard randomness test suites as well as Perlin noise and a Monte Carlo shadow algorithm. We show that the quality of random number generation directly affects the quality of the noise produced; however, good-quality noise can still be produced with a lower-quality random number generator.
(Fahad Zafar, Aaron Curtis and Marc Olano, “GPU Random Numbers via the Tiny Encryption Algorithm”, HPG 2010: Proceedings of the ACM SIGGRAPH/Eurographics Symposium on High Performance Graphics, Saarbrücken, Germany, June 2010. Link to preprint.)
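As a rough illustration of how a block cipher can serve as a random-access generator, the sketch below implements the standard TEA encryption loop and uses it to turn a (stream, index) pair into a number. The `Block` type, the `tea_uniform` wrapper, and the choice of the cycle count as the speed-quality knob are assumptions made for this example and are not taken from the paper's code.

```cpp
#include <cassert>
#include <cstdint>

struct Block {
    std::uint32_t v0, v1;
};

// TEA encryption of one 64-bit block with a 128-bit key. One loop iteration
// is one TEA cycle (two Feistel rounds); the reference cipher uses 32 cycles.
// Fewer cycles run faster but mix the input less thoroughly.
inline Block tea(Block v, const std::uint32_t key[4], int cycles) {
    assert(cycles > 0);
    const std::uint32_t delta = 0x9E3779B9u;
    std::uint32_t sum = 0;
    for (int i = 0; i < cycles; ++i) {
        sum += delta;
        v.v0 += ((v.v1 << 4) + key[0]) ^ (v.v1 + sum) ^ ((v.v1 >> 5) + key[1]);
        v.v1 += ((v.v0 << 4) + key[2]) ^ (v.v0 + sum) ^ ((v.v0 >> 5) + key[3]);
    }
    return v;
}

// Hypothetical wrapper: encrypt (stream, index) and map the first output
// word to a double in [0, 1). No state is carried between calls, so any
// element of any stream can be computed directly and repeatably.
inline double tea_uniform(std::uint32_t stream, std::uint32_t index,
                          const std::uint32_t key[4], int cycles) {
    const Block out = tea(Block{stream, index}, key, cycles);
    return out.v0 / 4294967296.0;
}
```

Because the output depends only on the inputs and the key, this style of generator offers repeatability, random access, and multiple streams without per-thread state.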
Basic uniform pseudo-random number generators are implemented on ATI Graphics Processing Units (GPUs). The performance of the implemented generators (multiplicative linear congruential (GGL), XOR-shift (XOR128), RANECU, RANMAR, RANLUX and Mersenne Twister (MT19937)) on CPU and GPU is discussed. The speed-up obtained on the GPU is a factor of hundreds relative to the CPU. The RANLUX generator is found to be the most appropriate for use on GPUs in Monte Carlo simulations. A brief review of the pseudo-random number generators used in modern software packages for Monte Carlo simulations in high-energy physics is presented.
(Vadim Demchik, “Pseudo-random number generators for Monte Carlo simulations on Graphics Processing Units”, Mar. 2010, arXiv:1003.1898 [hep-lat])
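For readers unfamiliar with the two simplest generators in that list, here is a minimal host-side sketch. It assumes that GGL denotes the multiplicative congruential generator with multiplier 16807 and modulus 2^31 - 1, and that XOR128 denotes Marsaglia's xor128 with shifts (11, 19, 8); the struct names are hypothetical and the paper's GPU implementations may differ.

```cpp
#include <cassert>
#include <cstdint>

// GGL-style multiplicative LCG, assuming the common parameters
// x_{n+1} = 16807 * x_n mod (2^31 - 1).
struct Ggl {
    std::uint32_t x;

    explicit Ggl(std::uint32_t seed) : x(seed) {
        assert(seed > 0 && seed < 2147483647u);  // state must stay nonzero
    }

    std::uint32_t next() {
        // 64-bit intermediate avoids overflow in the product.
        x = std::uint32_t((std::uint64_t(x) * 16807u) % 2147483647u);
        return x;
    }
};

// Marsaglia's xor128 xorshift generator: four 32-bit words of state.
struct Xor128 {
    std::uint32_t x, y, z, w;

    Xor128(std::uint32_t x0, std::uint32_t y0,
           std::uint32_t z0, std::uint32_t w0)
        : x(x0), y(y0), z(z0), w(w0) {
        assert((x | y | z | w) != 0);  // the all-zero state is a fixed point
    }

    std::uint32_t next() {
        const std::uint32_t t = x ^ (x << 11);
        x = y;
        y = z;
        z = w;
        w = w ^ (w >> 19) ^ (t ^ (t >> 8));
        return w;
    }
};
```

Both generators carry their full state in a few integers, which is why one instance per thread is practical; each thread then needs its own seed or state to avoid producing identical sequences.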
MTGP is a new variant of the Mersenne Twister (MT) pseudo-random number generator introduced by Mutsuo Saito and Makoto Matsumoto in 2009. MTGP is designed to take advantage of some features of GPUs, such as parallel execution and high-speed constant memory access. It supports 32-bit and 64-bit integers, as well as single- and double-precision floating point, as output.
MTGP v1.0 is available now.