PyCOOL (Cosmological Object-Oriented Lattice code) is a fast, GPU-accelerated program that uses symplectic algorithms to solve the evolution of interacting scalar fields in an expanding universe. The program is designed to hit a sweet spot of speed, accuracy and user friendliness. It achieves this by combining the Python language with the PyCUDA interface, which makes the program very easy to adapt to different scalar field models. The program is publicly available under the GNU General Public License. See the PyCOOL website for more information.
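The symplectic time stepping mentioned in the PyCOOL description can be illustrated with a small CPU-only sketch. It is not PyCOOL's code: it assumes a single free scalar field with potential V = m^2 phi^2 / 2 on a 1-D periodic lattice, ignores the expansion of the universe, and uses a kick-drift-kick leapfrog scheme. All function names and parameter values are hypothetical.

```python
import math

def force(phi, dx, m):
    """d(pi)/dt = Laplacian(phi) - m^2 phi, on a periodic 1-D lattice."""
    n = len(phi)
    return [
        (phi[(i + 1) % n] - 2.0 * phi[i] + phi[i - 1]) / dx**2 - m**2 * phi[i]
        for i in range(n)
    ]

def leapfrog_step(phi, pi, dt, dx, m):
    """One kick-drift-kick step: symplectic, time reversible, second order."""
    f = force(phi, dx, m)
    pi = [p + 0.5 * dt * fi for p, fi in zip(pi, f)]    # half kick
    phi = [x + dt * p for x, p in zip(phi, pi)]         # full drift
    f = force(phi, dx, m)
    pi = [p + 0.5 * dt * fi for p, fi in zip(pi, f)]    # half kick
    return phi, pi

def energy(phi, pi, dx, m):
    """Discrete energy: kinetic + gradient + potential terms."""
    n = len(phi)
    total = 0.0
    for i in range(n):
        grad = (phi[(i + 1) % n] - phi[i]) / dx
        total += 0.5 * pi[i]**2 + 0.5 * grad**2 + 0.5 * m**2 * phi[i]**2
    return dx * total

def evolve(n_steps=1000, n=32, dx=0.1, dt=0.01, m=1.0):
    """Evolve a slightly perturbed homogeneous field; return (E_start, E_end)."""
    phi = [1.0 + 0.01 * math.sin(2.0 * math.pi * i / n) for i in range(n)]
    pi = [0.0] * n
    e0 = energy(phi, pi, dx, m)
    for _ in range(n_steps):
        phi, pi = leapfrog_step(phi, pi, dt, dx, m)
    return e0, energy(phi, pi, dx, m)
```

Calling `evolve()` returns the energy before and after the run. For a symplectic scheme the difference stays bounded rather than growing steadily, which is the usual motivation for choosing such integrators in long lattice simulations.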
PyCOOL: Python Cosmological Object-Oriented Lattice code
January 25th, 2012
Jacket v1.8 and LibJacket v1.1 released
July 24th, 2011
Jacket 1.8 and LibJacket 1.1 have been released by AccelerEyes, enabling GPU support for MATLAB and easier CUDA development with C/C++/Fortran and Python. New features include:
- Expanded support for the Signal Processing, Image Processing, and Statistics Libraries included with both Jacket and LibJacket
- Faster linear algebra for special systems (e.g. symmetric, positive definite, triangular, etc.)
- Enhanced visualizations
- New and updated examples: FDTD, Mandelbrot fractals, maximum-likelihood neural segmentation, MDS for genomics
- Built with CUDA 4.0 for peak performance
Visit http://www.accelereyes.com/ for details, downloads, whitepapers and tutorials.
PyCULA: Python Bindings for CULA GPGPU LAPACK
September 30th, 2010
PyCULA, by Louis Theran and Garrett Wright of Temple University, is a module providing transparent PyCUDA- and ctypes-based Python bindings for CULAtools LAPACK. It supports mixing PyCUDA-style kernel code with CULA device functions and also has a complete set of ctypes wrappers for CULA.
Key Features Include:
- Reduce memory leaks through automatic memory management (via PyCUDA).
- Use both the simple NumPy-style interface and the manual, GPUArray device-style interface.
- Mix LAPACK via CULA with your own custom kernels.
- Combine seamlessly with handy Python modules like SQL, gzip, SciPy, R, etc.
- Develop, debug, optimize, and get help right at the interactive command line.
The PyCULA 0.9a4 alpha release is available at http://pypi.python.org/pypi/PyCULA/0.9a4. PyCULA was developed as part of the ASU/Temple Zeolite Project, which is supported by CDI-I grant DMR 0835586 to Igor Rivin and M. M. J. Treacy.
CLyther 0.1 Beta Released
April 25th, 2010
GeoSpin has released the first version of CLyther for beta testing. Please visit the CLyther SourceForge website for more information. CLyther enables developers to write GPGPU code entirely in Python with no additional syntax. CLyther’s core driver contains a Python compiler that converts Python functions and types to OpenCL at runtime.
CLyther currently only supports a subset of the Python language definition but adds many new features to OpenCL such as:
- OpenCL interface similar to PyOpenCL
- Dynamic compilation of OpenCL code at runtime
- Fast prototyping of OpenCL code
- Create OpenCL code using the Python language definition
- Passing functions as arguments to OpenCL kernels
- Pure Python emulation mode of kernel functions
Compiling Python to a hybrid execution environment
April 12th, 2010
Abstract:
A new compilation framework enables the execution of numerical-intensive applications, written in Python, on a hybrid execution environment formed by a CPU and a GPU. This compiler automatically computes the set of memory locations that need to be transferred to the GPU, and produces the correct mapping between the CPU and the GPU address spaces. Thus, the programming model implements a virtual shared address space. This framework is implemented as a combination of unPython, an ahead-of-time compiler from Python/NumPy to the C++ programming language, and jit4GPU, a just-in-time compiler to the AMD CAL interface using CAL pixel shaders. Jit4GPU includes an optimizer that performs several loop transformations and reduces the number of texture instructions. Experimental evaluation was done on a Radeon 4850 and demonstrates that for some benchmarks the generated GPU code is 50 times faster than generated OpenMP code. The GPU performance also compares favorably with optimized CPU BLAS code for single-precision computations in most cases. Code transformations performed by Jit4GPU on GPU code were also shown to produce considerable speedup compared to unoptimized GPU code.
(Rahul Garg and José Nelson Amaral: “Compiling Python to a Hybrid Execution Environment”. Third Workshop on General-Purpose Computation on Graphics Processing Units, held in conjunction with ASPLOS XV, Pittsburgh, PA, March, 2010. [DOI])
CLyther = Python + OpenCL
March 9th, 2010
CLyther is a Python tool for OpenCL, currently under development, that plays a role similar to Cython for C. It is a Python language extension intended to make writing OpenCL code as easy as writing Python itself. CLyther currently supports only a subset of the Python language definition but adds many new features for OpenCL.
CLyther exposes both the OpenCL C library and language to Python. Its features include:
- Fast prototyping of OpenCL code.
- OpenCL kernel function creation using the Python language definition.
- Strong OOP programming in OpenCL code.
- Passing functions as arguments to kernel functions.
- Python emulation mode for OpenCL code.
- Fancy indexing of arrays.
- Dynamic compilation at runtime.
PyCUDA: GPU Run-Time Code Generation for High-Performance Computing
November 25th, 2009
Abstract:
High-performance scientific computing has recently seen a surge of interest in heterogeneous systems, with an emphasis on modern Graphics Processing Units (GPUs). These devices offer tremendous potential for performance and efficiency in important large-scale applications of computational science. However, exploiting this potential can be challenging, as one must adapt to the specialized and rapidly evolving computing environment currently exhibited by GPUs. One way of addressing this challenge is to embrace better techniques and develop tools tailored to their needs. This article presents one simple technique, GPU run-time code generation (RTCG), and PyCUDA, an open-source toolkit that supports this technique.
In introducing PyCUDA, this article proposes the combination of a dynamic, high-level scripting language with the massive performance of a GPU as a compelling two-tiered computing platform, potentially offering significant performance and productivity advantages over conventional single-tier, static systems. It is further observed that, compared to competing techniques, the effort required to create codes using run-time code generation with PyCUDA grows more gently in response to growing needs. The concept of RTCG is simple and easily implemented using existing, robust tools. Nonetheless it is powerful enough to support (and encourage) the creation of custom application-specific tools by its users. The premise of the paper is illustrated by a wide range of examples where the technique has been applied with considerable success.
(Andreas Klöckner, Nicolas Pinto, Yunsup Lee, Bryan Catanzaro, Paul Ivanov, Ahmed Fasih. PyCUDA: GPU Run-Time Code Generation for High-Performance Computing, submitted. http://arxiv.org/abs/0911.3456)
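The idea of run-time code generation described in the abstract can be sketched without a GPU: the host program builds kernel source text at run time, specialized to values that are only known then. The example below is a minimal illustration, not code from the paper. The kernel name `axpy`, the helper `generate_axpy_source`, and the choice of specializing on element type and unroll factor are all assumptions made for this sketch.

```python
from string import Template

# CUDA C template; the ${...} placeholders are filled in at run time.
KERNEL_TEMPLATE = Template("""
__global__ void axpy(${ctype} a, const ${ctype} *x, ${ctype} *y, int n)
{
    int base = (blockIdx.x * blockDim.x + threadIdx.x) * ${unroll};
${body}
}
""")

def generate_axpy_source(ctype="float", unroll=4):
    """Return CUDA C source specialized for an element type and unroll factor.

    Hypothetical helper: it only produces the source string and does not
    compile or launch anything.
    """
    if ctype not in ("float", "double"):
        raise ValueError("unsupported element type: %r" % ctype)
    if unroll < 1:
        raise ValueError("unroll must be positive")
    # Emit one bounds-checked update per unrolled element.
    lines = [
        "    if (base + %d < n) y[base + %d] += a * x[base + %d];" % (k, k, k)
        for k in range(unroll)
    ]
    return KERNEL_TEMPLATE.substitute(
        ctype=ctype, unroll=unroll, body="\n".join(lines)
    )
```

On a machine with PyCUDA and a CUDA-capable GPU, the returned string could presumably be passed to `pycuda.compiler.SourceModule` and the kernel looked up with `get_function("axpy")`. That step is left out here so the sketch runs anywhere. Generating several variants, for example with different unroll factors, and timing each one is one common way this technique is used for tuning.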