rCUDA™ 2.0 released

November 27th, 2010

A new major release of rCUDA™ (Remote CUDA), the Open Source package that lets applications issue CUDA calls to remote GPUs, is now available. The major improvements included in the new version are:

  • Updated API to 3.1
  • Server now uses Runtime API when possible (CUDA >= 3.1 required)
  • Introduced support for the most common CUBLAS routines
  • Fixed some bugs
  • Added AF_UNIX sockets support to enhance performance on local executions
  • Added some load balancing capabilities to the server
  • General performance improvements
  • Officially added Fermi support

Further information is available from the rCUDA™ webpages http://www.gap.upv.es/rCUDA and http://www.hpca.uji.es/rCUDA.
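To illustrate the AF_UNIX item above, here is a minimal sketch of a same-host round trip over a Unix domain socket, which avoids the TCP/IP stack when client and server share a machine. It is not rCUDA's wire protocol or API: the echo server, the function names, and the `demo.sock` path are all assumptions made for illustration.

```python
import os
import socket
import tempfile
import threading


def echo_once(server_sock):
    """Accept one connection and echo back whatever the client sends."""
    conn, _ = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def local_round_trip(payload: bytes) -> bytes:
    """Send a small payload to a same-host server over a Unix domain socket."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")  # hypothetical socket path
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)
            worker = threading.Thread(target=echo_once, args=(server,))
            worker.start()
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)
                client.sendall(payload)
                reply = client.recv(1024)
            worker.join()
    return reply
```

Swapping `socket.AF_UNIX` and the filesystem path for `socket.AF_INET` and a `(host, port)` pair would give the remote variant; the rest of the client code stays the same, which is presumably why a local fast path is cheap to add.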

MATLAB Adds GPU Support

October 13th, 2010

Michael Feldman of HPCWire writes:

MATLAB users with a taste for GPU computing now have a perfect reason to move up to the latest version. Release R2010b adds native GPGPU support that allows users to harness NVIDIA graphics processors for engineering and scientific computing. The new capability is provided within the Parallel Computing Toolbox and Distributed Computing Server.

Full details of MATLAB Release R2010b are available on the MathWorks site. Information on other numerical packages accelerated using NVIDIA CUDA is available on NVIDIA’s site.

[Editor's Note: as pointed out in the comments by John Melonakos (from AccelerEyes), it may be worth checking out how MATLAB R2010b GPU support currently compares to AccelerEyes Jacket.]

MOSIX Virtual OpenCL (VCL)

September 13th, 2010

The MOSIX group announces the availability of the first release of the MOSIX Virtual OpenCL (VCL) package, which allows OpenCL applications to transparently utilize many GPU devices in clusters. In the VCL run-time environment all the cluster devices are seen as if they are located in each hosting-node – applications need not be aware which nodes and devices are available and where the devices are located. As such, VCL benefits OpenCL applications that can use multiple devices concurrently.

VCL can be used to build powerful parallel GPU based clusters from low-cost multi-core hosting nodes that can utilize cluster-wide (CPU and GPU) resources transparently.


NVIDIA Parallel Nsight Now Shipping

July 21st, 2010

NVIDIA today announced the release of NVIDIA Parallel Nsight software, the industry’s first development environment for GPU-accelerated applications that work with Microsoft Visual Studio. “By adding functionality specifically for GPU Computing developers, Parallel Nsight makes the power of the GPU more accessible than ever before,” said Sanford Russell, GM of GPU Computing at NVIDIA. NVIDIA Parallel Nsight features a CUDA C/C++ debugger and application performance analyzer, and a graphics debugger and inspector. NVIDIA Parallel Nsight supports Windows HPC Server 2008, Windows 7 and Windows Vista. Download Parallel Nsight here.

OpenCurrent v1.1.0 released

June 18th, 2010

OpenCurrent version 1.1.0 has been released. OpenCurrent is a library for solving certain types of PDEs over 3D Cartesian grids. It supports single and double precision, and includes solvers for Poisson equations, diffusion, and incompressible Navier-Stokes.
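For readers unfamiliar with this class of problem, the sketch below solves a Poisson equation on a small 3D Cartesian grid with plain Jacobi relaxation on the standard 7-point stencil. It is a CPU-only illustration in pure Python, not OpenCurrent's API or its solver; the function name and argument layout are assumptions, and Jacobi is used here only because it is the simplest method to read.

```python
def jacobi_poisson_3d(f, u, h, sweeps):
    """Relax laplacian(u) = f on a cubic grid with fixed boundary values.

    f and u are nested lists indexed [i][j][k]; the outermost layer of u
    holds the Dirichlet boundary values and is never modified.
    h is the grid spacing and sweeps is the number of Jacobi iterations.
    """
    n = len(u)
    for _ in range(sweeps):
        # Jacobi reads only the previous iterate, so write into a copy.
        new = [[row[:] for row in plane] for plane in u]
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                for k in range(1, n - 1):
                    neighbours = (
                        u[i - 1][j][k] + u[i + 1][j][k]
                        + u[i][j - 1][k] + u[i][j + 1][k]
                        + u[i][j][k - 1] + u[i][j][k + 1]
                    )
                    # 7-point stencil: (neighbours - 6*u) / h^2 = f
                    new[i][j][k] = (neighbours - h * h * f[i][j][k]) / 6.0
        u = new
    return u
```

Every interior point is updated independently of the others within a sweep, which is the property that makes stencil solvers of this kind a natural fit for GPUs.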

New features:

  • Multi-GPU communication library
  • Multi-GPU versions of Multigrid solver, Incompressible Navier-Stokes solver, and more
  • NetCDF support now optional
  • Support for Fermi/CUDA 3.0
  • Numerous bug fixes and enhancements

Get it here: http://code.google.com/p/opencurrent/downloads/list

HOOMD-blue 0.9.0 released

May 20th, 2010

HOOMD-blue stands for Highly Optimized Object-oriented Many-particle Dynamics — Blue Edition. It performs general-purpose particle dynamics simulations on a single workstation, taking advantage of  NVIDIA GPUs to attain a level of performance equivalent to dozens of processor cores on a fast cluster.

HOOMD-blue 0.9.0 is a major new release. Highlights include:

  • Support for Fermi generation GPUs
  • Performance enhancements
  • New pair potentials
  • Particle data is now accessible from hoomd scripts
  • Binary format dump files for simulation restarts
  • Numerous small enhancements to enable easily restartable jobs
  • 2D simulations are now possible
  • Integration methods can now be applied to specified groups of particles
  • All IMD commands issued by VMD are now understood
  • and more

HOOMD-blue 0.9.0 is available for download under an open source license.

OpenCL Studio 1.0 beta released

April 5th, 2010

Geist Software Labs has released the first version of OpenCL Studio for beta testing. OpenCL Studio combines OpenCL and OpenGL into a single integrated development environment that allows you to visualize OpenCL computation using powerful 3D rendering techniques. The editor hides much of the complexity of the underlying APIs while still providing flexibility via the Lua scripting language. Integrated source code editors and debugging capabilities for OpenCL, GLSL, and Lua, as well as a toolbox of 2D user interface widgets provide a framework for a wide range of parallel programming solutions.

rCUDA 1.0 released

April 5th, 2010

The GAP (Universidad Politécnica de Valencia, Spain) and HPCA (Universidad Jaume I, Spain) research groups are proud to announce the public release of rCUDA 1.0. The rCUDA Framework enables the concurrent usage of CUDA-compatible devices remotely by employing the sockets API for communication between clients and servers. Thus, it can be useful in three different environments:

  • Clusters. To reduce the number of GPUs installed in High Performance Clusters. This leads to energy savings, as well as other related savings like acquisition costs, maintenance, space, cooling, etc.
  • Academia. In low performance networks, to offer access to a few high performance GPUs concurrently to all the students.
  • Virtual Machines. To enable the access to the CUDA facilities on the physical machine.

The current version of rCUDA (v1.0) implements all functions in the CUDA Runtime API version 2.3, excluding OpenGL and Direct3D interoperability. rCUDA 1.0 targets the Linux OS (for 32- and 64-bit architectures) on both client and server sides. The framework is free for any purpose under the terms and conditions of the GNU GPL/LGPL (where applicable) licenses.

For additional information, visit the rCUDA web page or Antonio Peña’s webpage.

SpeedIT Toolkit 0.9.1 released

March 26th, 2010

The SpeedIT Tools library provides a set of accelerated solvers for sparse linear systems of equations. Speedups of more than an order of magnitude are achieved by combining a single reasonably priced CUDA-capable NVIDIA Graphics Processing Unit (GPU) with proprietary advanced optimization techniques. The library can be used in a wide spectrum of domains arising from problems with underlying 2D and 3D geometry, such as computational fluid dynamics, electromagnetics, thermodynamics, materials, acoustics, computer vision and graphics, robotics, semiconductor devices and structural engineering. The library can also be used for problems without defined geometry, such as quantum chemistry, statistics, power networks and other graphs, and chemical process simulation. All computations are performed in single or double floating-point precision. Two versions of the SpeedIT Toolkit have been released: the classic version provides a conjugate gradient solver, and the extreme edition provides optimized CG, BiCGSTAB, a diagonal preconditioner, memory management, and heuristic-based analysis of input matrices.
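As a rough illustration of what a conjugate gradient solver with a diagonal preconditioner does, here is a minimal CPU sketch in pure Python over a matrix stored in CSR form. It is not SpeedIT's interface or implementation; every function name and the choice of CSR storage are assumptions, and the matrix is assumed symmetric positive definite with a nonzero diagonal.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR matrix by a dense vector."""
    return [
        sum(values[p] * x[col_idx[p]] for p in range(row_ptr[r], row_ptr[r + 1]))
        for r in range(len(row_ptr) - 1)
    ]


def csr_diagonal(values, col_idx, row_ptr):
    """Extract the main diagonal of a CSR matrix."""
    diag = [0.0] * (len(row_ptr) - 1)
    for r in range(len(row_ptr) - 1):
        for p in range(row_ptr[r], row_ptr[r + 1]):
            if col_idx[p] == r:
                diag[r] = values[p]
    return diag


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pcg(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b by CG with a diagonal (Jacobi) preconditioner."""
    n = len(b)
    inv_diag = [1.0 / d for d in csr_diagonal(values, col_idx, row_ptr)]
    x = [0.0] * n
    r = b[:]                                   # residual for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]  # preconditioned residual
    p = z[:]
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            break
        ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rz / dot(p, ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x
```

The sparse matrix-vector product dominates the cost of each iteration, so that is the step a GPU library would most plausibly target.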

Palix Technologies launches ANDSolver beta program

March 23rd, 2010

Palix Technologies has introduced a new Computational Fluid Dynamics (CFD) product called ANDSolver that has been designed from the ground up to use Graphics Processing Units (GPUs) for fast and efficient aerodynamic analysis. Although developing and running applications on multiple CPUs is a well-established practice for high-performance science and engineering simulations, a newer trend towards using GPUs for computation promises faster results with lower hardware acquisition and operating costs. ANDSolver delivers on that promise with up to a 10x speedup compared to a typical quad-core CPU. This level of performance is unusual in that it is achieved on unstructured meshes, which have traditionally not been considered amenable to GPUs because of their irregular memory access patterns. However, with an innovative algorithm design that maximizes the performance of the NVIDIA CUDA architecture, the ease and flexibility of unstructured meshing can now be used on high-performance, cost-effective GPUs.

A limited number of additional registrants will be accepted prior to our first production release in Q2 2010. More information on our current beta testing program can be found at http://www.palixtech.com.
