GPULib v1.2.2 released

November 25th, 2009

GPULib provides a library of mathematical functions that lets engineers, scientists, analysts, and other technical professionals use the high-performance computing resources of modern graphics processing units (GPUs) with minimal modification to their existing programs. The library executes vectorized mathematical functions on NVIDIA GPUs, bringing high-performance numerical operations to everyday desktop computers. By providing bindings for a number of Very High Level Languages (VHLLs), including MATLAB and IDL from ITT Visual Information Solutions, GPULib can accelerate new applications or be incorporated into existing applications with minimal effort. No knowledge of GPU programming or memory management is required. For more information regarding GPULib, please visit http://GPULib.txcorp.com.

NVIDIA Introduces Nexus Integrated GPU/CPU Development Environment for Microsoft Visual Studio

October 4th, 2009

From the press release:

NVIDIA Corp. today introduced NVIDIA® Nexus, the industry’s first development environment for massively parallel computing that is integrated into Microsoft Visual Studio, the world’s most popular development environment for Windows-based solutions and Web applications and services.

“NVIDIA Nexus is going to improve programmer productivity immediately,” said Tarek El Dokor at Edge 3 Technologies. “An integrated GPU and CPU development solution is something Edge 3 has needed for a long time. The fact that it’s integrated into the Visual Studio development environment drastically reduces the learning curve.”

NVIDIA Nexus radically improves productivity by enabling developers of GPU computing applications to use the popular Microsoft Visual Studio-based tools and workflow in a transparent manner, without having to create a separate version of the application that incorporates diagnostic software calls. NVIDIA Nexus also includes the ability to run the code remotely on a different computer. Nexus includes advanced tools for simultaneously analyzing efficiency, performance, and speed of both the graphics processing unit (GPU) and central processing unit (CPU) to give developers immediate insight into how co-processing affects their applications.

Nexus is composed of three components.

NVIDIA CUDA Toolkit and SDK version 2.3 Released

July 22nd, 2009

NVIDIA announced today it has released version 2.3 of the CUDA Toolkit and SDK for GPU Computing. This latest release supports several significant new features that deliver a major leap forward in getting the most performance out of NVIDIA’s massively parallel CUDA-enabled GPUs. This release of the CUDA Toolkit includes performance improvements and expanded support for the cuda-gdb hardware debugger.

Additional new features in CUDA Toolkit 2.3 include:

  • The CUFFT Library now supports double-precision transforms and includes significant performance improvements for single-precision transforms as well.  See the CUDA Toolkit release notes for details.
  • The CUDA-GDB hardware debugger and CUDA Visual Profiler are now included in the CUDA Toolkit installer, and the CUDA-GDB debugger is now available for all supported Linux distros.  (see below)
  • Each GPU in an SLI group is now enumerated individually, so compute applications can now take advantage of multi-GPU performance even when SLI is enabled for graphics.
  • The 64-bit versions of the CUDA Toolkit now support compiling 32-bit applications. (See the release notes for details, including changes to LD_LIBRARY_PATH on Linux)
  • New support for fp16 <-> fp32 conversion intrinsics allows storage of data in fp16 format with computation in fp32.  Use of fp16 format is ideal for applications that require higher numerical range than 16-bit integer but less precision than fp32 and reduces memory space and bandwidth consumption.
  • The CUDA SDK has also been updated.
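The fp16 storage idea from the list above can be tried on the host without a GPU. The following is a minimal Python sketch, not CUDA code. It assumes Python's standard struct module, whose "e" format is IEEE 754 half precision, as a stand-in for the CUDA conversion intrinsics, and ordinary Python floats stand in for fp32 during computation. The helper names to_fp16_bytes and from_fp16_bytes are hypothetical.

```python
import struct


def to_fp16_bytes(values):
    """Pack floats into IEEE 754 half precision (2 bytes each) for storage."""
    return struct.pack(f"<{len(values)}e", *values)


def from_fp16_bytes(buf):
    """Unpack half-precision storage back into floats for computation."""
    return list(struct.unpack(f"<{len(buf) // 2}e", buf))


# 40000.0 exceeds the 16-bit signed integer maximum (32767) but fits in fp16.
# 65504.0 is the largest finite fp16 value.
samples = [1.0, 0.1, 1000.5, 40000.0, 65504.0]

stored = to_fp16_bytes(samples)      # compact storage: 2 bytes per value
restored = from_fp16_bytes(stored)   # widened again before doing arithmetic

# Arithmetic happens on the widened values, not on the 16-bit encoding.
total = sum(restored)

# Values that are not exactly representable in fp16 (such as 0.1) come back
# slightly rounded, which is the precision cost of the smaller format.
rounding_error = abs(restored[1] - 0.1)
```

The storage buffer takes half the space of a 4-byte-per-value fp32 buffer, which is where the memory and bandwidth savings come from; the cost is the rounding visible in rounding_error.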

CUDA GPU Memtest

June 14th, 2009

CUDA GPU memtest is a memory test utility for NVIDIA GPU memory that uses well-established patterns from memtest86/memtest86+ as well as additional stress tests. The tests are designed to find hard and soft memory errors.
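To give a feel for what a pattern-based memory test does, here is a minimal host-side sketch in Python, written in the spirit of the moving-inversions test used by memtest86. It is not code from CUDA GPU memtest and does not touch GPU memory: the StuckBitMemory class is a hypothetical simulation of a faulty cell, and the function name moving_inversions is chosen here for illustration.

```python
class StuckBitMemory:
    """Hypothetical simulated memory in which one cell has bits stuck at 0."""

    def __init__(self, size, bad_index, stuck_mask):
        self._cells = bytearray(size)
        self._bad_index = bad_index
        self._stuck_mask = stuck_mask

    def __len__(self):
        return len(self._cells)

    def __getitem__(self, i):
        return self._cells[i]

    def __setitem__(self, i, value):
        if i == self._bad_index:
            # The faulty cell silently drops the stuck bits on every write.
            value &= ~self._stuck_mask & 0xFF
        self._cells[i] = value


def moving_inversions(memory, pattern=0x55):
    """Return the sorted indices whose contents differ from what was written."""
    inverse = pattern ^ 0xFF
    n = len(memory)
    bad = set()

    # Pass 1: fill every cell with the pattern.
    for i in range(n):
        memory[i] = pattern

    # Pass 2: walk upward, verify the pattern, then write its inverse.
    for i in range(n):
        if memory[i] != pattern:
            bad.add(i)
        memory[i] = inverse

    # Pass 3: walk downward, verify the inverse, then restore the pattern.
    for i in reversed(range(n)):
        if memory[i] != inverse:
            bad.add(i)
        memory[i] = pattern

    return sorted(bad)


healthy_report = moving_inversions(bytearray(64))
faulty_report = moving_inversions(StuckBitMemory(64, bad_index=10, stuck_mask=0x01))
```

Writing both a pattern and its bitwise inverse matters: a bit stuck at 0 is only visible when the test tries to store a 1 in that position, so a single pattern would miss roughly half of such faults.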

CUDA GPU memtest is developed by Guochun Shi and Jeremy Enos and is available via anonymous SVN from SourceForge.

MemtestG80: A Memory and Logic Tester for NVIDIA CUDA-enabled GPUs

May 25th, 2009

MemtestG80 is a software-based tester for “soft errors” in the memory or logic of NVIDIA CUDA-enabled GPUs. It uses a variety of proven test patterns (some custom and some based on Memtest86) to verify the correct operation of GPU memory and logic. It is a useful tool for checking that a given GPU does not produce “silent errors”, which may corrupt the results of a computation without triggering an overt error.

Precompiled binaries for Windows, Linux and Mac OS X, as well as the source code, are available for download under the LGPL. MemtestG80 is developed by Imran Haque and Vijay Pande.

GPUmat: GPU toolbox for MATLAB

May 25th, 2009

GPUmat, developed by the GP-You Group, allows MATLAB code to benefit from the compute power of modern GPUs. It is built on top of NVIDIA CUDA. The acceleration is transparent to the user: only the declaration of variables needs to be changed, using new GPU-specific keywords, and algorithms need not be changed. A wide range of standard MATLAB functions has been implemented. GPUmat is available as freeware for Windows and Linux from the GP-You download page.

Message Passing on GPUs and Data-Parallel Architectures

March 11th, 2009

Abstract:

This paper explores the challenges in implementing a message passing interface usable on systems with data-parallel processors. As a case study, we design and implement the “DCGN” API on NVIDIA GPUs that is similar to MPI and allows full access to the underlying architecture. We introduce the notion of data-parallel thread-groups as a way to map resources to MPI ranks. We use a method that also allows the data-parallel processors to run autonomously from user-written CPU code. In order to facilitate communication, we use a sleep-based polling system to store and retrieve messages. Unlike previous systems, our method provides both performance and flexibility. By running a test suite of applications with different communication requirements, we find that a tolerable amount of overhead is incurred, somewhere between one and five percent depending on the application, and indicate the locations where this overhead accumulates. We conclude that with innovations in chipsets and drivers, this overhead will be mitigated and provide similar performance to typical CPU based MPI implementations while providing fully-dynamic communication.

(Jeff A. Stuart and John D. Owens, Message Passing on Data-Parallel Architectures, Proceedings of the 23rd IEEE International Parallel and Distributed Processing Symposium)
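The sleep-based polling mentioned in the abstract can be illustrated with a small CPU-only sketch in Python. This is not the DCGN API and makes no claim about how the paper implements it: the Mailbox class, the poll function, and the one-millisecond interval are all assumptions chosen for illustration. A polling thread repeatedly checks a shared mailbox for posted messages and sleeps between checks instead of spinning.

```python
import threading
import time


class Mailbox:
    """Hypothetical shared buffer where senders post (rank, payload) pairs."""

    def __init__(self):
        self._lock = threading.Lock()
        self._slots = []

    def post(self, rank, payload):
        with self._lock:
            self._slots.append((rank, payload))

    def drain(self):
        """Remove and return every message posted so far."""
        with self._lock:
            items, self._slots = self._slots, []
        return items


def poll(mailbox, delivered, stop, interval=0.001):
    """Check the mailbox, then sleep, until asked to stop."""
    while True:
        delivered.extend(mailbox.drain())
        if stop.is_set():
            # Final check so that messages posted just before stop are kept.
            delivered.extend(mailbox.drain())
            return
        time.sleep(interval)  # sleep rather than busy-wait between checks


mailbox = Mailbox()
delivered = []
stop = threading.Event()

poller = threading.Thread(target=poll, args=(mailbox, delivered, stop))
poller.start()

for rank in range(4):
    mailbox.post(rank, f"hello from {rank}")

stop.set()
poller.join()
```

The sleep interval is the trade-off to watch in a design like this: a longer sleep leaves more processor time for other work but adds latency to every message, which is one plausible source of the kind of overhead the abstract reports.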

GPU Programming For The Rest Of Us

March 11th, 2009

This article by Jeff Layton at ClusterMonkey summarizes the history of GPU Computing in terms of high-level programming languages and abstractions, from the early days of GPGPU programming using graphics APIs, to Stream, CUDA and OpenCL. The second half of the article provides an introduction to the PGI 8.0 Technology Preview, which allows the use of pragmas to automatically parallelize and run compute-intensive kernels in standard C and Fortran code on accelerators like GPUs. (GPU Programming For the Rest Of Us, Jeff Layton, ClusterMonkey.net)

gDEBugger for Apple Mac OS X – Beta Program

January 22nd, 2009

Graphic Remedy is proud to announce the upcoming release of gDEBugger for Mac OS X. This new product brings all of gDEBugger’s Debugging and Profiling abilities to the Mac OpenGL developer’s world. Using gDEBugger Mac will help OS X OpenGL developers optimize their application performance: find graphics pipeline bottlenecks, improve application graphics memory consumption, locate and remove redundant OpenGL calls and graphics memory leaks, and much more. Visit the gDebuggerMac home page to join the Beta Program, see screenshots and get more details.

gDEBugger, an OpenGL and OpenGL ES debugger and profiler, traces application activity on top of the OpenGL API, and lets programmers see what is happening within the graphics system implementation to find bugs and optimize OpenGL application performance. gDEBugger runs on Windows, Linux and Mac OS X operating systems.

PGI x64+GPU Fortran & C99 Compilers

October 26th, 2008

The PGI 8.0 release from The Portland Group includes a technology preview of the PGI accelerator programming strategy. PGI 8.0 compilers accept new directives that allow users to select compute-intensive regions of Linux x64 Fortran and C99 programs and automatically offload them to an NVIDIA GPU. Until now HPC developers targeting GPU accelerators have had to rely on libraries or language extensions, and use of GPUs from Fortran has been extremely limited. Using the provisional support in PGI Release 8.0, programmers can accelerate Linux applications on x64+NVIDIA platforms by adding OpenMP-like compiler directives to existing high-level standard-compliant Fortran and C99 programs. At Supercomputing 2008 you can see the PGI x64+GPU compilers in action, and learn about PGI’s accelerator programming model and how you can use it to experiment with and embrace accelerated computing. You can also attend the PGI Vendor presentation by Michael Wolfe in room 19A/19B of the Austin Convention Center on Wednesday, November 19 from 10:30 to 11:00 AM. Also, check out “Compilers and More: Programming GPUs Today” on HPCWire.
