NVIDIA Parallel Nsight Now Shipping
July 21st, 2010
NVIDIA today announced the release of NVIDIA Parallel Nsight software, the industry’s first development environment for GPU-accelerated applications that works with Microsoft Visual Studio. “By adding functionality specifically for GPU Computing developers, Parallel Nsight makes the power of the GPU more accessible than ever before,” said Sanford Russell, GM of GPU Computing at NVIDIA. NVIDIA Parallel Nsight features a CUDA C/C++ debugger and application performance analyzer, and a graphics debugger and inspector. NVIDIA Parallel Nsight supports Windows HPC Server 2008, Windows 7 and Windows Vista. Download Parallel Nsight here.
CUDA 3.0 toolkit released
March 20th, 2010
NVIDIA has released version 3.0 of the CUDA Toolkit, providing developers with tools to prepare for the upcoming Fermi-based GPUs. Highlights of this release include:
- Support for the new Fermi architecture, with:
  - Native 64-bit GPU support
  - Multiple Copy Engine support
  - ECC reporting
  - Concurrent Kernel Execution
  - Fermi HW debugging support in cuda-gdb
  - Fermi HW profiling support for CUDA C and OpenCL in Visual Profiler
- C++ Class Inheritance and Template Inheritance support for increased programmer productivity
- A new unified interoperability API for Direct3D and OpenGL, with support for:
  - OpenGL texture interop
  - Direct3D 11 interop support
- CUDA Driver / Runtime Buffer Interoperability, which allows applications using the CUDA Driver API to also use libraries implemented using the CUDA C Runtime such as CUFFT and CUBLAS.
Read the rest of this entry »
gDEBugger v5.5: AMD (ATI) GPU Performance Counters Integration
February 21st, 2010
Graphic Remedy is proud to announce the release of gDEBugger Version 5.5 for Windows, Linux, Mac OS X and iPhone.
This version introduces powerful AMD GPU performance counter integration, displaying AMD graphics hardware and driver performance counters in gDEBugger’s Performance Graph and Performance Dashboard views so that developers can optimize their applications for AMD (ATI) graphics hardware.
AMD performance counters are available on Windows when using ATI Radeon (TM) HD 2000 series or newer hardware with Catalyst (TM) 9.12 or newer.
This version also includes a large number of bug fixes and stability improvements.
gDEBugger for OpenCL – Beta Program
February 10th, 2010
Graphic Remedy is proud to announce the upcoming release of gDEBugger for OpenCL on Windows, Mac OS X and Linux. This new product will bring gDEBugger’s advanced Debugging, Profiling and Memory Analysis abilities to the OpenCL developer’s world, helping OpenCL developers find bugs and optimize parallel computing application performance and memory consumption.
To join the free Beta Program, or to see screenshots and more details, please visit http://www.gremedy.com/gDEBuggerCL.php.
gDEBugger CL enables OpenCL developers to:
- Locate parallel computing performance bottlenecks
- Edit and continue OpenCL kernels “on the fly”
Read the rest of this entry »
NVIDIA Introduces Nexus Integrated GPU/CPU Development Environment for Microsoft Visual Studio
October 4th, 2009
From the press release:
NVIDIA Corp. today introduced NVIDIA® Nexus, the industry’s first development environment for massively parallel computing that is integrated into Microsoft Visual Studio, the world’s most popular development environment for Windows-based solutions and Web applications and services.
“NVIDIA Nexus is going to improve programmer productivity immediately,” said Tarek El Dokor at Edge 3 Technologies. “An integrated GPU and CPU development solution is something Edge 3 has needed for a long time. The fact that it’s integrated into the Visual Studio development environment drastically reduces the learning curve.”
NVIDIA Nexus radically improves productivity by enabling developers of GPU computing applications to use the popular Microsoft Visual Studio-based tools and workflow in a transparent manner, without having to create a separate version of the application that incorporates diagnostic software calls. NVIDIA Nexus also includes the ability to run the code remotely on a different computer. Nexus includes advanced tools for simultaneously analyzing efficiency, performance, and speed of both the graphics processing unit (GPU) and central processing unit (CPU) to give developers immediate insight into how co-processing affects their applications.
Nexus is composed of three components:
GPUocelot – A Binary Translator Framework for GPGPU
July 30th, 2009
Ocelot, developed at Georgia Tech, seeks to develop a set of tools that enable the low-level analysis of GPGPU applications, as well as providing a JIT compiler for generic architectures. Ocelot currently provides an implementation of the NVIDIA CUDA runtime, capable of running the entire CUDA 2.2 and 2.1 SDKs.
Ocelot features include a memory checker similar to Valgrind, detection mechanisms for non-coalesced memory accesses, full device emulation, and a number of useful debugging and performance tuning features. The Roadmap lists future developments.
Ocelot is available at Google Code, and a number of papers have been published.
NVIDIA CUDA Toolkit and SDK version 2.3 Released
July 22nd, 2009
NVIDIA announced today it has released version 2.3 of the CUDA Toolkit and SDK for GPU Computing. This latest release supports several significant new features that deliver a major leap forward in getting the most performance out of NVIDIA’s massively parallel CUDA-enabled GPUs. This release of the CUDA Toolkit includes performance improvements and expanded support for the cuda-gdb hardware debugger.
Additional new features in CUDA Toolkit 2.3 include:
- The CUFFT Library now supports double-precision transforms and includes significant performance improvements for single-precision transforms as well. See the CUDA Toolkit release notes for details.
- The CUDA-GDB hardware debugger and CUDA Visual Profiler are now included in the CUDA Toolkit installer, and the CUDA-GDB debugger is now available for all supported Linux distros. (see below)
- Each GPU in an SLI group is now enumerated individually, so compute applications can now take advantage of multi-GPU performance even when SLI is enabled for graphics.
- The 64-bit versions of the CUDA Toolkit now support compiling 32-bit applications. (See the release notes for details, including changes to LD_LIBRARY_PATH on Linux)
- New support for fp16 <-> fp32 conversion intrinsics allows storage of data in fp16 format with computation in fp32. Use of fp16 format is ideal for applications that require higher numerical range than 16-bit integer but less precision than fp32 and reduces memory space and bandwidth consumption.
- The CUDA SDK has been updated to include: Read the rest of this entry »
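The fp16 storage point in the list above can be illustrated without a GPU. The following is a minimal sketch in Python, not CUDA code and not the toolkit's intrinsics: it uses the standard library's half-precision `struct` format to show values being stored in 16 bits and widened again before arithmetic. The helper names (`to_fp16_bits`, `from_fp16_bits`, `scaled_sum`) and the sample values are hypothetical, and Python's 64-bit floats stand in for fp32 computation.

```python
import struct

INT16_MAX = 32767  # largest value a signed 16-bit integer can hold


def to_fp16_bits(value: float) -> int:
    """Round value to IEEE 754 half precision and return its 16 raw bits."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def from_fp16_bits(bits: int) -> float:
    """Widen 16 raw half-precision bits back to a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def scaled_sum(stored_bits, scale: float) -> float:
    """Widen each stored fp16 value first, then compute in wider precision."""
    return sum(from_fp16_bits(bits) * scale for bits in stored_bits)


# Hypothetical sample data: a small fraction, a mid-range value, and a value
# beyond the signed 16-bit integer range.
samples = [0.1, 1000.5, 60000.0]
stored = [to_fp16_bits(v) for v in samples]     # 2 bytes per value
restored = [from_fp16_bits(b) for b in stored]  # widened for computation

for original, value in zip(samples, restored):
    print(f"{original!r} stored as fp16 reads back as {value!r}")
```

In this sketch 60000.0 survives the round trip even though it exceeds `INT16_MAX`, while 0.1 comes back slightly changed, which is the trade-off the bullet describes: more range than a 16-bit integer, less precision than fp32, at half the storage of fp32.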
gDEBugger 5.2 adds vertex batch statistics and primitive counters
July 16th, 2009
Graphic Remedy is proud to announce the release of gDEBugger Version 5.2 for Windows, Mac OS X, iPhone and Linux. Version 5.2 adds a new Vertex Batch Statistics view to the gDEBugger Statistics viewer. OpenGL draw function calls are grouped by the number of vertices they push into the graphics pipeline, allowing the user to view and improve the ratio between API calls made and vertices drawn.
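The grouping idea behind a vertex batch view can be sketched roughly as follows. This is only an illustration in Python, not gDEBugger's implementation: the bucket limits, the function names and the sample frame are all hypothetical, and the tool's real ranges may differ.

```python
from collections import defaultdict

# Hypothetical inclusive upper bounds for the vertex-count buckets.
BUCKET_LIMITS = [10, 100, 1000]


def bucket_label(vertex_count: int) -> str:
    """Return the label of the first bucket whose limit covers vertex_count."""
    for limit in BUCKET_LIMITS:
        if vertex_count <= limit:
            return f"<={limit}"
    return f">{BUCKET_LIMITS[-1]}"


def vertex_batch_statistics(vertex_counts):
    """Group draw calls by how many vertices each one submits."""
    stats = defaultdict(lambda: {"calls": 0, "vertices": 0})
    for count in vertex_counts:
        entry = stats[bucket_label(count)]
        entry["calls"] += 1
        entry["vertices"] += count
    return dict(stats)


def vertices_per_call(vertex_counts) -> float:
    """Average number of vertices submitted per draw call."""
    counts = list(vertex_counts)
    return sum(counts) / len(counts) if counts else 0.0


# Hypothetical frame: one entry per draw call, holding its vertex count.
frame = [4, 4, 6, 36, 1200, 3000]
stats = vertex_batch_statistics(frame)
for label, entry in stats.items():
    print(label, entry)
print("vertices per call:", round(vertices_per_call(frame), 1))
```

A frame dominated by calls in the smallest bucket spends many API calls on few vertices, which is the ratio the view is meant to help improve.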
In addition, this new release introduces OpenGL primitive performance counters, displaying the total number of primitives and vertices drawn per frame, as well as a breakdown by primitive type (points, lines and triangles).
gDEBugger version 5.2 also includes a public beta release of gDEBugger iPhone.
gDEBugger for Apple Mac OS X launched at GDC 2009
March 31st, 2009
Graphic Remedy launched the first official version of gDEBugger Mac at this year’s Game Developers Conference, held in San Francisco, 23-27 March. On Tuesday March 24, gDEBugger Mac was demonstrated in the Khronos Developer University full-day tutorial area. A fully functional trial version of gDEBugger Mac is now available for download.
gDEBugger is an OpenGL Debugger and Profiler. It traces application activity on top of the OpenGL API and lets programmers see what is happening within the graphics system implementation to find bugs and optimize OpenGL application performance.
gDEBugger Mac brings all of gDEBugger’s Debugging and Profiling abilities to the Mac OS X OpenGL developer’s world. gDEBugger now runs on Windows, Mac OS X and Linux operating systems.
gDEBugger V4.5 Adds the ability to view Texture Mipmap levels and Texture Arrays
February 27th, 2009
The new gDEBugger V4.5 adds the ability to view texture MIP-map levels. Each texture MIP-map level’s parameters and data (as an image or raw data) can be displayed in the gDEBugger Textures and Buffers viewer. Browse the different MIP-map levels using the Texture MIP-map Level slider.
gDEBugger V4.5 also introduces support for 1D and 2D texture arrays. The new Textures and Buffers viewer Texture Layer slider enables viewing the contents of different texture layers. This version also introduces notable performance and stability improvements.
gDEBugger, an OpenGL and OpenGL ES debugger and profiler, traces application activity on top of the OpenGL API and lets programmers see what is happening within the graphics system implementation to find bugs and optimize OpenGL application performance. gDEBugger runs on Windows and Linux operating systems, and is currently in Beta phase on Mac OS X.