<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>GPGPU &#187; Tag: Debugging :: GPGPU.org</title>
	<atom:link href="http://gpgpu.org/tag/debugging/feed" rel="self" type="application/rss+xml" />
	<link>http://gpgpu.org</link>
	<description>General-Purpose Computation on Graphics Hardware</description>
	<lastBuildDate>Tue, 22 May 2012 08:44:05 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Panoptes: A Binary Translation Framework for CUDA</title>
		<link>http://gpgpu.org/2012/05/22/panoptes-a-binary-translation-framework-for-cuda</link>
		<comments>http://gpgpu.org/2012/05/22/panoptes-a-binary-translation-framework-for-cuda#comments</comments>
		<pubDate>Tue, 22 May 2012 08:44:05 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Developer Resources]]></category>
		<category><![CDATA[Debugging]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Profiling]]></category>
		<category><![CDATA[Tools]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=4724</guid>
		<description><![CDATA[Traditional CPU-based computing environments offer a variety of binary instrumentation frameworks. Instrumentation and analysis tools for GPU environments to date have been more limited. Panoptes is a binary instrumentation framework for CUDA that targets the GPU. By exploiting the GPU to run modified kernels, computationally-intensive programs can be run at the native parallelism of the [...]]]></description>
			<content:encoded><![CDATA[<p>Traditional CPU-based computing environments offer a variety of binary instrumentation frameworks. Instrumentation and analysis tools for GPU environments to date have been more limited. <a title="source code" href="http://github.com/ckennelly/panoptes" target="_blank">Panoptes</a> is a binary instrumentation framework for CUDA that targets the GPU. By exploiting the GPU to run modified kernels, computationally-intensive programs can be run at the native parallelism of the device during analysis. To demonstrate its instrumentation capabilities, we currently implement a memory addressability and validity checker that targets CUDA programs.</p>
<p>Panoptes traces targeted programs by library interposition at runtime. Interactions with the GPU are intercepted, annotated as necessary, and then sent to the actual CUDA library for execution on the device. This approach gives an analysis tool built on Panoptes a complete view of the state of the GPU without additional developer effort. In contrast, developer-added instrumentation may be incomplete due to errors of omission, or may cause maintenance difficulties, particularly for large code bases.</p>
<p>By directing annotated instructions to the GPU for execution rather than relying on the host for emulation, Panoptes is able to analyze programs at scale. The rift in parallel execution capabilities between modern GPUs and CPUs carries into testing and debugging as well. For computationally intensive tasks brought to the GPU explicitly for its parallelism, resorting to host-based emulation may necessitate reduced or simplified inputs for analysis. More details: <a title="source code" href="http://github.com/ckennelly/panoptes" target="_blank">http://github.com/ckennelly/panoptes</a></p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2012/05/22/panoptes-a-binary-translation-framework-for-cuda/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CUDA 4.1 Released</title>
		<link>http://gpgpu.org/2012/01/26/cuda-4-1</link>
		<comments>http://gpgpu.org/2012/01/26/cuda-4-1#comments</comments>
		<pubDate>Fri, 27 Jan 2012 04:06:55 +0000</pubDate>
		<dc:creator>Mark Harris</dc:creator>
				<category><![CDATA[Developer Resources]]></category>
		<category><![CDATA[Compilers]]></category>
		<category><![CDATA[Debugging]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>
		<category><![CDATA[Profiling]]></category>
		<category><![CDATA[Programming Languages]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=4422</guid>
		<description><![CDATA[Today NVIDIA released CUDA 4.1, including a new CUDA Toolkit, SDK, Visual Profiler, Parallel Nsight IDE and NVIDIA device driver. CUDA 4.1 makes it easier to accelerate scientific research with GPUs with key features including a redesigned Visual Profiler with automated performance analysis and expert guidance; a new LLVM-based compiler that generates up to 10% faster [...]]]></description>
			<content:encoded><![CDATA[<p>Today NVIDIA released <a href="http://www.developer.nvidia.com/cuda-toolkit-41" target="_blank">CUDA 4.1</a>, including a new CUDA Toolkit, SDK, Visual Profiler, Parallel Nsight IDE and NVIDIA device driver.</p>
<p>CUDA 4.1 makes it easier to accelerate scientific research with GPUs. Key features include:</p>
<ul>
<li>a redesigned Visual Profiler with automated performance analysis and expert guidance;</li>
<li>a new LLVM-based compiler that generates up to 10% faster code; and</li>
<li>1000+ new imaging and signal processing functions in the NPP library.</li>
</ul>
<p>The CUSPARSE library included with CUDA 4.1 has a new tridiagonal solver and 2x faster sparse matrix-vector multiplication using the ELL hybrid format, and the CURAND library has two new random number generators. <span id="more-4422"></span> The CUDA 4.1 toolkit also brings some great improvements to its debugging and performance analysis tools.</p>
<p>Sign up for a webinar to learn more about all the new features &amp; high performance GPU-accelerated libraries!</p>
<p>CUDA Toolkit 4.1 Feature Overview Webinar</p>
<ul>
<li><a href="https://www2.gotomeeting.com/register/955690146" target="_blank">For Europe and The Americas: 10am (PST), Wednesday, Feb 1</a></li>
<li><a href="https://www2.gotomeeting.com/register/187844386" target="_blank">For Asia-Pacific and India: 10am (IST), Friday, Feb 3</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2012/01/26/cuda-4-1/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>GPU-Ocelot 2.0 Released</title>
		<link>http://gpgpu.org/2011/02/08/gpu-ocelot-2-0-released</link>
		<comments>http://gpgpu.org/2011/02/08/gpu-ocelot-2-0-released#comments</comments>
		<pubDate>Tue, 08 Feb 2011 21:53:17 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Developer Resources]]></category>
		<category><![CDATA[Debugging]]></category>
		<category><![CDATA[Hardware simulators]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=3238</guid>
		<description><![CDATA[Ocelot 2.0.969 brings CUDA 3.2 and Fermi support to a stable release. Ocelot is a BSD-licensed open source implementation of the CUDA runtime, a PTX emulator, and a mid-level PTX compiler. Here is a feature list for 2.0.969: PTX 2.2 and Fermi device support: Floating point results should be within the ULP limits in the [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://gpgpu.org/wp/wp-content/uploads/2011/02/ocelot.jpg"><img class="alignright size-thumbnail wp-image-3239" title="ocelot" src="http://gpgpu.org/wp/wp-content/uploads/2011/02/ocelot-150x150.jpg" alt="GPU-Ocelot Logo" width="150" height="150" /></a>Ocelot 2.0.969 brings CUDA 3.2 and Fermi support to a stable release. Ocelot is a BSD-licensed open source implementation of the CUDA runtime, a PTX emulator, and a mid-level PTX compiler.</p>
<p>Here is a feature list for 2.0.969:</p>
<ul>
<li><strong>PTX 2.2 and Fermi device support:</strong> Floating point results should be within the ULP limits in the PTX ISA manual. Over 500 unit tests verify that the behaviour matches NVIDIA devices.</li>
<li><strong>Four target device types:</strong> A functional PTX emulator. A PTX to LLVM to x86/ARM JIT. A PTX to CAL JIT for AMD devices (beta). A PTX to PTX JIT for NVIDIA devices.</li>
<li><strong>A full-featured PTX 2.2 IR:</strong> An analysis/optimization pass interface over PTX (Control flow graph, dataflow graph, dominator/postdominator trees, structured control tree). Optimizations can be plugged in as modules.</li>
<li><strong>Correctness checking tools:</strong> A memory checker (detects unaligned and out of bounds accesses). A race detector. An interactive debugger (allows stepping through PTX instructions).</li>
<li><strong>An instruction trace analyzer interface:</strong> Allows user-defined modules to receive callbacks when PTX instructions are executed. Can be used to compute metrics over applications or perform correctness checks.</li>
<li><strong>A CUDA API frontend:</strong> Existing CUDA programs can be directly linked against Ocelot. Device pointers can be shared across host threads. Multiple devices can be controlled from the same host thread (cudaSetDevice can be called multiple times).</li>
</ul>
<p>Ocelot is available under a BSD license at <a href="http://code.google.com/p/gpuocelot/" target="_blank">http://code.google.com/p/gpuocelot</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2011/02/08/gpu-ocelot-2-0-released/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>NVIDIA Parallel Nsight Now Shipping</title>
		<link>http://gpgpu.org/2010/07/21/nvidia-parallel-nsight-now-shipping</link>
		<comments>http://gpgpu.org/2010/07/21/nvidia-parallel-nsight-now-shipping#comments</comments>
		<pubDate>Wed, 21 Jul 2010 23:58:30 +0000</pubDate>
		<dc:creator>Mark Harris</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Developer Resources]]></category>
		<category><![CDATA[Debugging]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>
		<category><![CDATA[Profiling]]></category>
		<category><![CDATA[Tools]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=2606</guid>
		<description><![CDATA[NVIDIA today announced the release of NVIDIA Parallel Nsight software, the industry’s first development environment for GPU-accelerated applications that works with Microsoft Visual Studio. &#8220;By adding functionality specifically for GPU Computing developers, Parallel Nsight makes the power of the GPU more accessible than ever before,&#8221; said Sanford Russell, GM of GPU Computing at NVIDIA. NVIDIA [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://gpgpu.org/wp/wp-content/uploads/2010/07/parallel_insight_globe.jpg"><img class="alignright size-full wp-image-2609" title="parallel_insight_globe" src="http://gpgpu.org/wp/wp-content/uploads/2010/07/parallel_insight_globe.jpg" alt="" width="88" height="87" /></a>NVIDIA today announced the release of NVIDIA Parallel Nsight software, the industry’s first development environment for GPU-accelerated applications that works with Microsoft Visual Studio. &#8220;By adding functionality specifically for GPU Computing developers, Parallel Nsight makes the power of the GPU more accessible than ever before,&#8221; said Sanford Russell, GM of GPU Computing at NVIDIA. NVIDIA Parallel Nsight features a CUDA C/C++ debugger and application performance analyzer, and a graphics debugger and inspector. NVIDIA Parallel Nsight supports Windows HPC Server 2008, Windows 7 and Windows Vista. <a href="http://www.nvidia.com/object/parallel-nsight.html" target="_blank">Download Parallel Nsight here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2010/07/21/nvidia-parallel-nsight-now-shipping/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CUDA 3.0 toolkit released</title>
		<link>http://gpgpu.org/2010/03/20/cuda-3-0-toolkit-released</link>
		<comments>http://gpgpu.org/2010/03/20/cuda-3-0-toolkit-released#comments</comments>
		<pubDate>Sat, 20 Mar 2010 10:49:12 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Developer Resources]]></category>
		<category><![CDATA[APIs]]></category>
		<category><![CDATA[Debugging]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>
		<category><![CDATA[OpenCL]]></category>
		<category><![CDATA[Programming Languages]]></category>
		<category><![CDATA[Tools]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=2204</guid>
		<description><![CDATA[NVIDIA has released version 3.0 of the CUDA Toolkit, providing developers with tools to prepare for the upcoming Fermi-based GPUs. Highlights of this release include: Support for the new Fermi architecture, with: Native 64-bit GPU support Multiple Copy Engine support ECC reporting Concurrent Kernel Execution Fermi HW debugging support in cuda-gdb Fermi HW profiling support [...]]]></description>
			<content:encoded><![CDATA[<p>NVIDIA has released version 3.0 of the CUDA Toolkit, providing developers with tools to prepare for the upcoming Fermi-based GPUs. Highlights of this release include:</p>
<ul>
<li>Support for the new Fermi architecture, with:
<ul>
<li>Native 64-bit GPU support</li>
<li>Multiple Copy Engine support</li>
<li>ECC reporting</li>
<li>Concurrent Kernel Execution</li>
<li>Fermi HW debugging support in cuda-gdb</li>
<li>Fermi HW profiling support for CUDA C and OpenCL in Visual Profiler</li>
</ul>
</li>
<li>C++ Class Inheritance and Template Inheritance support for increased programmer productivity</li>
<li>A new unified interoperability API for Direct3D and OpenGL, with support for:
<ul>
<li>OpenGL texture interop</li>
<li>Direct3D 11 interop support</li>
<li>CUDA Driver / Runtime Buffer Interoperability, which allows applications using the CUDA Driver API to also use libraries implemented using the CUDA C Runtime such as CUFFT and CUBLAS.</li>
</ul>
</li>
<li><span id="more-2204"></span></li>
<li>CUBLAS now supports all BLAS1, 2, and 3 routines including those for single and double precision complex numbers</li>
<li>Up to 100x performance improvement while debugging applications with cuda-gdb</li>
<li>cuda-gdb hardware debugging support for applications that use the CUDA Driver API</li>
<li>cuda-gdb support for JIT-compiled kernels</li>
<li>New CUDA Memory Checker reports misalignment and out of bounds errors, available as a stand-alone utility and debugging mode within cuda-gdb</li>
<li>CUDA Toolkit libraries are now versioned, enabling applications to require a specific version, support multiple versions explicitly, etc.</li>
<li>CUDA C/C++ kernels are now compiled to standard ELF format</li>
<li>Support for device emulation mode has been packaged in a separate version of the CUDA C Runtime (CUDART), and is deprecated in this release. Now that more sophisticated hardware debugging tools are available and more are on the way, NVIDIA will be focusing on supporting these tools instead of the legacy device emulation functionality.
<ul>
<li>On Windows, use the new Parallel Nsight development environment for Visual Studio, with integrated GPU debugging and profiling tools (was code-named &#8220;Nexus&#8221;). Please see www.nvidia.com/nsight for details.</li>
<li>On Linux, use cuda-gdb and cuda-memcheck, and check out the solutions from Allinea and TotalView that will be available soon.</li>
</ul>
</li>
<li>Support for all the OpenCL features in the latest R195 production driver package:
<ul>
<li>Double Precision</li>
<li>Graphics Interoperability with OpenGL, Direct3D9, Direct3D10, and Direct3D11 for high performance visualization</li>
<li>Query for Compute Capability, so you can target optimizations for GPU architectures (cl_nv_device_attribute_query)</li>
<li>Ability to control compiler optimization settings via support for pragma unroll in OpenCL kernels and an extension that allows programmers to set compiler flags. (cl_nv_compiler_options)</li>
<li>OpenCL Images support, for better/faster image filtering</li>
<li>32-bit global and local atomics for fast, convenient data manipulation</li>
<li>Byte Addressable Stores, for faster video/image processing and compression algorithms</li>
<li>Support for the latest OpenCL spec revision 1.0.48 and latest official Khronos OpenCL headers as of 2010-02-17</li>
</ul>
</li>
</ul>
<p>The toolkit, drivers, tools and documentation are available from <a href="http://developer.nvidia.com/object/cuda_3_0_downloads.html" target="_blank">http://developer.nvidia.com/object/cuda_3_0_downloads.html</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2010/03/20/cuda-3-0-toolkit-released/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>gDebugger v5.5: AMD (ATI) GPU Performance Counters Integration</title>
		<link>http://gpgpu.org/2010/02/21/gdebugger-v5-5</link>
		<comments>http://gpgpu.org/2010/02/21/gdebugger-v5-5#comments</comments>
		<pubDate>Sun, 21 Feb 2010 23:02:30 +0000</pubDate>
		<dc:creator>Mark Harris</dc:creator>
				<category><![CDATA[Developer Resources]]></category>
		<category><![CDATA[AMD]]></category>
		<category><![CDATA[Debugging]]></category>
		<category><![CDATA[gDEBugger]]></category>
		<category><![CDATA[Tools]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=2151</guid>
		<description><![CDATA[Graphic Remedy is proud to announce the release of gDEBugger Version 5.5 for Windows, Linux, Mac OS X and iPhone. This version introduces a powerful AMD GPU performance counters integration, displaying AMD graphic hardware and driver performance counters in gDEBugger&#8217;s Performance Graph and Performance Dashboard views, allowing developers to optimize their application over AMD (ATI) [...]]]></description>
			<content:encoded><![CDATA[<p>Graphic Remedy is proud to announce the release of gDEBugger Version 5.5 for Windows, Linux, Mac OS X and iPhone.</p>
<p>This version introduces AMD GPU performance counter integration, which displays AMD graphics hardware and driver performance counters in gDEBugger&#8217;s Performance Graph and Performance Dashboard views so that developers can optimize their applications for AMD (ATI) graphics hardware.</p>
<p>AMD Performance counters are available on Windows, when using ATI Radeon (TM) HD 2000 series or newer with Catalyst (TM) 9.12 or newer.</p>
<p>This version also includes a large number of bug fixes and stability improvements.</p>
<p><span id="more-2151"></span></p>
<h4>Download:</h4>
<p>To download Version 5.5, use the links below:<br />
gDEBugger Windows: <a href="http://www.gremedy.com/download.php" target="_blank">http://www.gremedy.com/download.php</a><br />
gDEBugger ES: <a href="http://www.gremedy.com/downloadES.php" target="_blank">http://www.gremedy.com/downloadES.php</a><br />
gDEBugger Mac OS X: <a href="http://www.gremedy.com/downloadMac.php" target="_blank">http://www.gremedy.com/downloadMac.php</a><br />
gDEBugger iPhone: <a href="http://www.gremedy.com/downloadiPhone.php" target="_blank">http://www.gremedy.com/downloadiPhone.php</a><br />
gDEBugger Linux: <a href="http://www.gremedy.com/downloadLinux.php" target="_blank">http://www.gremedy.com/downloadLinux.php</a></p>
<p>A 7-day trial version is available to all users.</p>
<p>gDEBugger customers with a valid maintenance package can upgrade their existing products using their current license file free of charge. To update your maintenance plan, click here: <a href="http://www.gremedy.com/maintenance.php" target="_blank">http://www.gremedy.com/maintenance.php</a></p>
<h4>gDEBugger CL beta:</h4>
<p>gDEBugger for OpenCL is now entering its beta phase.<br />
This new product will bring gDEBugger&#8217;s debugging, profiling and memory analysis abilities to the OpenCL developer&#8217;s world.<br />
To join the free beta program and for more details, please visit: <a href="http://www.gremedy.com/gDEBuggerCL.php" target="_blank">http://www.gremedy.com/gDEBuggerCL.php</a></p>
<h4>What&#8217;s New in Version 5.5:</h4>
<ul>
<li>AMD GPU Performance Counters integration, displaying AMD (ATI) graphic hardware and driver performance counters inside gDEBugger&#8217;s Performance Graph and Performance Dashboard Views. Supported counters include:
<ul>
<li>Percentage of time GPU was busy</li>
<li>Percentage of GPU time spent performing depth and stencil tests</li>
<li>Average number of ALU instructions executed in the geometry shader</li>
<li>The number of primitives received by the hardware</li>
<li>The number of primitives passed into the geometry shader</li>
<li>The number of vertices output by the geometry shader</li>
<li>Percentage of texture fetches from a 2D texture</li>
<li>The percentage of GPU time ALU instructions are processed by the fragment shader</li>
<li>Percentage of GPU time the texture cache is stalled</li>
<li>The total number of texels fetched</li>
<li>Texture memory read in bytes</li>
<li>Texture cache miss rate (bytes/texel)</li>
<li>The number of vertices processed by the vertex shader</li>
<li>Percentage of GPU time the depth buffer spends waiting for the color buffer</li>
<li>Number of bytes written to the color buffer</li>
<li>And many other useful counters&#8230;</li>
</ul>
</li>
<li>Performance counter descriptions are now shown in the main frame&#8217;s Properties view</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2010/02/21/gdebugger-v5-5/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>gDEBugger for OpenCL &#8211; Beta Program</title>
		<link>http://gpgpu.org/2010/02/10/gdebugger-for-opencl-beta-program</link>
		<comments>http://gpgpu.org/2010/02/10/gdebugger-for-opencl-beta-program#comments</comments>
		<pubDate>Thu, 11 Feb 2010 03:34:23 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Developer Resources]]></category>
		<category><![CDATA[Debugging]]></category>
		<category><![CDATA[gDEBugger]]></category>
		<category><![CDATA[OpenCL]]></category>
		<category><![CDATA[Tools]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=2138</guid>
		<description><![CDATA[Graphic Remedy is proud to announce the upcoming release of gDEBugger for OpenCL on Windows, Mac OS X and Linux. This new product will bring gDEBugger&#8217;s advanced Debugging, Profiling and Memory Analysis abilities to the OpenCL developer&#8217;s world, helping OpenCL developers find bugs and optimize parallel computing application performance and memory consumption. To join the [...]]]></description>
			<content:encoded><![CDATA[<p>Graphic Remedy is proud to announce the upcoming release of gDEBugger for OpenCL on Windows, Mac OS X and Linux. This new product will bring gDEBugger&#8217;s advanced Debugging, Profiling and Memory Analysis abilities to the OpenCL developer&#8217;s world, helping OpenCL developers find bugs and optimize parallel computing application performance and memory consumption.</p>
<p>To join the free beta program, or to see screenshots and more details, please visit <a href="http://www.gremedy.com/gDEBuggerCL.php" target="_blank">http://www.gremedy.com/gDEBuggerCL.php</a>.</p>
<p>gDEBugger CL enables OpenCL developers to:</p>
<ul>
<li>Locate parallel computing performance bottlenecks</li>
<li>Edit and continue OpenCL kernels &#8220;on the fly&#8221;<br />
<span id="more-2138"></span></li>
<li>Locate and break on OpenCL errors, function calls, memory leaks and more</li>
<li>View the application&#8217;s OpenCL memory consumption</li>
<li>View OpenCL images and buffers data as an image or as &#8220;raw data&#8221;</li>
<li>View OpenCL command queue activities and timing measurements</li>
<li>Available on Windows, Mac OS X and Linux</li>
<li>And much more&#8230;</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2010/02/10/gdebugger-for-opencl-beta-program/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NVIDIA Introduces Nexus Integrated GPU/CPU Development Environment for Microsoft Visual Studio</title>
		<link>http://gpgpu.org/2009/10/04/nvidia-nexus-integrated-development-environment</link>
		<comments>http://gpgpu.org/2009/10/04/nvidia-nexus-integrated-development-environment#comments</comments>
		<pubDate>Sun, 04 Oct 2009 22:51:39 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Developer Resources]]></category>
		<category><![CDATA[Debugging]]></category>
		<category><![CDATA[NVIDIA]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Profiling]]></category>
		<category><![CDATA[Tools]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=1926</guid>
		<description><![CDATA[From the press release: NVIDIA Corp. today introduced NVIDIA® Nexus, the industry&#8217;s first development environment for massively parallel computing that is integrated into Microsoft Visual Studio, the world&#8217;s most popular development environment for Windows-based solutions and Web applications and services. &#8220;NVIDIA Nexus is going to improve programmer productivity immediately,&#8221; said Tarek El Dokor at Edge [...]]]></description>
			<content:encoded><![CDATA[<p>From the <a href="http://www.nvidia.com/object/pr_nexus_093009.html" target="_blank">press release</a>:</p>
<blockquote><p>NVIDIA Corp. today introduced NVIDIA® Nexus, the industry&#8217;s first development environment for massively parallel computing that is integrated into Microsoft Visual Studio, the world&#8217;s most popular development environment for Windows-based solutions and Web applications and services.</p>
<p>&#8220;NVIDIA Nexus is going to improve programmer productivity immediately,&#8221; said Tarek El Dokor at Edge 3 Technologies. &#8220;An integrated GPU and CPU development solution is something Edge 3 has needed for a long time. The fact that it&#8217;s integrated into the Visual Studio development environment drastically reduces the learning curve.&#8221;</p>
<p>NVIDIA Nexus radically improves productivity by enabling developers of GPU computing applications to use the popular Microsoft Visual Studio-based tools and workflow in a transparent manner, without having to create a separate version of the application that incorporates diagnostic software calls. NVIDIA Nexus also includes the ability to run the code remotely on a different computer. Nexus includes advanced tools for simultaneously analyzing efficiency, performance, and speed of both the graphics processing unit (GPU) and central processing unit (CPU) to give developers immediate insight into how co-processing affects their applications.</p>
<p>Nexus is composed of three components:</p>
<p><span id="more-1926"></span></p>
<ul>
<li>The Nexus Debugger is a source code debugger for GPU source code, such as CUDA C, HLSL and DirectCompute. It supports source breakpoints, data breakpoints and direct GPU memory inspection. All debugging is performed directly on the hardware.</li>
<li>The Nexus Analyzer is a system-wide performance tool for viewing GPU events (kernels, API calls, memory transfers) and CPU events (core allocation, threads and process events and waits), all on a single, correlated timeline.</li>
<li>The Nexus Graphics Inspector provides developers the ability to debug and profile frames rendered using APIs such as Direct3D. Developers can use the Graphics Inspector to scrub through draw calls and to look at any textures, vertex buffers, and API state in the entire frame.</li>
</ul>
<p>NVIDIA Nexus supports the Windows 7 and Windows Vista operating systems and integrates fully with Visual Studio (2008 SP1 Standard Edition or later).</p>
<p>A BETA version of NVIDIA Nexus is scheduled to be available on Oct. 15. For more information on NVIDIA Nexus or to register as a developer, please visit: <a href="http://www.nvidia.com/nexus" target="_blank">www.nvidia.com/nexus</a>. Both standard and professional versions of NVIDIA Nexus will be available upon final release.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2009/10/04/nvidia-nexus-integrated-development-environment/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>GPUocelot &#8211; A Binary Translator Framework for GPGPU</title>
		<link>http://gpgpu.org/2009/07/30/gpuocelot</link>
		<comments>http://gpgpu.org/2009/07/30/gpuocelot#comments</comments>
		<pubDate>Thu, 30 Jul 2009 07:06:48 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Developer Resources]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Debugging]]></category>
		<category><![CDATA[Libraries]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Papers]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=1761</guid>
		<description><![CDATA[Ocelot, developed at Georgia Tech, seeks to develop a set of tools that enable the low-level analysis of GPGPU applications as well as providing a JIT compiler for generic architectures. Ocelot currently provides an implementation of the NVIDIA CUDA runtime, capable of running the entire CUDA 2.2 and 2.1 SDKs. Ocelot features include a [...]]]></description>
			<content:encoded><![CDATA[<p>Ocelot, developed at Georgia Tech, seeks to develop a set of tools that enable the low-level analysis of GPGPU applications as well as providing a JIT compiler for generic architectures. Ocelot currently provides an implementation of the NVIDIA CUDA runtime, capable of running the entire CUDA 2.2 and 2.1 SDKs.</p>
<p>Ocelot features include a memory checker similar to valgrind, detection mechanisms for non-coalesced memory accesses, full device emulation, and a number of useful debugging and performance tuning features. The <a href="http://code.google.com/p/gpuocelot/wiki/Roadmap" target="_blank">Roadmap</a> lists future developments.</p>
<p>Ocelot is available at <a href="http://code.google.com/p/gpuocelot/" target="_blank">Google Code</a>, and a number of <a href="http://code.google.com/p/gpuocelot/wiki/References" target="_blank">papers</a> have been published.</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2009/07/30/gpuocelot/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NVIDIA CUDA Toolkit and SDK version 2.3 Released</title>
		<link>http://gpgpu.org/2009/07/22/cuda-2-3</link>
		<comments>http://gpgpu.org/2009/07/22/cuda-2-3#comments</comments>
		<pubDate>Thu, 23 Jul 2009 02:02:09 +0000</pubDate>
		<dc:creator>Mark Harris</dc:creator>
				<category><![CDATA[Developer Resources]]></category>
		<category><![CDATA[APIs]]></category>
		<category><![CDATA[Debugging]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>
		<category><![CDATA[Tools]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=1755</guid>
		<description><![CDATA[NVIDIA announced today it has released version 2.3 of the CUDA Toolkit and SDK for GPU Computing. This latest release supports several significant new features that deliver a major leap forward in getting the most performance out of NVIDIA’s massively parallel CUDA-enabled GPUs. This release of the CUDA Toolkit includes performance improvements and expanded support [...]]]></description>
			<content:encoded><![CDATA[<p>NVIDIA announced today it has released version 2.3 of the CUDA Toolkit and SDK for GPU Computing. This latest release supports several significant new features that deliver a major leap forward in getting the most performance out of NVIDIA’s massively parallel CUDA-enabled GPUs. This release of the CUDA Toolkit includes performance improvements and expanded support for the cuda-gdb hardware debugger.</p>
<p>Additional new features in CUDA Toolkit 2.3 include:</p>
<ul>
<li>The CUFFT Library now supports double-precision transforms and includes significant performance improvements for single-precision transforms as well.  See the CUDA Toolkit release notes for details.</li>
<li>The CUDA-GDB hardware debugger and CUDA Visual Profiler are now included in the CUDA Toolkit installer, and the CUDA-GDB debugger is now available for all supported Linux distros.  (see below)</li>
<li>Each GPU in an SLI group is now enumerated individually, so compute applications can now take advantage of multi-GPU performance even when SLI is enabled for graphics.</li>
<li>The 64-bit versions of the CUDA Toolkit now support compiling 32-bit applications. (See the release notes for details, including changes to LD_LIBRARY_PATH on Linux)</li>
<li>New support for fp16 &lt;-&gt; fp32 conversion intrinsics allows storage of data in fp16 format with computation in fp32.  Use of fp16 format is ideal for applications that require higher numerical range than 16-bit integer but less precision than fp32 and reduces memory space and bandwidth consumption.</li>
<li>The CUDA SDK has been updated to include:<span id="more-1755"></span>
<ul>
<li>A new pitchLinearTexture code sample that shows how to efficiently texture from pitch linear memory.</li>
<li>A new PTXJIT code sample illustrating how to use cuModuleLoadDataEx() to load PTX source from memory instead of loading a file.</li>
<li>Two new code samples for Windows, showing how to use the NVCUVID library to decode MPEG-2, VC-1, and H.264 content and pass frames to OpenGL or Direct3D for display.</li>
<li>Updated code samples showing how to properly align CUDA kernel function parameters so the same code works on both 32-bit and 64-bit systems.</li>
</ul>
</li>
<li>The Visual Profiler includes several enhancements:
<ul>
<li>All memory transfer API calls are now reported</li>
<li>Support for profiling multiple contexts per GPU</li>
<li>Synchronized clocks for requested start time on the CPU and start/end times on the GPU for all kernel launches and memory transfers</li>
<li>Global memory load and store efficiency metrics for GPUs with compute capability 1.2 and higher</li>
</ul>
</li>
<li>The CUDA Driver for MacOS is now packaged separately from the CUDA Toolkit.</li>
<li>Support for major Linux distros, MacOS X, and Windows:
<ul>
<li>MacOS X 10.5.6 and later (32-bit)</li>
<li>Windows XP/Vista/7 with Visual Studio 8 (VC2005 SP1) and 9 (VC2008)</li>
<li>Fedora 10, RHEL 4.7 &amp; 5.3, SLED 10.2 &amp; 11.0, OpenSUSE 11.1, and Ubuntu 8.10 &amp; 9.04</li>
</ul>
</li>
</ul>
<p>Developers can <a href="http://forums.nvidia.com/index.php?showtopic=102548" target="_blank">download the latest CUDA Toolkit, SDK, and drivers now</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2009/07/22/cuda-2-3/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

