NVIDIA Parallel Nsight Now Shipping

July 21st, 2010

NVIDIA today announced the release of NVIDIA Parallel Nsight software, the industry’s first development environment for GPU-accelerated applications that works with Microsoft Visual Studio. “By adding functionality specifically for GPU Computing developers, Parallel Nsight makes the power of the GPU more accessible than ever before,” said Sanford Russell, GM of GPU Computing at NVIDIA. NVIDIA Parallel Nsight features a CUDA C/C++ debugger and application performance analyzer, as well as a graphics debugger and inspector. It supports Windows HPC Server 2008, Windows 7 and Windows Vista. Download Parallel Nsight here.

CULA 2.0 released

July 11th, 2010

EM Photonics announced today the general availability of CULA 2.0, its GPU-accelerated linear algebra library. The new version provides support for NVIDIA GPUs based on the latest “Fermi” architecture.

CULA provides a LAPACK interface comprising over 150 mathematical routines from LAPACK, the industry standard for computational linear algebra. The library includes many popular routines, among them system solvers, least squares solvers, orthogonal factorizations, eigenvalue routines, and singular value decompositions. CULA offers performance up to an order of magnitude faster than highly optimized CPU-based linear algebra solvers. A variety of interfaces is available for integrating it directly into existing code: programmers can call GPU-accelerated CULA from their C/C++, FORTRAN, MATLAB, or Python codes, with no GPU programming experience required. CULA is available for every system equipped with GPUs based on the NVIDIA CUDA architecture, including 32- and 64-bit versions of Linux, Windows, and OS X.
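To make "system solver" concrete, here is a minimal pure-Python sketch of what a LAPACK-style dense solver computes: Gaussian elimination with partial pivoting followed by back substitution. This is not CULA's API or implementation; the function name `lu_solve` is hypothetical and chosen for illustration only, and a GPU library would perform the same mathematics on blocked, parallel data.

```python
def lu_solve(a, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Illustrative only: `lu_solve` is a hypothetical name, not a CULA
    or LAPACK routine.
    """
    n = len(a)
    # Work on copies so the caller's data is not modified.
    a = [row[:] for row in a]
    x = list(b)
    for k in range(n):
        # Partial pivoting: pick the row with the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        x[k], x[p] = x[p], x[k]
        # Eliminate column k below the pivot.
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            x[i] -= m * x[k]
    # Back substitution on the resulting upper-triangular system.
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (x[i] - s) / a[i][i]
    return x


if __name__ == "__main__":
    # Solves 2x + y = 3, x + 3y = 5.
    print(lu_solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0]))
```

The elimination loop costs on the order of n³ operations, which is why this class of routine benefits from GPU acceleration on large matrices.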

More information is available at www.culatools.com.

gDEBugger V5.6 – Introducing iPhone and iPad on-device debugging and profiling

July 8th, 2010

Graphic Remedy is proud to announce the release of gDEBugger Version 5.6 for Windows, Linux, Mac OS X, iPhone and iPad. This version introduces iPhone and iPad on-device debugging and profiling abilities, letting developers optimize their apps in real-time on actual iPhone and iPad hardware, while viewing invaluable inside information such as the device’s GPU, CPU, graphics driver and operating system performance counters.

gDEBugger is an OpenGL, OpenGL ES and OpenCL debugger and profiler that traces application activity on top of the OpenGL API, and lets programmers see what is happening within the graphics system implementation to find bugs and optimize OpenGL application performance. gDEBugger runs on Windows, Mac OS X, iPhone and Linux operating systems.

Image Processing with CUDA Courses following the GTC

July 4th, 2010

SagivTech plans to offer a three-day course on Image Processing with CUDA in the USA this September. This advanced course is intended for experienced CUDA developers looking for optimization methods for image processing applications implemented on NVIDIA GPUs.

The course will be held in the San Francisco area, 9am to 5pm September 27-29.


3 New Rugged GPGPU products from GE

June 15th, 2010

GE has introduced three new rugged computing products featuring integrated GPGPU technology using NVIDIA CUDA-capable GPUs.  The first is the IPN250 Rugged 6U OpenVPX Single Board Computer (SBC).  The second is the 6U OpenVPX NPN240 multi-processor. The NPN240 features two NVIDIA® CUDA-capable GT240 96-core GPUs, enabling it to deliver up to 750 GFLOP/S peak per card slot (depending on the application). Multiple NPN240s can be linked to one or more hosts to create multi-node CUDA GPU clusters capable of thousands of GFLOP/S.  The third is the OpenVPX-compatible GRA111 high performance graphics board, which is the first rugged implementation of a CUDA-capable GPU.

Libra 1.2 includes new OpenCL back end

June 8th, 2010

GPU Systems has added an OpenCL back end implementation to its Libra Technology compiler and runtime architecture. Libra version 1.2 now supports x86/x64, OpenGL/OpenCL and CUDA compute back ends. The OpenCL back end generates dynamic code specifically for AMD GPUs, and the CUDA back end generator has been enhanced with Fermi capabilities. The new release also brings full BLAS level 1, 2 and 3 standard math library functionality (matrix, vector, dense, sparse, complex, single and double precision), accessible through a standard C programming interface and library. The high-level approach of the Libra API enables developers to easily extend existing high-level functionality from their favorite programming language.
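For readers unfamiliar with the BLAS levels mentioned here, the following pure-Python sketch shows one representative operation per level. It is not the Libra API: the function names follow the conventional BLAS naming, and the signatures are simplified for illustration (the standard routines also take scaling factors and accumulate into an output argument).

```python
def axpy(alpha, x, y):
    """Level 1 (vector-vector): return alpha * x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def gemv(a, x):
    """Level 2 (matrix-vector): return A x for a row-major matrix A."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def gemm(a, b):
    """Level 3 (matrix-matrix): return the product A B."""
    cols = list(zip(*b))  # columns of B
    return [[sum(aik * bkj for aik, bkj in zip(row, col)) for col in cols]
            for row in a]


if __name__ == "__main__":
    print(axpy(2.0, [1.0, 2.0], [3.0, 4.0]))
    print(gemv([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0]))
    print(gemm([[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]))
```

The levels differ in arithmetic intensity: Level 3 performs on the order of n³ operations on n² data, so it typically gains the most from GPU back ends.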


New OpenCL back-end in CAPS HMPP 2.3 hybrid compiler

June 6th, 2010

CAPS has added an OpenCL code generator to the just-released version 2.3 of its HMPP directive-based hybrid compiler. The CUDA back-end generator has also been enhanced with Fermi capabilities. In addition, the new release supports more native compilers, including Intel ifort/icc, GNU gcc/gfortran and PGI pgcc/pgfort, so developers can use their favorite compiler with HMPP 2.3.

Based on GPU programming and tuning directives, HMPP offers an incremental programming model that allows developers with different levels of expertise to fully exploit GPU hardware accelerators in their legacy code.

Nexiwave 2.0 GPU-accelerated Speech Indexing

June 3rd, 2010

nexiwave.com, a speech indexing cloud service company based in Boston, MA, has announced that it has completed the GPU acceleration of its speech indexing service, Nexiwave 2.0. Without sacrificing accuracy, the service achieves over 75% relative speed improvement (comparing a stock Sphinx4 running on a 2.5 GHz, 8-core, 24 GB RAM server to Sphinx4 on a 2.5 GHz quad-core, 4 GB machine with an NVIDIA GTX 470 GPU).

Mellanox and NVIDIA introduce GPUDirect Technology

June 2nd, 2010

Mellanox and NVIDIA have teamed up to create a solution that enables data sharing (without expensive memory copies) between CUDA-managed host memory and Mellanox InfiniBand cards. NVIDIA GPUDirect technology allows application and middleware developers to improve performance by up to 30% by providing a shared, RDMA-accessible address space between the GPU and the interconnect.

The full press release is available here.

Intel Releases Knights Corner

June 2nd, 2010

At ISC’10, Intel demonstrated its co-processor approach to HPC (formerly known as Larrabee, now codenamed Knights Corner). A prototype of the Intel Many Integrated Core (MIC) architecture with 32 in-order cores, each equipped with a 512-bit-wide vector unit and connected via an on-chip coherent cache, delivered more than half a teraflop of performance for LU decomposition in a live demonstration during a keynote by Kirk Skaugen.

The full press release from ISC’10 is available here.
