GPU Supercomputer #2 in Top500

May 31st, 2010

The June 2010 Top500 list of the world’s fastest supercomputers was released this week at ISC 2010. The US Jaguar supercomputer (located at the Department of Energy’s Oak Ridge Leadership Computing Facility) retained the top spot in Linpack performance. However, Nebulae, a Chinese cluster built from a Dawning TC3600 Blade system with Intel X5650 processors and NVIDIA Tesla C2050 GPUs, is now the fastest in theoretical peak performance at 2.98 PFlop/s and ranks No. 2 with a Linpack performance of 1.271 PFlop/s. This is the highest rank that either a GPU-accelerated system or a Chinese system has ever achieved on the Top500 list.
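The gap between the two Nebulae figures is easier to see as a ratio. The following is a minimal sketch, not part of the Top500 announcement: it simply divides the quoted Linpack result by the quoted theoretical peak. The function and variable names are our own.

```python
# Figures quoted in the post above, in PFlop/s.
PEAK_PFLOPS = 2.98      # theoretical peak performance
LINPACK_PFLOPS = 1.271  # measured Linpack performance


def linpack_efficiency(linpack, peak):
    """Return measured Linpack performance as a fraction of theoretical peak."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return linpack / peak


efficiency = linpack_efficiency(LINPACK_PFLOPS, PEAK_PFLOPS)
print(f"Nebulae Linpack efficiency: {efficiency:.1%}")
```

By this measure Nebulae sustains a little under half of its theoretical peak on Linpack, which is why it leads on peak performance but ranks second on the list.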

For more information, visit www.TOP500.org.

NVIDIA Announces GPU Technology Conference 2010

March 31st, 2010

This year’s GPU Technology Conference (GTC 2010) will take place from Monday, Sept. 20 to Thursday, Sept. 23 at the San Jose Convention Center in San Jose, California.

Building on last year’s inaugural conference, GTC 2010 will feature an even broader and deeper selection of technical sessions, interactive tutorials, technology previews, and industry and academic presentations.

Three concurrent GPU-focused summits will occur under one roof:

  • Emerging Companies Summit: A showcase for innovative startups to demonstrate products and network with VCs and other investors.
  • GPU Developers Summit: Sessions, tutorials, and presentations for developers, engineers, and scientists.
  • NVIDIA Research Summit: A unique opportunity for students, professors, and researchers to present their findings and collaborate.

For more information:

NVIDIA Launches First Fermi GPUs, the GeForce GTX 400 series

March 31st, 2010

The GeForce GTX 480 and GTX 470 are the first GPUs to feature NVIDIA’s new Fermi architecture, with 480 and 448 CUDA cores, respectively. From an NVIDIA press release:

SANTA CLARA, California—March 29, 2010—Hot off the heels of PAX East, the consumer gaming show held this past weekend in Boston, NVIDIA today officially launched its new flagship graphics processors, the NVIDIA® GeForce® GTX 480 and GeForce GTX 470.

The top-of-the-line in a new family of enthusiast-class GPUs, the GeForce GTX 480 was designed from the ground up to deliver the industry’s most potent tessellation performance, which is the key component of Microsoft’s DirectX 11 development platform for PC games. Tessellation allows game developers to take advantage of the GeForce GTX 480 GPU’s ability to increase the geometric complexity of models and characters to deliver far more realistic and visually compelling gaming environments.

The GeForce GTX 480 is joined by the GeForce GTX 470 as the first products in NVIDIA’s Fermi line of consumer products. They will be available in mid-April, from the world’s leading add-in card partners and PC system builders. The remainder of the GeForce 400-series lineup will be announced in the coming months, filling out additional performance and price segments.

The GeForce GTX 480 and GTX 470 GPUs bring a host of new gaming features never before offered for the PC – including support for real-time ray tracing and NVIDIA 3D Vision™ Surround for truly immersive widescreen, stereoscopic 3D gaming.

NVIDIA Tesla GPUs to Communicate Faster Over Mellanox InfiniBand Networks

November 25th, 2009

From a press release:

New Software Solution Reduces Dependency on CPUs

PORTLAND, Ore.- SC09-Nov. 18, 2009- NVIDIA Corporation (Nasdaq: NVDA) and Mellanox Technologies Ltd. today introduced new software that will increase cluster application performance by as much as 30% by reducing the latency that occurs when communicating over Mellanox InfiniBand to servers equipped with NVIDIA Tesla™ GPUs.

The system architecture of a GPU-CPU server requires the CPU to initiate and manage memory transfers between the GPU and the InfiniBand network. The new software solution will enable Tesla GPUs to transfer data to pinned system memory that a Mellanox InfiniBand solution is able to read and transmit over the network. The result is increased overall system performance and efficiency.

“NVIDIA Tesla GPUs deliver large increases in performance across each node in a cluster, but in our production runs on TSUBAME 1 we have found that network communication becomes a bottleneck when using multiple GPUs,” said Prof. Satoshi Matsuoka from Tokyo Institute of Technology. “Reducing the dependency on the CPU by using InfiniBand will deliver a major boost in performance in high performance GPU clusters, thanks to the work of NVIDIA and Mellanox, and will further enhance the architectural advances we will make in TSUBAME2.0.”

NVIDIA Announces Next-Generation CUDA GPU Architecture – Codenamed “Fermi”

October 1st, 2009

On September 30th NVIDIA unveiled its latest GPU architecture, codenamed “Fermi”. The first Fermi GPUs will contain 512 “CUDA Cores” and will be capable of more than 8x the double-precision floating-point throughput of their predecessor, the GT200 GPU. The architecture also incorporates error-correcting (ECC) memories and caches, a new cache hierarchy, increased shared memory and register file sizes, and the ability to execute C++ programs.

From the press release:

SANTA CLARA, Calif. -Sep. 30, 2009- NVIDIA Corp. today introduced its next generation CUDA™ GPU architecture, codenamed “Fermi”. An entirely new ground-up design, the “Fermi”™ architecture is the foundation for the world’s first computational graphics processing units (GPUs), delivering breakthroughs in both graphics and GPU computing.

“NVIDIA and the Fermi team have taken a giant step towards making GPUs attractive for a broader class of programs,” said Dave Patterson, director Parallel Computing Research Laboratory, U.C. Berkeley and co-author of Computer Architecture: A Quantitative Approach. “I believe history will record Fermi as a significant milestone.”

Presented at the company’s inaugural GPU Technology Conference, in San Jose, California, “Fermi” delivers a feature set that accelerates performance on a wider array of computational applications than ever before. Joining NVIDIA’s press conference was Oak Ridge National Laboratory, who announced plans for a new supercomputer that will use NVIDIA® GPUs based on the “Fermi” architecture. “Fermi” also garnered the support of leading organizations including Bloomberg, Cray, Dell, HP, IBM and Microsoft.

ATI Radeon™ HD 5800 Series Announced By AMD

October 1st, 2009

AMD announced its latest ATI Radeon™ series of graphics cards on September 23rd. The new GPUs boast up to 2.72 TFLOP/s of single-precision floating-point throughput, along with DirectX® 11 graphics (including DirectCompute) and OpenCL 1.0 support.

From the press release:

AMD (NYSE: AMD) today launched the most powerful processor ever created, found in its next-generation graphics cards, the ATI Radeon™ HD 5800 series graphics cards, and the world’s first and only to fully support Microsoft DirectX® 11, the new gaming and compute standard shipping shortly with Microsoft Windows® 7 operating system. Boasting up to 2.72 TeraFLOPS of compute power, the ATI Radeon™ HD 5800 series effectively doubles the value consumers can expect of their graphics purchases, delivering twice the performance-per-dollar of previous generations of graphics products. AMD will initially release two cards: the ATI Radeon HD 5870 and the ATI Radeon HD 5850, each with 1GB GDDR5 memory. With the ATI Radeon™ HD 5800 series of graphics cards, PC users can expand their computing experience with ATI Eyefinity multi-display technology, accelerate their computing experience with ATI Stream technology, and dominate the competition with superior gaming performance and full support of Microsoft DirectX® 11, making it a “must-have” consumer purchase just in time for Microsoft Windows® 7 operating system.
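The 2.72 TeraFLOPS figure can be reproduced with simple arithmetic. The sketch below is ours, not AMD's: it assumes the commonly cited ATI Radeon HD 5870 specifications of 1600 stream processors at an 850 MHz engine clock, with each stream processor issuing one multiply-add (two floating-point operations) per cycle. None of those three numbers appears in the press release excerpt above.

```python
# Assumed HD 5870 specifications (not stated in the excerpt above).
STREAM_PROCESSORS = 1600   # assumed ALU count
FLOPS_PER_CYCLE = 2        # assumed: one multiply-add = 2 floating-point ops
ENGINE_CLOCK_HZ = 850e6    # assumed 850 MHz engine clock


def peak_flops(alus, flops_per_alu_per_cycle, clock_hz):
    """Theoretical peak throughput in FLOP/s: ALUs x ops per cycle x clock rate."""
    return alus * flops_per_alu_per_cycle * clock_hz


tflops = peak_flops(STREAM_PROCESSORS, FLOPS_PER_CYCLE, ENGINE_CLOCK_HZ) / 1e12
print(f"Theoretical single-precision peak: {tflops:.2f} TFLOP/s")
```

Under those assumptions the product matches the quoted 2.72 TFLOP/s. Like any theoretical peak, it is an upper bound that real applications rarely reach.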

CUDA Fortran Compiler Beta Release Now Available

September 29th, 2009

A public beta release of the CUDA-enabled Fortran Compiler from PGI enables programmers to write code in Fortran for NVIDIA CUDA GPUs.  From a press release:

What: NVIDIA today announced that a public beta release of the PGI® CUDA-enabled Fortran compiler is now available. Developed in collaboration with The Portland Group®, it is the first Fortran compiler compatible with NVIDIA® CUDA™-enabled graphics processing units (GPUs).

A compiler is a software tool that translates applications from the high-level programming languages used by software developers into a binary form a computer can execute.

Why: GPU computing with the CUDA C-compiler has gained significant momentum in the High-Performance Computing (HPC) space as it enables developers to get transformative increases in performance with minimal coding required.

Fortran is particularly well suited to numeric computation and scientific computing and remains widely used in a wide range of applications such as weather modeling, computational fluid dynamics and seismic processing.

Where can I get it?: Read the rest of this entry »

NVIDIA Releases Public OpenCL GPU Drivers and Performance Profiler for Windows

September 29th, 2009

NVIDIA has released public OpenCL GPU drivers and an OpenCL performance profiler for Windows, available for free download from the NVIDIA OpenCL Download Page. From an NVIDIA press release:

SANTA CLARA, Calif. -Sep. 28, 2009- NVIDIA today released the first public OpenCL conformant GPU drivers for Windows and Linux. In addition to the drivers themselves, NVIDIA has released a powerful performance profiling tool and an OpenCL Best Practices Guide.

NVIDIA was the first to release beta OpenCL GPU drivers to developers in April 2009. This public release is fully conformant with the OpenCL v1.0 specification and supports the OpenCL Images features of the specification that, while optional for other vendors, provides significant performance benefits across many image processing disciplines such as medical imaging, video transcoding applications, machine vision and facial detection.

Leveraging the extensive performance instrumentation in NVIDIA’s OpenCL drivers and hardware performance signals designed into NVIDIA GPUs, the OpenCL Visual Profiler provides developers with insight into performance bottlenecks and opportunities for optimization.

Key features include:

Read the rest of this entry »

Interview: NVIDIA’s Ian Buck Talks GPGPU

September 9th, 2009

Tom’s Hardware has published a comprehensive interview with Ian Buck, NVIDIA’s Director of Software for GPU Computing.  In the interview Ian discusses the history of GPGPU (including a referral to our fair site — thanks Ian!) and his work at Stanford on Brook for GPUs.  He then goes on to discuss the development of CUDA and the teams at NVIDIA responsible; OpenCL and tradeoffs between the industry standard API and C for CUDA; and future directions for GPGPU applications.

AMD Announces Beta Release of an OpenCL Implementation for CPUs

August 6th, 2009

AMD is now offering a free OpenCL for CPU beta download as part of the ATI Stream SDK v2.0 Beta Program. The beta will help programmers to more easily develop parallel software programs and take further advantage of multi-core x86 CPUs to accelerate software and deliver a better computing experience. AMD has submitted conformance logs from its Microsoft Windows and Linux CPU beta releases to the Khronos Working Group for certification.

The full press release is available here, and the SDK can be downloaded here.
