March 11th, 2009
Abstract:
This paper explores the challenges in implementing a message passing interface usable on systems with data-parallel processors. As a case study, we design and implement the “DCGN” API on NVIDIA GPUs that is similar to MPI and allows full access to the underlying architecture. We introduce the notion of data-parallel thread-groups as a way to map resources to MPI ranks. We use a method that also allows the data-parallel processors to run autonomously from user-written CPU code. In order to facilitate communication, we use a sleep-based polling system to store and retrieve messages. Unlike previous systems, our method provides both performance and flexibility. By running a test suite of applications with different communication requirements, we find that a tolerable amount of overhead is incurred, somewhere between one and five percent depending on the application, and indicate the locations where this overhead accumulates. We conclude that with innovations in chipsets and drivers, this overhead will be mitigated and provide similar performance to typical CPU based MPI implementations while providing fully-dynamic communication.
(Jeff A. Stuart and John D. Owens, Message Passing on Data-Parallel Architectures, Proceedings of the 23rd IEEE International Parallel and Distributed Processing Symposium)
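The sleep-based polling idea mentioned in the abstract can be illustrated with a small CPU-only sketch. This is not the DCGN API: the `Mailbox` class, the `poll_receive` function, and the sleep interval are all hypothetical names and values chosen for illustration. A receiver repeatedly checks a shared message store and sleeps between checks instead of spinning.

```python
import threading
import time
from collections import deque


class Mailbox:
    """Hypothetical shared message store (not part of DCGN)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._messages = deque()

    def post(self, src, payload):
        # Store a message tagged with the sender's rank.
        with self._lock:
            self._messages.append((src, payload))

    def try_take(self):
        # Retrieve the oldest message, or None if the store is empty.
        with self._lock:
            return self._messages.popleft() if self._messages else None


def poll_receive(mailbox, sleep_seconds=0.001, timeout=1.0):
    """Poll the mailbox, sleeping between checks, until a message arrives
    or the timeout expires. Returns the message or None."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        msg = mailbox.try_take()
        if msg is not None:
            return msg
        time.sleep(sleep_seconds)  # yield the CPU instead of busy-waiting
    return None


def demo():
    box = Mailbox()

    def sender():
        time.sleep(0.01)       # simulate work before sending
        box.post(0, "hello")   # rank 0 posts a message

    t = threading.Thread(target=sender)
    t.start()
    msg = poll_receive(box)
    t.join()
    return msg
```

The sleep interval is the tuning knob in a scheme like this: a longer sleep lowers CPU load but adds latency to each message, which is one plausible place for the kind of overhead the abstract reports.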
Posted in Research | Tags: APIs, NVIDIA CUDA, Papers, Programming Languages, Supercomputing, Tools
March 1st, 2009
This IPDPS 2009 paper by Nadathur Satish, Mark Harris, and Michael Garland describes the design of high-performance parallel radix sort and merge sort routines for manycore GPUs, taking advantage of the full programmability offered by NVIDIA CUDA. The radix sort described is the fastest GPU sort and the merge sort described is the fastest comparison-based GPU sort reported in the literature. The radix sort is up to 4 times faster than the graphics-based GPUSort and more than 2 times faster than other CUDA-based radix sorts. It is also 23% faster, on average, than even a very carefully optimized multicore CPU sorting routine. To achieve this performance, the authors carefully design the algorithms to expose substantial fine-grained parallelism and decompose the computation into independent tasks that perform minimal global communication. They exploit the high-speed on-chip shared memory provided by NVIDIA’s GPU architecture and efficient data-parallel primitives, particularly parallel scan. While targeted at GPUs, these algorithms should also be well-suited for other manycore processors. (N. Satish, M. Harris, and M. Garland. Designing efficient sorting algorithms for manycore GPUs. Proc. 23rd IEEE Int’l Parallel & Distributed Processing Symposium, May 2009. To appear.)
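To show how a scan primitive can drive a radix sort, here is a minimal sequential sketch in Python. It is not the authors' implementation: it assumes non-negative integer keys, handles one bit per pass, and runs on the CPU, whereas the summary above describes a GPU design built around shared memory. The function names are hypothetical. Each pass uses an exclusive prefix sum to compute every key's destination, and on a GPU this is the step a parallel scan would perform.

```python
from itertools import accumulate


def exclusive_scan(values):
    """Exclusive prefix sum: out[i] is the sum of values[:i]."""
    return list(accumulate(values, initial=0))[:-1]


def split_by_bit(keys, bit):
    """Stable partition of keys on one bit: keys with the bit clear come
    first, in their original order, followed by keys with the bit set."""
    flags = [1 - ((k >> bit) & 1) for k in keys]  # 1 where the bit is 0
    zeros_before = exclusive_scan(flags)          # rank among the 0-keys
    total_zeros = sum(flags)
    out = [0] * len(keys)
    for i, k in enumerate(keys):
        if flags[i]:
            dest = zeros_before[i]
        else:
            # Number of 1-keys before position i is i - zeros_before[i].
            dest = total_zeros + (i - zeros_before[i])
        out[dest] = k
    return out


def radix_sort(keys, key_bits=32):
    """Least-significant-bit-first radix sort for non-negative integers
    that fit in key_bits bits."""
    for bit in range(key_bits):
        keys = split_by_bit(keys, bit)
    return keys
```

Because each split is stable, sorting from the least significant bit upward leaves the keys fully ordered after the final pass. The destination computation for each key is independent once the scan is available, which is why this formulation maps naturally onto data-parallel hardware.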
Posted in Research | Tags: Data-Parallel, NVIDIA CUDA, Parallel Algorithms, Sorting
February 27th, 2009
Alexander Heusel of the University of Frankfurt has released open source Java bindings for CUDA. The project is currently in alpha: it supports the CUDA driver API, and support for the CUBLAS and CUFFT libraries is pending. Contributions are welcome. For more information see the project website: http://jacuzzi.sourceforge.net
Posted in Developer Resources | Tags: Java, NVIDIA CUDA, Open Source
February 27th, 2009
OpenMM is a freely downloadable, high performance, extensible library that allows molecular dynamics (MD) simulations to run on high performance computer architectures, such as graphics processing units (GPUs). Significant performance speedups of 100 times were achieved in some cases by running OpenMM on GPUs in desktop PCs (vs CPU). The new release includes a version of the widely used MD package GROMACS that integrates the OpenMM library, enabling acceleration on high-end NVIDIA and AMD/ATI GPUs. OpenMM is a collaborative project between Vijay Pande’s lab at Stanford University and Simbios, the National Center for Physics-based Simulation of Biological Structures at Stanford, which is supported by the National Institutes of Health. For more information on OpenMM, go to http://simtk.org/home/openmm. (Full press release.)
Posted in Developer Resources, Press, Research | Tags: AMD, Molecular Dynamics, NVIDIA CUDA
February 27th, 2009
CUDA.NET 2.1 has been released with support for the NVIDIA CUDA 2.1 API. This version supports DirectX 10 interoperability and the new JIT compilation API. The library is supported on Windows and Linux operating systems. (CUDA.NET)
Posted in Developer Resources | Tags: .NET, APIs, NVIDIA CUDA
February 3rd, 2009
February 5, 2009, 11am PST / 2pm EST
Are you looking for ways to improve your productivity by accelerating MATLAB functions? Now you can with the unprecedented performance of GPU computing.
By attending this webinar, you will learn:
- What is GPU computing
- What is NVIDIA CUDA parallel computing architecture
- What is the Jacket engine for MATLAB from AccelerEyes
- How to get 10x to 50x speed-up for several MATLAB functions
Date: Thursday, February 5, 2009
Time: 11:00am PST / 2:00pm EST
Duration: 45 Minute Presentation, 15 Minute Q&A
Presented By: Sumit Gupta, Ph.D., Sr Product Manager of Tesla GPU Computing at NVIDIA and John Melonakos, Ph.D., CEO at AccelerEyes LLC
Posted in Business, Developer Resources, Events | Tags: Courses, MATLAB, NVIDIA CUDA
January 22nd, 2009
NVIDIA announced that National Taiwan University has been named Asia’s first CUDA Center of Excellence (press release below). The university earned this title by formally adopting NVIDIA GPU Computing solutions across its research facilities and integrating a class on parallel computing based on the CUDA architecture into its curriculum. As the computing industry moves rapidly toward parallel processing and many-core architectures, NVIDIA has worked over the past year to offer tomorrow’s developers and engineers education on the best tools and methodologies for parallel computing. In addition to working with over 50 universities worldwide that are actively using CUDA in their courses, NVIDIA developed the CUDA Center of Excellence Program to further assist universities that are devoted to educating tomorrow’s software developers about parallel computing. (Press Release)
Posted in Press | Tags: NVIDIA CUDA, Research Groups
January 22nd, 2009
From a press release:
SANTA CLARA, CA—JANUARY 15, 2009—NVIDIA today announced it is now working closely with Wipro to provide CUDA™ professional services to their joint customers worldwide. CUDA, NVIDIA’s parallel computing architecture accessible through an industry standard C language programming environment, has already delivered major leaps in performance across many industries. Wipro’s Product Engineering Services group will accelerate the development efforts of companies with vast software portfolios seeking to exploit parallel computing with the GPU.
Posted in Business, Developer Resources, Press | Tags: NVIDIA CUDA
November 18th, 2008
From a press release:
World’s Most Powerful Global Computation Software Now GPU Accelerated
SC08—AUSTIN, TX—NOVEMBER 18, 2008—At SC08, Wolfram Research will demonstrate a new version of Mathematica, the world’s most powerful general computational software, that integrates CUDA®, NVIDIA’s parallel GPU computing architecture. This new version is expected to give Mathematica users an unprecedented performance increase of 10-100X in numerical computing, modeling, simulation and visual computations, without the need to learn or write C code.
“Since its initial release, Mathematica has been adopted by over 3 million professionals across the entire global technical computing community, and it has had a profound effect on how computers are used across many fields,” said Joy Costa, director of global partnerships at Wolfram Research. “The prospect of a hundred fold increase in Mathematica 7 performance is staggering. CUDA enabled Mathematica will revolutionize the world of numerical computation.”
“With Mathematica 7, researchers and scientists can easily tap the enormous parallel processing power of NVIDIA GPUs through a familiar high level interface,” said Andy Keane, general manager of the GPU Computing business at NVIDIA. “This is truly transformative, giving Mathematica users computational horsepower like never before and reducing computation time in some cases from days to a matter of minutes.”
The demonstration of the CUDA-accelerated release of Mathematica coincides with the launch of the NVIDIA® Tesla™ Personal Supercomputer at this year’s SC08. Priced in the range of traditional PC workstations, Tesla Personal Supercomputers are unrivalled in price and performance. Available in configurations of up to 4 Tesla GPUs in a single system, Tesla Personal Supercomputers deliver up to 4 Teraflops of computing performance from up to 960 parallel processing cores.
Posted in Business, Developer Resources, Press | Tags: Mathematica, NVIDIA CUDA
November 18th, 2008
CAL.NET is an effort to create a library that allows existing .NET applications to access ATI/AMD GPU hardware for computational and graphical purposes. Programmers can manage the GPU hardware and execute kernels on it transparently. It is currently supported on Windows and Linux platforms with the latest drivers.
The latest release of CUDA.NET, 2.0.3, addresses issues with the previous release and adds many features including CUDA runtime API support and Direct3D/OpenGL interoperability. It is now possible to create hybrid applications with Tao and SlimDX, and an issue with copying vector data from device memory was fixed on Windows.
Posted in Developer Resources | Tags: AMD CAL, APIs, NVIDIA CUDA