CUDA 4.1 Released

January 26th, 2012

Today NVIDIA released CUDA 4.1, including a new CUDA Toolkit, SDK, Visual Profiler, Parallel Nsight IDE and NVIDIA device driver.

CUDA 4.1 makes it easier to accelerate scientific research on GPUs, with key features including

  • a redesigned Visual Profiler with automated performance analysis and expert guidance;
  • a new LLVM-based compiler that generates up to 10% faster code; and
  • 1000+ new imaging and signal processing functions in the NPP library.

The CUSPARSE library included with CUDA 4.1 has a new tridiagonal solver and 2x faster sparse matrix-vector multiplication using the ELL hybrid format, and the CURAND library has two new random number generators.
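To illustrate the ELL storage mentioned above, here is a minimal pure-Python sketch of sparse matrix-vector multiplication in ELL form. It is not the CUSPARSE implementation or API: the function names `dense_to_ell` and `ell_spmv` are hypothetical, and the sketch covers only the padded ELL part, not the overflow handling a hybrid format would add.

```python
def dense_to_ell(matrix):
    """Pack a dense row-major matrix into ELL arrays.

    Every row is padded to the width of the longest row, so all rows
    hold the same number of (value, column) slots. Padding slots use
    value 0.0 and column 0, which contribute nothing to the product.
    """
    width = max(sum(1 for v in row if v != 0) for row in matrix)
    values, cols = [], []
    for row in matrix:
        nonzeros = [(j, v) for j, v in enumerate(row) if v != 0]
        pad = width - len(nonzeros)
        cols.append([j for j, _ in nonzeros] + [0] * pad)
        values.append([v for _, v in nonzeros] + [0.0] * pad)
    return values, cols


def ell_spmv(values, cols, x):
    """Compute y = A * x for a matrix A stored in ELL form."""
    return [
        sum(v * x[j] for v, j in zip(row_values, row_cols))
        for row_values, row_cols in zip(values, cols)
    ]


# Hypothetical 3x4 example matrix, chosen only for illustration.
A = [
    [4, 0, 0, 1],
    [0, 3, 0, 0],
    [2, 0, 5, 0],
]
values, cols = dense_to_ell(A)
y = ell_spmv(values, cols, [1, 2, 3, 4])
```

The fixed row width is what makes the layout attractive on GPUs: each thread can process one row with a regular access pattern, at the cost of wasted padding when row lengths vary widely.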

Call for Presentations: AMD Fusion12 Developer Summit

January 26th, 2012

AMD Fusion ’12 will be held June 11-14, 2012 in Bellevue, Washington at the Meydenbauer Center and the Hyatt Regency. AMD invites pioneers of next-generation software and the rapidly growing field of heterogeneous computing to share their latest work and research findings in the form of presentations. Presenters will have an opportunity to advocate new methodologies and paradigms, garner support for industry standards, and network with developers, innovators and academics who will help define the course of this technology. Presentation proposals are invited on the following topics:

  • Web Technologies
  • Cloud Computing – Servers and Data Center
  • Gaming and Consumer Graphics
  • Heterogeneous Computing
  • Innovative Client Experiences
  • Multimedia Processing
  • Professional Graphics and Visual Computing
  • Programming Languages and Models
  • Programming Tools
  • Security

Submit your poster to GTC 2012 by February 2nd!

January 25th, 2012

Reminder: the deadline to submit a research poster for this year’s GPU Technology Conference is Thursday, February 2, 2012. Selected poster presenters receive a discount to attend GTC. They are required to attend the conference in order to present their work at the GTC Poster Showcase. GTC will be held May 14-17 in San Jose, California. For more information, see the call for participation and call for posters. To submit your poster abstract, visit

PyCOOL: Python Cosmological Object-Oriented Lattice code

January 25th, 2012

PyCOOL (Cosmological Object-Oriented Lattice code) is a fast GPU-accelerated program that solves the evolution of interacting scalar fields in an expanding universe with symplectic algorithms. The program is intended to hit a sweet spot of speed, accuracy and user friendliness. This is achieved by using the Python language with the PyCUDA interface, which makes the program very easy to adapt to different scalar field models. The program is publicly available under the GNU General Public License at. See the PyCOOL website for more information.
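As a rough illustration of what a symplectic algorithm looks like, the following sketch applies a kick-drift-kick leapfrog step to a single homogeneous scalar field. It is not PyCOOL's code: the quadratic potential, the static (non-expanding) background, and all names (`leapfrog`, `energy`, `dV`) are simplifying assumptions made for this example.

```python
def leapfrog(phi, pi, dV, dt, steps):
    """Advance field phi and conjugate momentum pi with kick-drift-kick leapfrog.

    dV is the derivative of the potential, so the equation of motion
    is d(pi)/dt = -dV(phi) and d(phi)/dt = pi.
    """
    for _ in range(steps):
        pi -= 0.5 * dt * dV(phi)   # half kick
        phi += dt * pi             # full drift
        pi -= 0.5 * dt * dV(phi)   # half kick
    return phi, pi


def energy(phi, pi, m=1.0):
    """Energy for the assumed potential V(phi) = 0.5 * m^2 * phi^2."""
    return 0.5 * pi * pi + 0.5 * m * m * phi * phi


m = 1.0  # assumed field mass, in arbitrary units
phi0, pi0 = 1.0, 0.0
phi1, pi1 = leapfrog(phi0, pi0, lambda p: m * m * p, dt=0.01, steps=10000)
drift = abs(energy(phi1, pi1, m) - energy(phi0, pi0, m))
```

The appeal of symplectic schemes is that the energy error stays bounded over long runs instead of growing steadily, which is why `drift` remains small here even after many steps.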

CLCC v0.3.0 now available

January 16th, 2012

CLCC, the lightweight and flexible utility for integrating OpenCL source builds into your project, has just been updated to version 0.3.0. This version allows developers to save compiled binaries as object files for distribution with their programs and adds a series of options to select specific target platform/device combinations. Documentation and further information is available at

Performance of SpMV in CUSPARSE, CUSP and SpeedIT

January 14th, 2012

The SpeedIT team recently benchmarked the SpMV performance of CUSPARSE 4.0, CUSP 0.2.0 and SpeedIT 2.0 on 23 randomly chosen matrices from the University of Florida Matrix Collection. Comparisons were done on a Tesla C2050 in single and double precision. The full report is available at

Using GPUs to Accelerate Installed Antenna Performance Simulations

January 9th, 2012


Savant is an asymptotic ray-tracing computational electromagnetics (CEM) tool used to predict the performance of antennas installed on electrically large platforms, including far-field antenna patterns, near-field distributions, and antenna-to-antenna coupling. Savant is based on the shooting and bouncing rays (SBR) formulation. While asymptotic solvers like Savant have significantly smaller computational and memory requirements for electrically large problems than full-wave techniques, the computation costs still increase significantly with frequency and simulation fidelity, and such solvers benefit greatly from parallelization techniques. Graphics processing units (GPUs) are throughput-oriented processing devices that are well suited for the mathematically intensive workloads found in CEM solvers. Current GPUs contain hundreds of processing units, leverage thousands of threads, and can execute over one trillion floating-point operations per second. A hybrid CPU and GPU parallelization approach has been developed for Savant, providing significant speedups compared to CPU-only implementations. Results from the execution of GPU-accelerated Savant on multiple case studies will be presented.

(T. Courtney, J. E. Stone and R. Kipp, “Using GPUs to Accelerate installed antenna performance simulations,” Proc. Allerton Antenna Symposium, Sept. 2011, Monticello, IL. [PDF])

Acceleware 4 Day CUDA Course

January 6th, 2012

Offered in partnership with NVIDIA and Microsoft, this four-day course is designed for programmers who are looking to develop comprehensive skills in writing and optimizing applications that fully leverage the multi-core processing capabilities of the GPU.

Delivered by Acceleware’s developers, who provide real-world experience and examples, the training comprises classroom lectures and hands-on tutorials. Each student will be supplied with a laptop equipped with NVIDIA GPUs for the duration of the course. Small class sizes maximize learning and ensure a personal educational experience.

Register before January 13 and receive $250 off your course fee!
Enter promotional code AXTEB2012

CFP: High Performance Graphics 2012

January 6th, 2012

High Performance Graphics is the leading international forum for performance-oriented graphics systems research including innovative algorithms, efficient implementations, and hardware architecture. The conference brings together researchers, engineers, and architects to discuss the complex interactions of massively parallel hardware, novel programming models, efficient graphics algorithms, and novel applications. High Performance Graphics was founded in 2009 to synthesize and broaden two important and well-respected conferences in computer graphics: Graphics Hardware and Interactive Ray Tracing.

HPG 2012 is co-sponsored by Eurographics and ACM SIGGRAPH and will take place on June 25-27, co-located with the Eurographics Symposium on Rendering in Paris, France. We invite original and innovative performance-oriented contributions from all areas of graphics, including hardware architectures, rendering, physics, animation, simulation, and data structures, with topics including (but not limited to): Interactive rendering pipelines (hardware or software); Interactive rendering algorithms (hardware or software); Graphics hardware and systems; Languages and compilation; Parallel computing for graphics; and Mobile graphics. Please see the conference website for the full CFP.

CfP: High Performance Simulation of Biological Systems

January 4th, 2012

This workshop is organized by Horacio Pérez-Sánchez and José M. Cecilia and takes place in conjunction with the International Conference on Modeling & Applied Simulation (MAS 2012). The goal is to explore the use of emerging parallel computing architectures as well as High Performance Computing systems (Supercomputers, Clusters, Grids) for the simulation of relevant biological systems. We welcome papers, not submitted elsewhere for review, on topics of interest including, but not limited to:

  • Parallel stochastic simulation
  • Biological and Numerical parallel computing
  • Parallel and distributed architectures
  • Emerging processing architectures (e.g. GPUs, FPGAs, mixed CPU-GPU or CPU-FPGA)
  • Parallel model checking techniques
  • Parallel algorithms for biological analysis
  • Cluster and Grid deployment for systems biology
  • Tools and applications
  • Biologically inspired algorithms

More details, including dates, deadlines and submission instructions, are available on the workshop web page.
