This undergraduate thesis and poster by Kajuki Fujiwara and Naohito Nakasato from the University of Aizu approach a common problem in astrophysics: the many-body problem, with both brute-force and hierarchical data structures for solving it on ATI GPUs. Abstracts:

**Fast Simulations of Gravitational Many-body Problem on RV770 GPU
Kazuki Fujiwara, Naohito Nakasato (University of Aizu)
Abstract:**

The gravitational many-body problem is a problem concerning the movement of bodies, which are interacting through gravity. However, solving the gravitational many-body problem with a CPU takes a lot of time due to O(N^2) computational complexity. In this paper, we show how to speed-up the gravitational many-body problem by using GPU. After extensive optimizations, the peak performance obtained so far is about 1 Tflops.

**Oct-tree Method on GPU
N.Nakasato
Abstract:**

The kd-tree is a fundamental tool in computer science. Among others, an application of the kd-tree search (oct-tree method) to fast evaluation of particle interactions and neighbor search is highly important since computational complexity of these problems are reduced from O(N^2) with a brute force method to O(N log N) with the tree method where N is a number of particles. In this paper, we present a parallel implementation of the tree method running on a graphic processor unit (GPU). We successfully run a simulation of structure formation in the universe very efficiently. On our system, which costs roughly $900, the run with N ~ 2.87×10^6 particles took 5.79 hours and executed 1.2×10^13 force evaluations in total. We obtained the sustained computing speed of 21.8 Gflops and the cost per Gflops of 41.6/Gflops that is two and half times better than the previous record in 2006.

There is also page about performance and optimization for the Radeon 5870.

http://galaxy.u-aizu.ac.jp/trac/note/wiki/Tests_With_RV870