<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>GPGPU &#187; Tag: Scientific Computing :: GPGPU.org</title>
	<atom:link href="http://gpgpu.org/tag/scientific-computing/feed" rel="self" type="application/rss+xml" />
	<link>http://gpgpu.org</link>
	<description>General-Purpose Computation on Graphics Hardware</description>
	<lastBuildDate>Tue, 22 May 2012 08:44:05 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>ViennaCL 1.2.0 released</title>
		<link>http://gpgpu.org/2012/01/02/viennacl-1-2-0-released</link>
		<comments>http://gpgpu.org/2012/01/02/viennacl-1-2-0-released#comments</comments>
		<pubDate>Mon, 02 Jan 2012 09:51:24 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Developer Resources]]></category>
		<category><![CDATA[Libraries]]></category>
		<category><![CDATA[Linear Algebra]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[OpenCL]]></category>
		<category><![CDATA[Scientific Computing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=4325</guid>
		<description><![CDATA[Version 1.2.0 of the OpenCL-based C++ linear algebra library ViennaCL is now available for download! It features a high-level interface compatible with Boost.ublas, which allows for compact code and high productivity. Highlights of the new release are the following features (all experimental): Several algebraic multigrid preconditioners Sparse approximate inverse preconditioners Fast Fourier transform Structured dense [...]]]></description>
			<content:encoded><![CDATA[<p>Version 1.2.0 of the OpenCL-based C++ linear algebra library ViennaCL is now available for download! It features a high-level interface compatible with Boost.ublas, which allows for compact code and high productivity. Highlights of the new release are the following features (all experimental):</p>
<ul>
<li>Several algebraic multigrid preconditioners</li>
<li>Sparse approximate inverse preconditioners</li>
<li>Fast Fourier transform</li>
<li>Structured dense matrices (circulant, Hankel, Toeplitz, Vandermonde)</li>
<li>Reordering algorithms (Cuthill-McKee, Gibbs-Poole-Stockmeyer)</li>
<li>Proxies for manipulating subvectors and submatrices</li>
</ul>
<p>The features are expected to reach maturity in the 1.2.x branch. More information about the library including download links is available at <a title="ViennaCL on SourceForge" href="http://viennacl.sourceforge.net/" target="_blank">http://viennacl.sourceforge.net</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2012/01/02/viennacl-1-2-0-released/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introduction to Generic Accelerated Computing with Libra SDK</title>
		<link>http://gpgpu.org/2011/11/30/generic-accelerated-computing-libra-sdk</link>
		<comments>http://gpgpu.org/2011/11/30/generic-accelerated-computing-libra-sdk#comments</comments>
		<pubDate>Wed, 30 Nov 2011 07:35:49 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Developer Resources]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>
		<category><![CDATA[OpenCL]]></category>
		<category><![CDATA[OpenGL]]></category>
		<category><![CDATA[Programming Environments]]></category>
		<category><![CDATA[Scientific Computing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=4230</guid>
		<description><![CDATA[Libra SDK is a sophisticated runtime including API, sample programs and documentation for massively accelerating software computations. This introductory tutorial provides an overview and usage examples of the powerful Libra API &#38; math libraries executing on x86/x64, OpenCL, OpenGL and CUDA technology. Libra API enables generic and portable CPU/GPU computing within software development without the [...]]]></description>
			<content:encoded><![CDATA[<p>Libra SDK is a sophisticated runtime including API, sample programs and documentation for massively accelerating software computations. This introductory tutorial provides an overview and usage examples of the powerful Libra API &amp; math libraries executing on x86/x64, OpenCL, OpenGL and CUDA technology. Libra API enables generic and portable CPU/GPU computing within software development without the need to create multiple, specific and optimized code paths to support x86, OpenCL, OpenGL or CUDA devices. Link to PDF: <a href="http://www.gpusystems.com/doc/LibraGenericComputing.pdf" target="_blank">www.gpusystems.com/doc/LibraGenericComputing.pdf</a></p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2011/11/30/generic-accelerated-computing-libra-sdk/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CfP: 20th High Performance Computing Symposium 2012</title>
		<link>http://gpgpu.org/2011/10/07/20th-hpc-2012</link>
		<comments>http://gpgpu.org/2011/10/07/20th-hpc-2012#comments</comments>
		<pubDate>Fri, 07 Oct 2011 09:48:52 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Conferences]]></category>
		<category><![CDATA[High-Performance Computing]]></category>
		<category><![CDATA[Scientific Computing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=4021</guid>
		<description><![CDATA[The 2012 Spring Simulation Multi-conference will feature the 20th High Performance Computing Symposium (HPC 2012), devoted to the impact of high performance computing and communications on computer simulations. Topics of interest include: high performance/large scale application case studies, GPUs for general purpose computations (GPGPU), multicore and many-core computing, power aware computing, large scale visualization and [...]]]></description>
			<content:encoded><![CDATA[<p>The 2012 Spring Simulation Multi-conference will feature the <a title="link to conference" href="http://www.ncsu.edu/itd/hpc/hpc2012/hpc2012.html" target="_blank">20th High Performance Computing Symposium (HPC 2012)</a>, devoted to the impact of high performance computing and communications on computer simulations. Topics of interest include:</p>
<ul>
<li>high performance/large scale application case studies,</li>
<li>GPUs for general purpose computations (GPGPU),</li>
<li>multicore and many-core computing,</li>
<li>power aware computing,</li>
<li>large scale visualization and data management,</li>
<li>tools and environments for coupling parallel codes,</li>
<li>parallel algorithms and architectures,</li>
<li>high performance software tools,</li>
<li>component technologies for high performance computing.</li>
</ul>
<p>Important dates: Paper submission due: December 2, 2011; Notification of acceptance: January 13, 2012; Revised manuscript due: January 27, 2012; Symposium: March 26&#8211;29, 2012.</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2011/10/07/20th-hpc-2012/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Parallel Smoothers for Matrix-based Multigrid Methods on Unstructured Meshes Using Multicore CPUs and GPUs</title>
		<link>http://gpgpu.org/2011/07/29/parallel-smoothers-for-matrix-based-multigrid-methods-on-unstructured-meshes-using-multicore-cpus-and-gpus</link>
		<comments>http://gpgpu.org/2011/07/29/parallel-smoothers-for-matrix-based-multigrid-methods-on-unstructured-meshes-using-multicore-cpus-and-gpus#comments</comments>
		<pubDate>Fri, 29 Jul 2011 12:01:49 +0000</pubDate>
		<dc:creator>Mark Harris</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[Multigrid]]></category>
		<category><![CDATA[Numerical Algorithms]]></category>
		<category><![CDATA[Papers]]></category>
		<category><![CDATA[Scientific Computing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=3799</guid>
		<description><![CDATA[Abstract: Multigrid methods are efficient and fast solvers for problems typically modeled by partial differential equations of elliptic type. For problems with complex geometries and local singularities stencil-type discrete operators on equidistant Cartesian grids need to be replaced by more flexible concepts for unstructured meshes in order to properly resolve all problem-inherent specifics and for [...]]]></description>
			<content:encoded><![CDATA[<p>Abstract:</p>
<blockquote><p>Multigrid methods are efficient and fast solvers for problems typically modeled by partial differential equations of elliptic type. For problems with complex geometries and local singularities stencil-type discrete operators on equidistant Cartesian grids need to be replaced by more flexible concepts for unstructured meshes in order to properly resolve all problem-inherent specifics and for maintaining a moderate number of unknowns. However, flexibility in the meshes goes along with severe drawbacks with respect to parallel execution &#8211; especially with respect to the definition of adequate smoothers. This point becomes in particular pronounced in the framework of fine-grained parallelism on GPUs with hundreds of execution units. We use the approach of matrix-based multigrid that has high flexibility and adapts well to the exigences of modern computing platforms.</p>
<p>In this work we investigate multi-colored Gauss-Seidel type smoothers, the power(q)-pattern enhanced multi-colored ILU(p) smoothers with fill-ins, and factorized sparse approximate inverse (FSAI) smoothers. These approaches provide efficient smoothers with a high degree of parallelism. In combination with matrix-based multigrid methods on unstructured meshes our smoothers provide powerful solvers that are applicable across a wide range of parallel computing platforms and almost arbitrary geometries. We describe the configuration of our smoothers in the context of the portable lmpLAtoolbox and the HiFlow3 parallel finite element package. In our approach, a single source code can be used across diverse platforms including multicore CPUs and GPUs. Highly optimized implementations are hidden behind a unified user interface. Efficiency and scalability of our multigrid solvers are demonstrated by means of a comprehensive performance analysis on multicore CPUs and GPUs.</p></blockquote>
<p>V. Heuveline, D. Lukarski, N. Trost and J.-P. Weiss. <em><a title="Paper PDF link" href="http://www.emcl.kit.edu/preprints/emcl-preprint-2011-09.pdf" target="_blank">Parallel Smoothers for Matrix-based Multigrid Methods on Unstructured Meshes Using Multicore CPUs and GPUs</a></em>. EMCL Preprint Series No. 9. 2011.</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2011/07/29/parallel-smoothers-for-matrix-based-multigrid-methods-on-unstructured-meshes-using-multicore-cpus-and-gpus/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>GPIUTMD 0.9.6 released</title>
		<link>http://gpgpu.org/2011/06/26/gpiutmd-0-9-6-released</link>
		<comments>http://gpgpu.org/2011/06/26/gpiutmd-0-9-6-released#comments</comments>
		<pubDate>Sun, 26 Jun 2011 23:16:13 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Developer Resources]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Molecular Dynamics]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Particle Systems]]></category>
		<category><![CDATA[Physics Simulation]]></category>
		<category><![CDATA[Scientific Computing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=3669</guid>
		<description><![CDATA[GPIUTMD stands for Graphic Processors at Isfahan University of Technology for Many-particle Dynamics. It performs general-purpose many-particle dynamic simulations on a single workstation, taking advantage of NVIDIA GPUs to attain a level of performance equivalent to thousands of cores on a fast cluster. Flexible and configurable, GPIUTMD is currently being used for all atom and [...]]]></description>
			<content:encoded><![CDATA[<p>GPIUTMD stands for <strong>Graphic Processors at Isfahan University of Technology for Many-particle Dynamics</strong>. It performs general-purpose many-particle dynamic simulations on a single workstation, taking advantage of NVIDIA GPUs to attain a level of performance equivalent to thousands of cores on a fast cluster. Flexible and configurable, GPIUTMD is currently being used for all-atom and coarse-grained molecular dynamics simulations of nano-materials, glasses, and surfactants; dissipative particle dynamics simulations (DPD) of polymers; and crystallization of metals using EAM potentials. GPIUTMD 0.9.6 adds many new features. Highlights include:</p>
<ul>
<li>Morse bond potential</li>
<li>Constant acceleration applied to a group of particles (useful for modeling gravity effects)</li>
<li>Computation of the full virial stress tensor (useful in mechanical characterization of materials)</li>
<li>Long-ranged electrostatics via PPPM</li>
<li>Support for CUDA 3.2</li>
<li>Theory manual</li>
<li>Up to a twenty percent performance boost in simulations</li>
<li>and <a href="http://gpiutmd.iut.ac.ir/index.php/about/features" target="_blank">more</a></li>
</ul>
<p>A demo version of GPIUTMD 0.9.6 will be available soon for <a href="http://gpiutmd.iut.ac.ir/index.php/download" target="_blank">download</a> under an open source license. Check out the <a href="http://gpiutmd.iut.ac.ir/index.php/documentation" target="_blank">quick start tutorial</a> to get started, or read the <a href="http://gpiutmd.iut.ac.ir/index.php/documentation">full documentation</a> to see everything it can do.</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2011/06/26/gpiutmd-0-9-6-released/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>GPU computing in medical physics: A review</title>
		<link>http://gpgpu.org/2011/05/29/medical-physics-review</link>
		<comments>http://gpgpu.org/2011/05/29/medical-physics-review#comments</comments>
		<pubDate>Mon, 30 May 2011 01:22:25 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[Medical Physics]]></category>
		<category><![CDATA[Papers]]></category>
		<category><![CDATA[Scientific Computing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=3590</guid>
		<description><![CDATA[Abstract: The graphics processing unit (GPU) has emerged as a competitive platform for computing massively parallel problems. Many computing applications in medical physics can be formulated as data-parallel tasks that exploit the capabilities of the GPU for reducing processing times. The authors review the basic principles of GPU computing as well as the main performance [...]]]></description>
			<content:encoded><![CDATA[<p>Abstract:</p>
<blockquote><p>The graphics processing unit (GPU) has emerged as a competitive platform for computing massively parallel problems. Many computing applications in medical physics can be formulated as data-parallel tasks that exploit the capabilities of the GPU for reducing processing times. The authors review the basic principles of GPU computing as well as the main performance optimization techniques, and survey existing applications in three areas of medical physics, namely image reconstruction, dose calculation and treatment plan optimization, and image processing.</p></blockquote>
<p>(Guillem Pratx &amp; Lei Xing: <em>&#8220;GPU computing in medical physics: A review&#8221;</em>, Med. Phys., vol 38(5), pp. 2685-2698, May 2011. [<a href="http://dx.doi.org/10.1118/1.3578605" target="_blank">DOI</a>])</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2011/05/29/medical-physics-review/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A memory efficient and fast sparse matrix vector product on a GPU</title>
		<link>http://gpgpu.org/2011/05/04/memory-efficient-fast-spmv</link>
		<comments>http://gpgpu.org/2011/05/04/memory-efficient-fast-spmv#comments</comments>
		<pubDate>Wed, 04 May 2011 10:13:12 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[Linear Algebra]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>
		<category><![CDATA[Papers]]></category>
		<category><![CDATA[Physics Simulation]]></category>
		<category><![CDATA[Scientific Computing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=3513</guid>
		<description><![CDATA[Abstract: This paper proposes a new sparse matrix storage format which allows an efficient implementation of a sparse matrix vector product on a Fermi Graphics Processing Unit (GPU). Unlike previous formats it has both low memory footprint and good throughput. The new format, which we call Sliced ELLR-T has been designed specifically for accelerating the [...]]]></description>
			<content:encoded><![CDATA[<p>Abstract:</p>
<blockquote><p>This paper proposes a new sparse matrix storage format which allows an efficient implementation of a sparse matrix vector product on a Fermi Graphics Processing Unit (GPU). Unlike previous formats it has both low memory footprint and good throughput. The new format, which we call Sliced ELLR-T has been designed specifically for accelerating the iterative solution of a large sparse and complex-valued system of linear equations arising in computational electromagnetics. Numerical tests have shown that the performance of the new implementation reaches 69 GFLOPS in complex single precision arithmetic. Compared to the optimized six core Central Processing Unit (CPU) (Intel Xeon 5680) this performance implies a speedup by a factor of six. In terms of speed the new format is as fast as the best format published so far and at the same time it does not introduce redundant zero elements which have to be stored to ensure fast memory access. Compared to previously published solutions, significantly larger problems can be handled using low cost commodity GPUs with limited amount of on-board memory.</p></blockquote>
<p>(A. Dziekonski, A. Lamecki, and M. Mrozowski: &#8220;<em>A memory efficient and fast sparse matrix vector product on a GPU</em>&#8221;, Progress In Electromagnetics Research, Vol. 116, 49-63, 2011. [<a href="http://www.jpier.org/pier/pier.php?paper=11031607" target="_blank">PDF</a>])</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2011/05/04/memory-efficient-fast-spmv/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>High Throughput Parallel Molecular Dynamics for GPUs</title>
		<link>http://gpgpu.org/2011/04/06/high-throughput-molecular-dynamics</link>
		<comments>http://gpgpu.org/2011/04/06/high-throughput-molecular-dynamics#comments</comments>
		<pubDate>Wed, 06 Apr 2011 23:37:17 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[Molecular Dynamics]]></category>
		<category><![CDATA[NVIDIA]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Scientific Computing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=3443</guid>
		<description><![CDATA[The North Carolina Renaissance Computing Institute (RENCI) is running Amber PMEMD on the Open Science Grid, the high throughput computing (HTC) fabric used by the Large Hadron Collider (LHC). This approach is likely to be helpful to researchers with any of these challenges: Constrained by limited computing resources including access to GPGPUs Manually executing the [...]]]></description>
			<content:encoded><![CDATA[<p>The North Carolina Renaissance Computing Institute (RENCI) is running Amber PMEMD on the Open Science Grid, the high throughput computing (HTC) fabric used by the Large Hadron Collider (LHC). This approach is likely to be helpful to researchers with any of these challenges:</p>
<ol>
<li>Constrained by limited computing resources including access to GPGPUs</li>
<li>Manually executing the same simulation repeatedly with different parameters</li>
<li>Making simulations easier to understand, share, scale and re-use across compute resources</li>
</ol>
<p>For more information see these two blog posts: <a href="http://osglog.wordpress.com/2010/11/04/high-throughput-parallel-molecular-dynamics" target="_blank">High Throughput Parallel Molecular Dynamics</a> and <a href="http://osglog.wordpress.com/2011/02/02/amber11-pmemd-for-nvidia-gpgpu" target="_blank">CUDA/Tesla Accelerated PMEMD on OSG</a>. Contact Steve Cox (scox@renci.org) if you&#8217;d like to discuss further and determine if your application is a fit. If it is, RENCI can provide access to the grid as well as tools for executing and managing simulations.</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2011/04/06/high-throughput-molecular-dynamics/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>IMPETUS Afea Solver: A novel Finite Element code adapted to GPU technology</title>
		<link>http://gpgpu.org/2010/10/16/impetus-afea-solver</link>
		<comments>http://gpgpu.org/2010/10/16/impetus-afea-solver#comments</comments>
		<pubDate>Sat, 16 Oct 2010 08:40:34 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Finite Element Methods]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>
		<category><![CDATA[Physics Simulation]]></category>
		<category><![CDATA[Scientific Computing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=2874</guid>
		<description><![CDATA[IMPETUS Afea is proud to announce the launch of IMPETUS Afea Solver (version 1.0). The IMPETUS Afea Solver is a non-linear explicit finite element tool. It is developed to predict large deformations of structures and components exposed to extreme loading conditions. The tool is applicable to transient dynamics and quasi-static loading conditions. The primary focus of [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: left;"><a href="http://www.impetus-afea.com" target="_blank">IMPETUS Afea</a> is proud to announce the launch of IMPETUS Afea Solver (version 1.0).</p>
<p>The IMPETUS Afea Solver is a non-linear explicit finite element tool. It was developed to predict large deformations of structures and components exposed to extreme loading conditions. The tool is applicable to transient dynamics and quasi-static loading conditions. The primary focus of the IMPETUS Afea Solver is accuracy, robustness and simplicity for the user. The number of purely numerical parameters that the user has to provide as input is kept to a minimum. The IMPETUS Afea Solver is adapted to GPU technology; using the computational power of a capable graphics card can considerably speed up your calculations.</p>
<p><a href="http://www.youtube.com/watch?v=NrvuFiDqn5A">IMPETUS Afea Solver Video on YouTube</a></p>
<p>For more information or requests, please contact sales@impetus-afea.com.</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2010/10/16/impetus-afea-solver/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>High-order finite-element seismic wave propagation modeling with MPI on a large GPU cluster</title>
		<link>http://gpgpu.org/2010/06/23/high-order-finite-element-seismic-wave</link>
		<comments>http://gpgpu.org/2010/06/23/high-order-finite-element-seismic-wave#comments</comments>
		<pubDate>Thu, 24 Jun 2010 03:19:58 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[Clusters]]></category>
		<category><![CDATA[Finite Element Methods]]></category>
		<category><![CDATA[High-Performance Computing]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>
		<category><![CDATA[Papers]]></category>
		<category><![CDATA[Scientific Computing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=2491</guid>
		<description><![CDATA[Abstract: We implement a high-order finite-element application, which performs the numerical simulation of seismic wave propagation resulting for instance from earthquakes at the scale of a continent or from active seismic acquisition experiments in the oil industry, on a large cluster of NVIDIA Tesla graphics cards using the CUDA programming environment and non-blocking message passing [...]]]></description>
			<content:encoded><![CDATA[<p>Abstract:</p>
<blockquote><p>We implement a high-order finite-element application, which performs the numerical simulation of seismic wave propagation resulting for instance from earthquakes at the scale of a continent or from active seismic acquisition experiments in the oil industry, on a large cluster of NVIDIA Tesla graphics cards using the CUDA programming environment and non-blocking message passing based on MPI. Contrary to many finite-element implementations, ours is implemented successfully in single precision, maximizing the performance of current generation GPUs. We discuss the implementation and optimization of the code and compare it to an existing very optimized implementation in C language and MPI on a classical cluster of CPU nodes. We use mesh coloring to efficiently handle summation operations over degrees of freedom on an unstructured mesh, and non-blocking MPI messages in order to overlap the communications across the network and the data transfer to and from the device via PCIe with calculations on the GPU. We perform a number of numerical tests to validate the single-precision CUDA and MPI implementation and assess its accuracy. We then analyze performance measurements and depending on how the problem is mapped to the reference CPU cluster, we obtain a speedup of 20x or 12x.</p></blockquote>
<p>(<a href="http://www.univ-pau.fr/~dkomati1" target="_blank">Dimitri Komatitsch</a>, <a href="http://www.sc.fsu.edu/~erlebach" target="_blank">Gordon Erlebacher</a>, <a href="http://www.mathematik.tu-dortmund.de/~goeddeke" target="_blank">Dominik Göddeke</a> and David Michéa: <em>&#8220;High-order finite-element seismic wave propagation modeling with MPI on a large GPU cluster&#8221;</em>, accepted for publication in: Journal of Computational Physics, Jun. 2010. <a href="http://web.univ-pau.fr/~dkomati1/published_papers/JCP_multiGPUs_2010.pdf" target="_blank">PDF preprint</a>. <a href="http://dx.doi.org/10.1016/j.jcp.2010.06.024" target="_blank">DOI link</a>.)</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2010/06/23/high-order-finite-element-seismic-wave/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

