<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>GPGPU &#187; Tag: Supercomputing :: GPGPU.org</title>
	<atom:link href="http://gpgpu.org/tag/supercomputing/feed" rel="self" type="application/rss+xml" />
	<link>http://gpgpu.org</link>
	<description>General-Purpose Computation on Graphics Hardware</description>
	<lastBuildDate>Tue, 22 May 2012 08:44:05 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Accelerate Your Science on the Titan Supercomputer</title>
		<link>http://gpgpu.org/2012/04/01/accelerate-your-science-titan-supercomputer</link>
		<comments>http://gpgpu.org/2012/04/01/accelerate-your-science-titan-supercomputer#comments</comments>
		<pubDate>Mon, 02 Apr 2012 02:35:39 +0000</pubDate>
		<dc:creator>Mark Harris</dc:creator>
				<category><![CDATA[Developer Resources]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Call for Proposals]]></category>
		<category><![CDATA[Supercomputing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=4608</guid>
		<description><![CDATA[Accelerate your science on the Titan Supercomputer later this year, by harnessing up to 20 petaflops of parallel processing using GPUs. Open to researchers from academia, government labs, and industry, the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program is the major means by which the scientific community gains access to some of [...]]]></description>
			<content:encoded><![CDATA[<p>Accelerate your science on the Titan Supercomputer later this year, by harnessing up to 20 petaflops of parallel processing using GPUs. Open to researchers from academia, government labs, and industry, the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program is the major means by which the scientific community gains access to some of the fastest supercomputers.</p>
<p>First, let INCITE know you are interested in GPU acceleration by completing a <a href="https://hpc.science.doe.gov/allocations/incite/" target="_blank">two-minute survey</a>. Then determine if you want to submit a formal proposal by June 27, 2012.</p>
<p>Need help drafting your proposal? <a href="https://www.alcf.anl.gov/incite2013" target="_blank">Attend a “how-to” webinar</a> on Tuesday, April 24 to learn tips and tricks for drafting your proposal. For further questions about the call for proposals, please contact the INCITE manager at INCITE@DOEleadershipcomputing.org.</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2012/04/01/accelerate-your-science-titan-supercomputer/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>3 of the 5 fastest supercomputers in the world use GPUs</title>
		<link>http://gpgpu.org/2010/11/17/gpus-in-3-of-5-fastest-supercomputers</link>
		<comments>http://gpgpu.org/2010/11/17/gpus-in-3-of-5-fastest-supercomputers#comments</comments>
		<pubDate>Thu, 18 Nov 2010 04:10:02 +0000</pubDate>
		<dc:creator>Mark Harris</dc:creator>
				<category><![CDATA[Press]]></category>
		<category><![CDATA[NVIDIA Tesla]]></category>
		<category><![CDATA[Supercomputing]]></category>
		<category><![CDATA[Top500]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=2986</guid>
		<description><![CDATA[The latest Top 500 list of the world&#8217;s fastest supercomputers, released November 15th, demonstrates that GPUs are being adopted on a large scale in the HPC space.  Three out of the top 5 machines (#1 and #3 in China, and #4 in Japan) feature NVIDIA Tesla GPUs.  Also, the list confirms the expected result that [...]]]></description>
			<content:encoded><![CDATA[<p>The latest Top 500 list of the world&#8217;s fastest supercomputers, released November 15th, demonstrates that GPUs are being adopted on a large scale in the HPC space.  Three out of the top 5 machines (<a href="http://top500.org/system/10587">#1</a> and <a href="http://top500.org/system/10484" target="_blank">#3</a> in China, and <a href="http://top500.org/system/10588">#4</a> in Japan) feature NVIDIA Tesla GPUs.  Also, the list confirms the expected result that the new GPU-based <a href="http://top500.org/system/10587">Tianhe-1a</a> machine from China has ousted <a href="http://top500.org/system/10184">Jaguar</a> from the top spot.</p>
<p><a href="http://top500.org/lists/2010/11/press-release">More details at top500.org</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2010/11/17/gpus-in-3-of-5-fastest-supercomputers/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NVIDIA Tesla GPUs Power World&#8217;s Fastest Supercomputer</title>
		<link>http://gpgpu.org/2010/10/28/nvidia-tesla-gpus-power-worlds-fastest-supercomputer</link>
		<comments>http://gpgpu.org/2010/10/28/nvidia-tesla-gpus-power-worlds-fastest-supercomputer#comments</comments>
		<pubDate>Fri, 29 Oct 2010 00:08:43 +0000</pubDate>
		<dc:creator>Mark Harris</dc:creator>
				<category><![CDATA[Press]]></category>
		<category><![CDATA[NVIDIA]]></category>
		<category><![CDATA[Supercomputing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/2010/10/28/nvidia-tesla-gpus-power-worlds-fastest-supercomputer</guid>
		<description><![CDATA[From a press release: SANTA CLARA, CA &#8212; (Marketwire) &#8212; 10/28/2010 &#8212; Tianhe-1A, a new supercomputer revealed today at HPC 2010 China, has set a new performance record of 2.507 petaflops, as measured by the LINPACK benchmark, making it the fastest system in China and in the world today. Tianhe-1A epitomizes modern heterogeneous computing by [...]]]></description>
			<content:encoded><![CDATA[<p>From a <a href="http://pressroom.nvidia.com/easyir/customrel.do?easyirid=A0D622CE9F579F09&#038;version=live&#038;prid=678988&#038;releasejsp=release_157">press release</a>:</p>
<blockquote><p>SANTA CLARA, CA &#8212; (Marketwire) &#8212; 10/28/2010 &#8212; Tianhe-1A, a new supercomputer revealed today at <a href="http://www.bcc.ac.cn/hpc/index.html">HPC 2010 China</a>, has set a new performance record of 2.507 petaflops, as measured by the LINPACK benchmark, making it the fastest system in China and in the world today.</p>
<p>Tianhe-1A epitomizes modern heterogeneous computing by coupling massively parallel GPUs with multi-core CPUs, enabling significant achievements in performance, size and power. The system uses 7,168 NVIDIA® Tesla™ M2050 GPUs and 14,336 CPUs; it would require more than 50,000 CPUs and twice as much floor space to deliver the same performance using CPUs alone.<br />
<span id="more-2931"></span></p>
<p>More importantly, a 2.507 petaflop system built entirely with CPUs would consume more than 12 megawatts. Thanks to the use of GPUs in a heterogeneous computing environment, Tianhe-1A consumes only 4.04 megawatts, making it 3 times more power efficient &#8212; the difference in power consumption is enough to provide electricity to over 5000 homes for a year.</p>
<p>Tianhe-1A was designed by the National University of Defense Technology (NUDT) in China. The system is housed at National Supercomputer Center in Tianjin and is already fully operational.</p>
<p>&#8220;The performance and efficiency of Tianhe-1A was simply not possible without GPUs,&#8221; said Guangming Liu, chief of National Supercomputer Center in Tianjin. &#8220;The scientific research that is now possible with a system of this scale is almost without limits; we could not be more pleased with the results.&#8221;</p>
<p>The Tianhe-1A supercomputer will be operated as an open access system to use for large scale scientific computations.</p>
<p>&#8220;GPUs are redefining high performance computing,&#8221; said Jen-Hsun Huang, president and CEO of NVIDIA. &#8220;With the Tianhe-1A, GPUs now power two of the top three fastest computers in the world today. These GPU supercomputers are essential tools for scientists looking to turbocharge their rate of discovery.&#8221;</p>
<p>NVIDIA Tesla GPUs, based on the CUDA™ parallel computing architecture, are designed specifically for high performance computing (HPC) environments and deliver transformative performance increases across a wide range of HPC fields, including drug discovery, hurricane and tsunami modeling, cancer research, car design, even studying the formation of galaxies.</p>
<p>For more information on NVIDIA Tesla high performance GPU computing products, go <a href="http://www.nvidia.com/object/tesla_computing_solutions.html">here</a>.</p></blockquote>
<p><br/><br/><img src="http://gpgpu.org/wp/wp-content/uploads/2010/10/20101029-081101.jpg" alt="" width="240" height="180" class="alignnone size-full" /></p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2010/10/28/nvidia-tesla-gpus-power-worlds-fastest-supercomputer/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>GPU Supercomputer #2 in Top500</title>
		<link>http://gpgpu.org/2010/05/31/gpu-supercomputer-2-in-top500</link>
		<comments>http://gpgpu.org/2010/05/31/gpu-supercomputer-2-in-top500#comments</comments>
		<pubDate>Tue, 01 Jun 2010 01:49:29 +0000</pubDate>
		<dc:creator>Mark Harris</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[Press]]></category>
		<category><![CDATA[Supercomputing]]></category>
		<category><![CDATA[Top500]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=2369</guid>
		<description><![CDATA[The June 2010 Top500 list of the world&#8217;s fastest supercomputers was released this week at ISC 2010.  While the US Jaguar supercomputer (located at the Department of Energy&#8217;s Oak Ridge Leadership Computing Facility) retained the top spot in Linpack performance, a Chinese cluster called Nebulae, built from a Dawning TC3600 Blade system with Intel X5650 processors [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://top500.org/lists/2010/06" target="_blank">June 2010 Top500 list</a> of the world&#8217;s fastest supercomputers was released this week at <a href="http://www.supercomp.de/isc10/" target="_blank">ISC 2010</a>.  While the US Jaguar supercomputer (located at the Department of Energy&#8217;s Oak Ridge Leadership Computing Facility) retained the top spot in Linpack performance, a Chinese cluster called Nebulae, built from a Dawning TC3600 Blade system with <a href="http://ark.intel.com/Product.aspx?id=47922" target="_blank">Intel X5650</a> processors and <a href="http://www.nvidia.com/object/product_tesla_C2050_C2070_us.html" target="_blank">NVIDIA Tesla C2050 GPUs</a>, is now the fastest in theoretical peak performance at 2.98 PFlop/s and ranks No. 2 with a Linpack performance of 1.271 PFlop/s. This is the highest rank a GPU-accelerated system, or a Chinese system, has ever achieved on the Top500 list.</p>
<p>For more information, visit <a href="http://www.top500.org/" target="_blank">www.TOP500.org</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2010/05/31/gpu-supercomputer-2-in-top500/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Supercomputing 2009 birds-of-a-feather session on &#8220;The Art of Performance Tuning for CUDA and Manycore Architectures&#8221;</title>
		<link>http://gpgpu.org/2009/12/02/supercomputing-2009-performance-tuning-for-cuda</link>
		<comments>http://gpgpu.org/2009/12/02/supercomputing-2009-performance-tuning-for-cuda#comments</comments>
		<pubDate>Thu, 03 Dec 2009 00:41:17 +0000</pubDate>
		<dc:creator>Mark Harris</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[Birds-of-a-Feather]]></category>
		<category><![CDATA[Conferences]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>
		<category><![CDATA[Supercomputing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=2036</guid>
		<description><![CDATA[High throughput architectures for HPC seem likely to emphasize many cores with deep multithreading, wide SIMD, and sophisticated memory hierarchies. GPUs present one example, and their high throughput has led a number of researchers to port computationally intensive applications to NVIDIA&#8217;s CUDA architecture. This session explored the art of performance tuning for CUDA using several [...]]]></description>
			<content:encoded><![CDATA[<p>High throughput architectures for HPC seem likely to emphasize many cores with deep multithreading, wide SIMD, and sophisticated memory hierarchies. GPUs present one example, and their high throughput has led a number of researchers to port computationally intensive applications to NVIDIA&#8217;s CUDA architecture.</p>
<p><a href="http://www.cs.virginia.edu/~skadron/Papers/cuda_tuning_bof_sc09_final.pdf" target="_blank">This session</a> explored the art of performance tuning for CUDA using several case studies. Topics included profiling to identify bottlenecks, effective use of the GPU&#8217;s memory hierarchy and DRAM interface to maximize bandwidth, data versus task parallelism, and avoiding SIMD divergence.  Many of the lessons learned in the context of CUDA are likely to apply to other many-core architectures used in HPC applications.</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2009/12/02/supercomputing-2009-performance-tuning-for-cuda/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Supercomputing 2009 Tutorial: High-Performance Computing with CUDA</title>
		<link>http://gpgpu.org/2009/11/30/sc2009-cuda-tutorial</link>
		<comments>http://gpgpu.org/2009/11/30/sc2009-cuda-tutorial#comments</comments>
		<pubDate>Tue, 01 Dec 2009 04:54:34 +0000</pubDate>
		<dc:creator>Mark Harris</dc:creator>
				<category><![CDATA[Developer Resources]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[Conferences]]></category>
		<category><![CDATA[High-Performance Computing]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>
		<category><![CDATA[Supercomputing]]></category>
		<category><![CDATA[Tutorials & Courses]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=1994</guid>
		<description><![CDATA[The presentation slides from the Supercomputing 2009 full-day tutorial &#8220;High-Performance Computing with CUDA&#8221; are now available at http://gpgpu.org/sc2009. Abstract: NVIDIA’s CUDA is a general-purpose architecture for writing highly parallel applications. CUDA provides several key abstractions—a hierarchy of thread blocks, shared memory, and barrier synchronization—for scalable high-performance parallel computing. Scientists throughout industry and academia use CUDA [...]]]></description>
			<content:encoded><![CDATA[<p>The presentation slides from the <a href="http://sc09.supercomputing.org/" target="_blank">Supercomputing 2009</a> full-day tutorial &#8220;High-Performance Computing with CUDA&#8221; are now available at <a href="http://gpgpu.org/sc2009">http://gpgpu.org/sc2009</a>.</p>
<p>Abstract:</p>
<blockquote><p>NVIDIA’s CUDA is a general-purpose architecture for writing highly parallel applications. CUDA provides several key abstractions—a hierarchy of thread blocks, shared memory, and barrier synchronization—for scalable high-performance parallel computing. Scientists throughout industry and academia use CUDA to achieve dramatic speedups on production and research codes. The CUDA architecture supports many languages, programming environments, and libraries including C, Fortran, OpenCL, DirectX Compute, Python, Matlab, FFT, LAPACK, etc.</p>
<p>In this tutorial NVIDIA engineers will partner with academic and industrial researchers to present CUDA and discuss its advanced use for science and engineering domains. The morning session will introduce CUDA programming, motivate its use with many brief examples from different HPC domains, and discuss tools and programming environments. The afternoon will discuss advanced issues such as optimization and sophisticated algorithms/data structures, closing with real-world case studies from domain scientists using CUDA for computational biophysics, fluid dynamics, seismic imaging, and theoretical physics.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2009/11/30/sc2009-cuda-tutorial/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CfP: International Conference on Supercomputing (ICS&#8217;10)</title>
		<link>http://gpgpu.org/2009/11/30/cfp-ics2010</link>
		<comments>http://gpgpu.org/2009/11/30/cfp-ics2010#comments</comments>
		<pubDate>Tue, 01 Dec 2009 00:58:04 +0000</pubDate>
		<dc:creator>Mark Harris</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Conferences]]></category>
		<category><![CDATA[High-Performance Computing]]></category>
		<category><![CDATA[Supercomputing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=1976</guid>
		<description><![CDATA[24th International Conference on Supercomputing (ICS&#8217;10) June 1-4, 2010 Epochal Tsukuba (Tsukuba International Congress Center) Tsukuba, Japan Sponsored by ACM/SIGARCH ICS is the premier international forum for the presentation of research results in high-performance computing systems.  In 2010 the conference will be held at the Epochal Tsukuba (Tsukuba International Congress Center) in Tsukuba City, the [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.ics-conference.org/" target="_blank">24th International Conference on Supercomputing (ICS&#8217;10)</a><br />
June 1-4, 2010<br />
<a href="http://www.epochal.or.jp/eng/" target="_blank">Epochal Tsukuba (Tsukuba International Congress Center)</a><br />
Tsukuba, Japan<br />
Sponsored by ACM/SIGARCH</p>
<p>ICS is the premier international forum for the presentation of research results in high-performance computing systems.  In 2010 the conference will be held at the Epochal Tsukuba (Tsukuba International Congress Center) in Tsukuba City, the largest high-tech and academic city in Japan.</p>
<p>Papers are solicited on all aspects of research, development, and application of high-performance experimental and commercial systems. Special emphasis will be given to work that leads to better understanding of the implications of the new era of million-scale parallelism and Exa-scale performance; including (but not limited to):<span id="more-1976"></span></p>
<ul>
<li>Computationally challenging scientific and commercial applications: studies and experiences to exploit ultra large scale parallelism, a large number of accelerators, and/or cloud computing paradigm.</li>
<li>High-performance computational and programming models: studies and proposals of new models, paradigms and languages for scalable application development, seamless exploitation of accelerators, and grid/cloud computing.</li>
<li>Architecture and hardware aspects: processor, accelerator, memory, interconnection network, storage and I/O architecture to make future systems scalable, reliable and power efficient.</li>
<li>Software aspects: compilers and runtime systems, programming and development tools, middleware and operating systems to enable us to scale applications and systems easily, efficiently and reliably.</li>
<li>Performance evaluation studies and theoretical underpinnings of any of the above topics, especially those giving us perspective toward future generation high-performance computing.</li>
<li>Large scale installations in the Petaflop era: design, scaling, power, and reliability, including case studies and experience reports, to show the baselines for future systems.</li>
</ul>
<p>In order to encourage open discussion on future directions, the program committee will provide higher priority for papers that present highly innovative and challenging ideas.</p>
<p>Papers should not exceed 6,000 words, and should be submitted electronically, in PDF format using the ICS&#8217;10 submission web site. Submissions should be blind.  The review process will include a rebuttal period. Please refer to the ICS&#8217;10 web site for detailed instructions.</p>
<p>Workshop and tutorial proposals are also solicited and are due by January 18, 2010.  For further information and future updates, refer to the ICS&#8217;10 web site at <a href="http://www.ics-conference.org/" target="_blank">http://www.ics-conference.org</a> or contact the General Chair (<a href="mailto:ics10-chair@hpcs.cs.tsukuba.ac.jp">ics10-chair@hpcs.cs.tsukuba.ac.jp</a>) or Program Co-Chairs (<a href="mailto:ics10-chairs@ac.upc.edu">ics10-chairs@ac.upc.edu</a>).</p>
<p><strong>Important Dates</strong></p>
<ul>
<li>Abstract submission:  January 11, 2010</li>
<li>Paper submission:     January 18, 2010</li>
<li>Author notification:  March 22, 2010</li>
<li>Final papers:         April 15, 2010</li>
</ul>
<p>For more information, please visit the conference web site at <a href="http://www.ics-conference.org/" target="_blank">http://www.ics-conference.org</a></p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2009/11/30/cfp-ics2010/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using Many-Core Hardware to Correlate Radio Astronomy Signals</title>
		<link>http://gpgpu.org/2009/08/26/many-core-radio-astronomy</link>
		<comments>http://gpgpu.org/2009/08/26/many-core-radio-astronomy#comments</comments>
		<pubDate>Thu, 27 Aug 2009 02:51:21 +0000</pubDate>
		<dc:creator>Mark Harris</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[Astronomy]]></category>
		<category><![CDATA[Cross-correlation]]></category>
		<category><![CDATA[Radio Astronomy]]></category>
		<category><![CDATA[Supercomputing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=1821</guid>
		<description><![CDATA[Abstract: A recent development in radio astronomy is to replace traditional dishes with many small antennas. The signals are combined to form one large, virtual telescope. The enormous data streams are cross-correlated to filter out noise. This is especially challenging, since the computational demands grow quadratically with the number of data streams. Moreover, the correlator is not only computationally intensive, [...]]]></description>
			<content:encoded><![CDATA[<p>Abstract:</p>
<blockquote><p>A recent development in radio astronomy is to replace traditional dishes with many small antennas. The signals are combined to form one large, virtual telescope.  The enormous data streams are cross-correlated to filter out noise.  This is especially challenging, since the computational demands grow quadratically with the number of data streams. Moreover, the correlator is not only computationally intensive, but also very I/O intensive. The LOFAR telescope, for instance, will produce over 100 terabytes per day. The future SKA telescope will even require on the order of exaflops, and petabits/s of I/O.  A recent trend is to correlate in software instead of dedicated hardware.  This is done to increase flexibility and to reduce development efforts.  Examples include e-VLBI and LOFAR.</p>
<p>In this paper, we evaluate the correlator algorithm on multi-core CPUs and many-core architectures, such as NVIDIA and ATI GPUs, and the Cell/B.E.  The correlator is a streaming, real-time application, and is much more I/O intensive than applications that are typically implemented on many-core hardware today.  We compare with the LOFAR production correlator on an IBM Blue Gene/P supercomputer. We investigate performance, power efficiency, and programmability.  We identify several important architectural problems which cause architectures to perform suboptimally.  Our findings are applicable to data-intensive applications in general.<span id="more-1821"></span></p></blockquote>
<blockquote><p>The results show that the processing power and memory bandwidth of current GPUs are highly imbalanced for correlation purposes.  While the production correlator on the Blue Gene/P achieves a superb 96% of the theoretical peak performance, this is only 14% on ATI GPUs, and 26% on NVIDIA GPUs. The Cell/B.E. processor, in contrast, achieves an excellent 92%. We found that the Cell/B.E. is also the most energy-efficient solution; it runs the correlator 5-7 times more energy efficiently than the Blue Gene/P.  The research presented is an important pathfinder for next-generation telescopes.</p></blockquote>
<p>(Rob V. van Nieuwpoort and John W. Romein. &#8220;<a href="http://www.astron.nl/~nieuwpoort/papers/ics09-correlator.pdf" target="_blank">Using Many-Core Hardware to Correlate Radio Astronomy Signals</a>&#8221;, Proceedings of the ACM International Conference on Supercomputing (ICS&#8217;09), pp. 440-449, June 8-12, 2009, Yorktown Heights, New York, USA.)</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2009/08/26/many-core-radio-astronomy/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Implementing Sparse Matrix-Vector Multiplication on Throughput-Oriented Processors</title>
		<link>http://gpgpu.org/2009/08/23/sparse-matrix-supercomputing09</link>
		<comments>http://gpgpu.org/2009/08/23/sparse-matrix-supercomputing09#comments</comments>
		<pubDate>Mon, 24 Aug 2009 01:18:54 +0000</pubDate>
		<dc:creator>Mark Harris</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[NVIDIA CUDA]]></category>
		<category><![CDATA[Papers]]></category>
		<category><![CDATA[Sparse Linear Systems]]></category>
		<category><![CDATA[Supercomputing]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=1818</guid>
		<description><![CDATA[Abstract: Sparse matrix-vector multiplication (SpMV) is of singular importance in sparse linear algebra. In contrast to the uniform regularity of dense linear algebra, sparse operations encounter a broad spectrum of matrices ranging from the regular to the highly irregular. Harnessing the tremendous potential of throughput-oriented processors for sparse operations requires that we expose substantial fine-grained [...]]]></description>
			<content:encoded><![CDATA[<p>Abstract:</p>
<blockquote><p>Sparse matrix-vector multiplication (SpMV) is of singular importance in sparse linear algebra. In contrast to the uniform regularity of dense linear algebra, sparse operations encounter a broad spectrum of matrices ranging from the regular to the highly irregular. Harnessing the tremendous potential of throughput-oriented processors for sparse operations requires that we expose substantial fine-grained parallelism and impose sufficient regularity on execution paths and memory access patterns. We explore SpMV methods that are well-suited to throughput-oriented architectures like the GPU and which exploit several common sparsity classes. The techniques we propose are efficient, successfully utilizing large percentages of peak bandwidth. Furthermore, they deliver excellent total throughput, averaging 16 GFLOP/s and 10 GFLOP/s in double precision for structured grid and unstructured mesh matrices, respectively, on a GeForce GTX 285. This is roughly 2.8 times the throughput previously achieved on Cell BE and more than 10 times that of a quad-core Intel Clovertown system.</p></blockquote>
<p>(&#8220;<a href="http://www.nvidia.com/object/nvidia_research_pub_013.html" target="_blank">Implementing Sparse Matrix-Vector Multiplication on Throughput-Oriented Processors</a>&#8221;. Nathan Bell and Michael Garland, in <em>&#8220;Proc. Supercomputing &#8217;09&#8221;</em>, August 2009.)</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2009/08/23/sparse-matrix-supercomputing09/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Path to Petascale: Adapting GEO/CHEM/ASTRO Applications for Accelerators and Accelerator Clusters</title>
		<link>http://gpgpu.org/2009/06/04/path-to-petascale</link>
		<comments>http://gpgpu.org/2009/06/04/path-to-petascale#comments</comments>
		<pubDate>Thu, 04 Jun 2009 23:23:19 +0000</pubDate>
		<dc:creator>dom</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Astronomy]]></category>
		<category><![CDATA[Astrophysics]]></category>
		<category><![CDATA[Clusters]]></category>
		<category><![CDATA[Computational Chemistry]]></category>
		<category><![CDATA[geosciences]]></category>
		<category><![CDATA[Supercomputing]]></category>
		<category><![CDATA[Workshops]]></category>

		<guid isPermaLink="false">http://gpgpu.org/?p=1642</guid>
		<description><![CDATA[The goal of this workshop, held at the National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, was to help computational scientists in the geosciences, computational chemistry, and astronomy and astrophysics communities take full advantage of emerging high-performance computing resources based on computational accelerators, such as clusters with GPUs and Cell processors. Slides are [...]]]></description>
			<content:encoded><![CDATA[<p>The goal of this workshop, held at the National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, was to help computational scientists in the geosciences, computational chemistry, and astronomy and astrophysics communities take full advantage of emerging high-performance computing resources based on computational accelerators, such as clusters with GPUs and Cell processors.</p>
<p>Slides are now <a href="http://www.ncsa.uiuc.edu/Conferences/accelerators/agenda.html" target="_blank">available online</a> and cover a wide range of topics including</p>
<ul>
<li>GPU and Cell programming tutorials</li>
<li>GPU and Cell technology</li>
<li>Accelerator programming, clusters, frameworks and building blocks such as sparse matrix-vector products, tree-based algorithms and in particular accelerator integration into large-scale established code bases</li>
<li>Case studies and posters from geosciences, computational chemistry and astronomy/astrophysics such as the simulation of earthquakes, molecular dynamics, solar radiation, tsunamis, weather predictions, climate modeling and n-body systems as well as Monte-Carlo, Euler, Navier-Stokes and Lattice-Boltzmann type of simulations</li>
</ul>
<p>(National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign: <a href="http://www.ncsa.uiuc.edu/Conferences/accelerators/agenda.html" target="_blank">Path to Petascale workshop presentations</a>, organized by Wen-mei Hwu, Volodymyr Kindratenko, Robert Wilhelmson, Todd Martínez and Robert Brunner)</p>
]]></content:encoded>
			<wfw:commentRss>http://gpgpu.org/2009/06/04/path-to-petascale/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

