January 26th, 2012
Today NVIDIA released CUDA 4.1, including a new CUDA Toolkit, SDK, Visual Profiler, Parallel Nsight IDE and NVIDIA device driver.
CUDA 4.1 makes it easier to accelerate scientific research on GPUs, with key features including:
- a redesigned Visual Profiler with automated performance analysis and expert guidance;
- a new LLVM-based compiler that generates up to 10% faster code; and
- 1000+ new imaging and signal processing functions in the NPP library.
The cuSPARSE library included with CUDA 4.1 has a new tridiagonal solver and 2x faster sparse matrix-vector multiplication using the ELL hybrid format. The cuRAND library included with CUDA 4.1 has two new random number generators.
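For readers unfamiliar with the ELL (ELLPACK) layout mentioned above: it stores a sparse matrix as two fixed-width arrays, one of column indices and one of values, padding short rows so every row has the same number of slots. The following is a minimal plain-Python sketch of that idea only. It is not the cuSPARSE API, the function names are hypothetical, and the hybrid format additionally keeps overflow entries in a separate structure, which is not shown here.

```python
# Illustrative only: plain-Python ELL (ELLPACK) storage and y = A @ x.
# Function names are hypothetical; this is not the cuSPARSE API.

PAD = -1  # column index marking an unused (padded) slot


def dense_to_ell(dense):
    """Convert a dense row-major matrix to ELL: fixed-width column/value arrays."""
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    width = max((len(r) for r in rows), default=0)
    cols = [[j for j, _ in r] + [PAD] * (width - len(r)) for r in rows]
    vals = [[v for _, v in r] + [0.0] * (width - len(r)) for r in rows]
    return cols, vals


def ell_spmv(cols, vals, x):
    """Multiply an ELL matrix by vector x; every row visits the same number of slots."""
    y = []
    for row_cols, row_vals in zip(cols, vals):
        acc = 0.0
        for j, v in zip(row_cols, row_vals):
            if j != PAD:  # skip padding slots
                acc += v * x[j]
        y.append(acc)
    return y


A = [[4.0, 0.0, 1.0],
     [0.0, 3.0, 0.0],
     [2.0, 0.0, 5.0]]
cols, vals = dense_to_ell(A)
y = ell_spmv(cols, vals, [1.0, 2.0, 3.0])
```

Because every row has the same width, each GPU thread can process one row with a regular, predictable memory access pattern, which is the usual motivation for the format.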
Posted in Developer Resources | Tags: Compilers, Debugging, NVIDIA CUDA, Profiling, Programming Languages | Write a comment
January 16th, 2012
CLCC, the lightweight and flexible utility for integrating OpenCL source builds into your project, has just been updated to version 0.3.0. This version allows developers to save compiled binaries as object files for distribution with their programs and adds a series of options to select specific target platform/device combinations. Documentation and further information are available at http://clcc.sourceforge.net.
Posted in Developer Resources | Tags: Compilers, Open Source, OpenCL | Write a comment
December 7th, 2011
A major new release of the Intel SPMD Program Compiler (ispc) was posted on December 5, 2011. ispc is an extended version of the C programming language with support for “single program, multiple data” (SPMD) programming on the CPU; the SPMD model makes it easy to harness the full power of both the SIMD vector units and multiple cores on modern CPUs. The major features added in the 1.1 release include:
- Full support for pointers, including pointer arithmetic, function pointers, and all other features of pointers in C.
- A new parallel “foreach” statement, for more easily mapping computation to data.
- Substantially revised documentation, including a new Performance Guide.
- Many other small bug fixes and improvements.
ispc is open-source and is licensed under the BSD license. Source and binaries are available from http://ispc.github.com.
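To give a rough feel for the SPMD model behind a "foreach" statement, here is a small plain-Python emulation. It is a sketch under stated assumptions, not ispc code: `GANG_WIDTH` and `foreach_spmd` are hypothetical names, and the inner lane loop stands in for work that SIMD hardware would perform across all lanes at once.

```python
# Illustrative only: emulates how an SPMD "foreach" over [start, end) can be
# executed by a fixed-width gang of program instances. GANG_WIDTH and
# foreach_spmd are hypothetical names, not part of ispc.

GANG_WIDTH = 4  # e.g. one program instance per lane of a 4-wide SSE unit


def foreach_spmd(start, end, body):
    """Call body(i) for each i in [start, end), one gang-sized chunk at a time."""
    steps = 0
    for base in range(start, end, GANG_WIDTH):
        # Each lane handles one index; lanes past `end` are masked off.
        mask = [base + lane < end for lane in range(GANG_WIDTH)]
        for lane in range(GANG_WIDTH):  # on real hardware the lanes run together
            if mask[lane]:
                body(base + lane)
        steps += 1
    return steps


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
out = [0.0] * len(a)


def scale(i):
    out[i] = 2.0 * a[i]


gang_steps = foreach_spmd(0, len(a), scale)
```

Six elements are covered in two gang steps here; the second step runs with only two of its four lanes active.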
Posted in Developer Resources | Tags: Compilers, Intel, Open Source, SPMD | 2 Comments
June 26th, 2011
Microsoft has announced that the next version of Visual Studio will contain technology labeled C++ Accelerated Massive Parallelism (C++ AMP) to enable C++ developers to take advantage of the GPU for computation purposes. More information is available in two posts on the MSDN blogs.
Posted in Business, Developer Resources | Tags: Compilers, Microsoft, Programming Languages | 1 Comment
June 26th, 2011
Intel has announced ispc, The Intel SPMD Program Compiler, now available in source and binary form from http://ispc.github.com.
ispc is a new compiler for “single program, multiple data” (SPMD) programs, the same model that is used for (GP)GPU programming, but here targeted to CPUs. ispc compiles a C-based SPMD programming language to run on the SIMD units of CPUs; it frequently provides a 3x or greater speedup on CPUs with 4-wide SSE units, without any of the difficulty of writing intrinsics code. There were a few principles and goals behind the design of ispc:
- To build a small C-like language that would deliver excellent performance to performance-oriented programmers who want to run SPMD programs on the CPU.
- To provide a thin abstraction layer between the programmer and the hardware—in particular, to have an execution and data model where the programmer can cleanly reason about the mapping of their source program to compiled assembly language and the underlying hardware.
- To make it possible to harness the computational power of the SIMD vector units without the extremely low-programmer-productivity activity of directly writing intrinsics.
- To explore opportunities from close coupling between C/C++ application code and SPMD ispc code running on the same processor—to have lightweight function calls between the two languages, to share data directly via pointers without copying or reformatting, and so forth.
ispc is an open source compiler with a BSD license. It uses the LLVM Compiler Infrastructure for back-end code generation and optimization and is hosted on GitHub. It runs on Windows, Mac, and Linux, with both x86 and x86-64 targets. It currently supports the SSE2 and SSE4 instruction sets, though support for AVX should be available soon.
Posted in Developer Resources | Tags: Compilers, Intel, Open Source, Programming Languages, SPMD | Write a comment
July 29th, 2010
Abstract:
Ocelot is a dynamic compilation framework designed to map the explicitly data parallel execution model used by NVIDIA CUDA applications onto diverse multithreaded platforms. Ocelot includes a dynamic binary translator from Parallel Thread eXecution ISA (PTX) to many-core processors that leverages the Low Level Virtual Machine (LLVM) code generator to target x86 and other ISAs. The dynamic compiler is able to execute existing CUDA binaries without recompilation from source and supports switching between execution on an NVIDIA GPU and a many-core CPU at runtime. It has been validated against over 130 applications taken from the CUDA SDK, the UIUC Parboil benchmark, the Virginia Rodinia benchmarks, the GPU-VSIPL signal and image processing library, the Thrust library, and several domain specific applications.
This paper presents a high level overview of the implementation of the Ocelot dynamic compiler highlighting design decisions and trade-offs, and showcasing their effect on application performance. Several novel code transformations are explored that are applicable only when compiling explicitly parallel applications and traditional dynamic compiler optimizations are revisited for this new class of applications. This study is expected to inform the design of compilation tools for explicitly parallel programming models (such as OpenCL) as well as future CPU and GPU architectures.
This paper identifies several key areas of research and open problems for optimizing the performance of data parallel programs (such as CUDA and OpenCL) that were encountered when designing a binary translator from PTX to LLVM/x86. The complete implementation of Ocelot is available open-source under the new BSD license at http://code.google.com/p/gpuocelot. Ongoing work involves translating PTX to AMD’s IL allowing CUDA programs to be executed on AMD GPUs, developing parallel-aware PTX to PTX optimizations, and exploring new programming and execution models that are layered on PTX.
(Gregory Diamos, Andrew Kerr, Sudhakar Yalamanchili and Nathan Clark: “Ocelot: A dynamic compiler for bulk-synchronous applications in heterogeneous systems”. 19th International Conference on Parallel Architectures and Compilation Techniques (PACT 2010), September 2010).
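As a loose illustration of what it means to map CUDA's explicitly data-parallel execution model onto a CPU, the sketch below runs a per-thread kernel over a grid of blocks using ordinary loops. This is an assumption-laden toy, not how Ocelot is implemented: Ocelot translates PTX through LLVM, whereas `launch_on_cpu` and `saxpy_kernel` are hypothetical Python names, and the simple thread loop is only valid for kernels without barriers.

```python
# Illustrative only: runs a per-thread "kernel" over a CUDA-style grid using
# ordinary CPU loops. launch_on_cpu and saxpy_kernel are hypothetical names;
# Ocelot itself works on PTX and LLVM, not Python.


def saxpy_kernel(block_idx, thread_idx, block_dim, n, alpha, x, y):
    """Body executed by one logical thread, as in a CUDA kernel."""
    i = block_idx * block_dim + thread_idx  # global thread index
    if i < n:  # guard the partial last block
        y[i] = alpha * x[i] + y[i]


def launch_on_cpu(kernel, grid_dim, block_dim, *args):
    """Serialize the grid: loop over blocks, then over the threads of each block."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx, block_dim, *args)


n = 10
x = [float(i) for i in range(n)]
y = [1.0] * n
block_dim = 4
grid_dim = (n + block_dim - 1) // block_dim  # ceiling division: 3 blocks
launch_on_cpu(saxpy_kernel, grid_dim, block_dim, n, 2.0, x, y)
```

On a many-core CPU the outer loop over blocks is the natural candidate for distribution across cores, since blocks are independent of one another in the CUDA model.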
Posted in Developer Resources, Research | Tags: Compilers, Heterogeneous Computing, NVIDIA CUDA, Ocelot, Papers | 1 Comment
June 6th, 2010
CAPS has recently added an OpenCL code generator to the just-released version 2.3 of its HMPP directive-based hybrid compiler. The CUDA back-end generator has also been enhanced with Fermi capabilities. In addition, this release supports more native compilers, including Intel ifort/icc, GNU gcc/gfortran and PGI pgcc/pgfort, so developers can use their preferred compiler with HMPP 2.3.
Based on GPU programming and tuning directives, HMPP offers an incremental programming model that allows developers with different levels of expertise to fully exploit GPU hardware accelerators in their legacy code.
Posted in Business, Developer Resources | Tags: Compilers, Programming Environments | Write a comment
April 12th, 2010
Abstract:
A new compilation framework enables the execution of numerical-intensive applications, written in Python, on a hybrid execution environment formed by a CPU and a GPU. This compiler automatically computes the set of memory locations that need to be transferred to the GPU, and produces the correct mapping between the CPU and the GPU address spaces. Thus, the programming model implements a virtual shared address space. This framework is implemented as a combination of unPython, an ahead-of-time compiler from Python/NumPy to the C++ programming language, and jit4GPU, a just-in-time compiler to the AMD CAL interface using CAL pixel shaders. Jit4GPU includes an optimizer that performs several loop transformations and reduces the number of texture instructions. Experimental evaluation was done on a Radeon 4850 and demonstrates that for some benchmarks the generated GPU code is 50 times faster than generated OpenMP code. The GPU performance also compares favorably with optimized CPU BLAS code for single-precision computations in most cases. Code transformations performed by Jit4GPU on GPU code were also shown to produce considerable speedup compared to unoptimized GPU code.
(Rahul Garg and José Nelson Amaral: “Compiling Python to a Hybrid Execution Environment”. Third Workshop on General-Purpose Computation on Graphics Processing Units, held in conjunction with ASPLOS XV, Pittsburgh, PA, March, 2010.)
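The abstract describes a compiler that works out which memory locations must be transferred to the GPU. One simple way to picture that kind of analysis, for loops whose array accesses are affine in the loop index, is sketched below. This is an assumed simplification for illustration and not the analysis implemented in jit4GPU; `access_bounds` and `transfer_slice` are hypothetical names.

```python
# Illustrative only: bounds of the array elements touched by accesses of the
# form arr[a*i + b] for i in range(lo, hi). Names are hypothetical and this
# is not the analysis implemented in jit4GPU.


def access_bounds(a, b, lo, hi):
    """Return (first, last) index touched, or None if the loop is empty."""
    if hi <= lo:
        return None
    ends = (a * lo + b, a * (hi - 1) + b)  # affine, so extremes are at the ends
    return min(ends), max(ends)


def transfer_slice(accesses, lo, hi):
    """Smallest contiguous slice covering every access; accesses = [(a, b), ...]."""
    bounds = [access_bounds(a, b, lo, hi) for a, b in accesses]
    bounds = [bd for bd in bounds if bd is not None]
    if not bounds:
        return None
    return min(f for f, _ in bounds), max(l for _, l in bounds) + 1


# Loop: for i in range(1, 9): out[i] = x[i - 1] + x[i + 1]
x_slice = transfer_slice([(1, -1), (1, 1)], 1, 9)
```

For the stencil loop in the final comment, the reads of `x` span indices 0 through 9, so the half-open slice `(0, 10)` is the portion of `x` that would need to be copied to the device.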
Posted in Research | Tags: AMD CAL, ATI Stream, Compilers, Papers, Python | Write a comment
November 24th, 2009
The Portland Group has announced the general availability of its CUDA Fortran compiler for x64 and x86 processor-based systems running Linux, Mac OS X and Windows, including a 15-day trial license. From the press release:
Developed in collaboration with NVIDIA Corporation (Nasdaq: NVDA), the inventor of the graphics processing unit (GPU), PGI Release 2010 includes the first Fortran compiler compatible with the NVIDIA line of CUDA-enabled GPUs. A compiler is a software tool that translates applications from the high-level programming languages in which they are written by software developers into a binary form a computer can execute.
With developers taking advantage of the hundreds of cores and the relatively low cost of NVIDIA GPUs, programming to take advantage of the CUDA C compiler has become a popular means for accelerating the solution of complex computing problems. The PGI CUDA Fortran compiler is expected to accelerate GPU adoption even further in the High-Performance Computing (HPC) industry, where many important applications are written in Fortran. HPC is the field of technical computing engaged in the modeling and simulation of complex processes, such as ocean modeling, weather forecasting, environmental modeling, seismic analysis, bioinformatics and other areas.
The CUDA Fortran compiler is compatible with all NVIDIA GPUs that include Compute Capability 1.3 or higher, which includes most NVIDIA Quadro Professional Graphics solutions and all NVIDIA Tesla GPU Computing solutions. Developers are invited to download the PGI CUDA Fortran compiler from The Portland Group website at www.pgroup.com/support/downloads.php.
A 15-day trial license is available at no charge. In an effort to simplify adoption, NVIDIA has granted PGI rights to redistribute the relevant components of the CUDA Software Development Kit (SDK) as part of the PGI CUDA Fortran installation package.
Posted in Developer Resources | Tags: Compilers, Fortran, NVIDIA CUDA | Write a comment
September 29th, 2009
A public beta release of the CUDA-enabled Fortran Compiler from PGI enables programmers to write code in Fortran for NVIDIA CUDA GPUs. From a press release:
What: NVIDIA today announced that a public beta release of the PGI® CUDA-enabled Fortran compiler is now available. Developed in collaboration with The Portland Group®, it is the first Fortran compiler compatible with NVIDIA® CUDA™-enabled graphics processing units (GPUs).
A compiler is a software tool that translates applications from the high-level programming languages used by software developers into a binary form a computer can execute.
Why: GPU computing with the CUDA C-compiler has gained significant momentum in the High-Performance Computing (HPC) space as it enables developers to get transformative increases in performance with minimal coding required.
Fortran is particularly well suited to numeric computation and scientific computing and remains widely used in a wide range of applications such as weather modeling, computational fluid dynamics and seismic processing.
Where can I get it?: Read the rest of this entry »
Posted in Developer Resources, Press | Tags: Compilers, Fortran, NVIDIA CUDA, Programming Languages | 2 Comments