GPUs have recently evolved into very fast parallel coprocessors capable of executing general-purpose computations extremely efficiently. At the same time, multicore CPUs evolution continued and today’s CPUs have 4-8 cores. These two trends, however, have followed independent paths in the sense that we are aware of very few works that consider both devices cooperating to solve general computations. In this paper we investigate the coordinated use of CPU and GPU to improve efficiency of applications even further than using either device independently. We use Anthill runtime environment, a data-flow oriented framework in which applications are decomposed into a set of event-driven filters, where for each event, the runtime system can use either GPU or CPU for its processing. For evaluation, we use a histopathology application that uses image analysis techniques to classify tumor images for neuroblastoma prognosis. Our experimental environment includes dual and octa-core machines, augmented with GPUs and we evaluate our approach’s performance for standalone and distributed executions. Our experiments show that a pure GPU optimization of the application achieved a factor of 15 to 49 times improvement over the single-core CPU version, depending on the versions of the CPUs and GPUs. We also show that the execution can be further reduced by a factor of about 2 by using our runtime system that effectively choreographs the execution to run cooperatively both on GPU and on a single core of CPU. We improve on that by adding more cores, all of which were previously neglected or used ineffectively. In addition, the evaluation on a distributed environment has shown near linear scalability to multiple hosts.
(George Teodoro, Rafael Sachetto, Olcay Sertel, Metin Gurcan, Wagner Meira Jr., Umit Catalyurek, and Renato Ferreira. Coordinating the Use of GPU and CPU for Improving Performance of Compute Intensive Applications. IEEE Cluster 2009. New Orleans, LA, USA. Presentation. Paper.)