dc.contributor.advisor: Govindarajan, R
dc.contributor.author: Pandit, Prasanna Vasant
dc.date.accessioned: 2018-05-01T06:49:24Z
dc.date.accessioned: 2018-07-31T05:09:20Z
dc.date.available: 2018-05-01T06:49:24Z
dc.date.available: 2018-07-31T05:09:20Z
dc.date.issued: 2018-05-01
dc.date.submitted: 2013
dc.identifier.uri: https://etd.iisc.ac.in/handle/2005/3468
dc.identifier.abstract: http://etd.iisc.ac.in/static/etd/abstracts/4335/G25888-Abs.pdf [en_US]
dc.description.abstract: Computing systems have become heterogeneous with the increasing prevalence of multi-core CPUs, Graphics Processing Units (GPU) and other accelerators in them. OpenCL has emerged as an attractive programming framework for heterogeneous systems. However, utilizing multiple devices in OpenCL is a challenge as it requires the programmer to explicitly map data and computation to each device. Utilizing multiple devices simultaneously to speed up execution of a kernel is even more complex, as the relative execution time of the kernel on different devices can vary significantly. Also, after each kernel execution, a coherent version of the data needs to be established. This means that, in order to utilize all devices effectively, the programmer has to spend considerable time and effort to distribute work across all devices, keep track of modified data in these devices and correctly perform a merging step to put the data together. Further, the relative performance of a program may vary across different inputs, which means a statically determined work distribution may not work well. In this work, we present FluidiCL, an OpenCL runtime that takes a program written for a single device and uses multiple heterogeneous devices to execute each kernel. The runtime performs dynamic work distribution and cooperatively executes each kernel on all available devices. Since we consider a setup with devices having discrete address spaces, our solution ensures that execution of OpenCL work-groups on devices is adjusted by taking into account the overheads for data management. The data transfers and data merging needed to ensure coherence are handled transparently without requiring any effort from the programmer. FluidiCL also does not require prior training or profiling and is completely portable across different machines. Because it is dynamic, the runtime is able to adapt to system load.
We have developed several optimizations for improving the performance of FluidiCL. We evaluate the runtime across different sets of devices. On a machine with an Intel quad-core processor and an NVidia Fermi GPU, FluidiCL shows a geomean speedup of nearly 64% over the GPU, 88% over the CPU and 14% over the best of the two devices in each benchmark. In all benchmarks, performance of our runtime comes to within 13% of the best of the two devices. FluidiCL shows similar results on a machine with a quad-core CPU and an NVidia Kepler GPU, with up to 26% speedup over the best of the two. We also present results considering an Intel Xeon Phi accelerator and a CPU and find that FluidiCL performs up to 45% faster than the best of the two devices. We extend FluidiCL from a CPU–GPU scenario to a three-device setup having a quad-core CPU, an NVidia Kepler GPU and an Intel Xeon Phi accelerator and find that FluidiCL obtains a geomean improvement of 6% in kernel execution time over the best of the three devices considered in each case. [en_US]
dc.language.iso: en_US [en_US]
dc.relation.ispartofseries: G25888 [en_US]
dc.subject: Heterogeneous Computers [en_US]
dc.subject: Open Computing Language [en_US]
dc.subject: FluidiCL [en_US]
dc.subject: Fluidic Kernels [en_US]
dc.subject: OpenCL Application Programming Interface [en_US]
dc.subject: Graphics Processing Unit (GPU) [en_US]
dc.subject: Central Processing Unit (CPU) [en_US]
dc.subject: Computer Architecture [en_US]
dc.subject: FluidiCL Runtime [en_US]
dc.subject: Heterogeneous OpenCL Runtime [en_US]
dc.subject: OpenCL Programs [en_US]
dc.subject: CPU–GPU Systems [en_US]
dc.subject.classification: Computer Engineering [en_US]
dc.title: Cooperative Execution of OpenCL Programs on Multiple Heterogeneous Devices [en_US]
dc.type: Thesis [en_US]
dc.degree.name: MSc Engg [en_US]
dc.degree.level: Masters [en_US]
dc.degree.discipline: Faculty of Engineering [en_US]

