Show simple item record

dc.contributor.advisor: Vadhiyar, Sathish
dc.contributor.author: Raghavan, Hari K
dc.date.accessioned: 2016-11-16T15:30:43Z
dc.date.accessioned: 2018-07-31T05:09:09Z
dc.date.available: 2016-11-16T15:30:43Z
dc.date.available: 2018-07-31T05:09:09Z
dc.date.issued: 2016-11-16
dc.date.submitted: 2012
dc.identifier.uri: https://etd.iisc.ac.in/handle/2005/2587
dc.identifier.abstract: http://etd.iisc.ac.in/static/etd/abstracts/3352/G25422-Abs.pdf [en_US]
dc.description.abstract: Adaptive Mesh Refinement (AMR) is a method that dynamically varies the spatio-temporal resolution of localized mesh regions in numerical simulations, based on the strength of the solution features. Because only localized regions of interest are discretized at high resolution, into rectangular mesh units called patches, AMR achieves a high degree of accuracy at low computational cost. General-purpose graphics processing units (GPGPUs), with their support for fine-grained parallelism, offer an attractive option for obtaining high performance for AMR applications. The data-parallel computations of the finite difference schemes of AMR can be performed efficiently on GPGPUs. This research addresses the challenges of, and develops techniques for, efficient execution of AMR applications with uniform and non-uniform patches on GPUs. In the first part of the thesis, we optimize an AMR model with uniform patches. We have developed strategies for continuous online visualization of time-evolving data for AMR applications executed on GPUs. In-situ visualization plays an important role in analyzing the time-evolving characteristics of the domain structures. Continuous visualization of the output data for various time steps allows better study of the underlying domain and of the model used to simulate it. We reorder the meshes for computation on the GPU based on the user's input about the subdomain to be visualized. This makes the data available for visualization at a faster rate. We then execute the visualization steps and the fix-up operations on the coarse meshes asynchronously on the CPUs while the GPU advances the solution. In experiments on Tesla S1070 and Fermi C2070 clusters, we found that our strategies result in up to 60% improvement in response time and 16% improvement in the rate of visualization of frames over the existing strategy of performing fix-ups and visualization at the end of the time steps.
The second part of the thesis deals with adaptive strategies for efficient execution of block-structured AMR applications with non-uniform patches on GPUs. Most AMR approaches use patches of uniform size over regions of interest. Because this leads to over-refinement, some efforts have focused on forming patches of non-uniform dimensions, which improves computational efficiency because the dimensions of a patch can be tuned to the geometry of a region of interest. While effective hybrid execution strategies exist for applications with uniform patches, our work considers efficient execution of non-uniform patches with different workloads. Our techniques include a geometric bin-packing method to load balance GPU computations and reduce thread idling, adaptive determination of the amount of work to maximize asynchronism between CPU and GPU executions using a knapsack formulation, and scheduling of communications for multi-GPU executions. We test our strategies on synthetic inputs as well as on traces from real applications. Our experiments on Tesla S1070 and Fermi C2070 clusters, with both single-GPU and multi-GPU executions, show that our strategies result in up to 69% improvement in performance over existing strategies. Our bin-packing-based load balancing gives performance gains of up to 39%, kernel optimizations give an improvement of up to 20%, and our strategies for adaptive asynchronism between CPU and GPU executions give performance improvements of up to 17% over default static asynchronous executions. [en_US]
dc.language.iso: en_US [en_US]
dc.relation.ispartofseries: G25422 [en_US]
dc.subject: Adaptive Mesh Refinement (AMR) [en_US]
dc.subject: Graphical Processing Units (GPUs) [en_US]
dc.subject: General Purpose Graphics Processing Units (GPGPUs) [en_US]
dc.subject: Adaptive Mesh Refinement Computations [en_US]
dc.subject: Graphical Processing Unit (GPU) [en_US]
dc.subject: AMR Computations [en_US]
dc.subject: GPU System [en_US]
dc.subject.classification: Computer Science [en_US]
dc.title: Efficient Execution Of AMR Computations On GPU Systems [en_US]
dc.type: Thesis [en_US]
dc.degree.name: MSc Engg [en_US]
dc.degree.level: Masters [en_US]
dc.degree.discipline: Faculty of Engineering [en_US]

