Long-Running Multi-Component Climate Applications On Grids
Sundari, Sivagama M
Climate science, or climatology, is the scientific study of the earth’s climate, where climate denotes weather conditions averaged over a period of time. Climate models are mathematical models used to quantitatively describe, simulate and study the interactions among the components of the climate system: atmosphere, ocean, land and sea ice. CCSM (Community Climate System Model) is a state-of-the-art climate model and a long-running, coupled, multi-component parallel application that combines component models for simulating the components of the climate system. Each component model is a large-scale parallel application, and the parallel components exchange climate data through a specialized component called the coupler. Typical multi-century climate simulations using CCSM take several weeks or months to execute on most parallel systems. In this thesis, we study the applicability of a computational grid for effective execution of long-running coupled multi-component climate applications like CCSM. Initial studies of the application characteristics led us to develop a dynamic component extension strategy for temporal inter-component load balancing. Through experiments on different parallel platforms with different numbers of processors, we showed that our strategy can reduce CCSM execution times by about 15%, saving several days for 1000-year simulation runs. Our initial studies also indicated that, unlike typical grid applications, CCSM has limited scalability to very large numbers of processors and hence cannot directly benefit from the large number of processors available on a computational grid. However, its long-running nature, together with the execution time limits imposed on jobs by most multi-user batch queueing systems, led us to investigate the benefits of executing it on a grid of batch systems. 
The idea is that multiple batch queues can improve the processor availability rate with respect to the application, thereby possibly improving its effective throughput. We explored this idea in detail with simulation studies involving various system and application characteristics and execution models. By conducting a large number of simulations with different workload characteristics and queueing policies of the systems, processor allocations to the components of the application, distributions of the components across the batch systems, and inter-cluster bandwidths, we showed that multiple batch executions lead to an average increase in throughput of up to 55% over single batch executions for long-running CCSM. Having established these potential performance advantages, we then constructed an application-level middleware framework. Our framework supports long-duration execution of multi-component applications spanning multiple submissions to queues on multiple batch systems. It coordinates the distribution, execution, rescheduling, migration and restart of the application components across resources at different sites. It also addresses challenges including execution time limits for jobs and differences in job startup times among components. Further, within the framework, we developed robust rescheduling policies that decide when and where to reschedule the components onto the available resources based on the application execution characteristics and queue dynamics. Our grid middleware framework resulted in multi-site executions that provided larger application throughput than the single-site executions typically performed by climate scientists, and also removed the bottlenecks associated with single-system execution. We used this framework for long-running executions of CCSM to study the effect of increased black carbon aerosols and dust aerosols on the Indian monsoons. 
Black carbon aerosols are essentially of anthropogenic origin and result from improper burning of fossil fuels, whereas dust is a naturally occurring aerosol. The concentrations of both these aerosols are high over the Indian region. We studied the impact of these aerosols on precipitation and sea surface temperature (SST) through multi-decadal simulations conducted with our grid-enabled climate system model. Our observations indicated that increasing the concentrations of aerosols leads to an increase in precipitation in the central and eastern parts of India, and a decrease in SST over most of the Indian Ocean.