Prediction based load balancing heuristic for a heterogeneous cluster
Load balancing has long been a topic of interest in both academia and industry, mainly because of the scope for performance improvement that it offers in many parallel and distributed processing environments. Among the many approaches that have been used to solve the load balancing problem, very few use prediction of code execution times. We believe this is because the field of code prediction is still in its infancy. As of this writing, we are not aware of any prediction-based load balancing approach that predicts code execution times while relying on neither user-supplied information nor an off-line prediction step whose results are then used at run time. In this context, it is important to note that prior studies have indicated the feasibility of predicting the CPU requirements of general application programs. Our motivation in studying prediction-based load balancing is to determine the feasibility of the approach. The reasoning is as follows: if prediction-based load balancing yields good performance, then it may be worthwhile to develop a predictor that can give a rough estimate of the length of the next CPU burst of each process. While high accuracy of the predictor is not essential, its computational overhead must be sufficiently small so as not to offset the gain from load balancing. As for the system, we assume a set of autonomous computers connected by a fast, shared medium. The individual nodes can vary in the additional hardware and software available on them. Further, we assume that the processes in the workload are sequential. The first step is to fix the parameters of our assumed predictor. Then, an algorithm that takes into account the characteristics of the predictor is proposed.
There are many trade-off decisions in the design of the algorithm, including certain steps in which we have relied on trial and error to find suitable values. The next logical step is to verify the efficiency of the algorithm. To assess its performance, we carry out event-driven simulation. We also evaluate the robustness of the algorithm with respect to the characteristics of the predictor. The contributions of the thesis are as follows. It proposes a load-balancing algorithm for a heterogeneous cluster of workstations connected by a fast network. The simulation assumes that the heterogeneity is limited to variability in processor clock rates, but the algorithm can be applied when the nodes have other types of heterogeneity as well. The algorithm uses predictions of CPU burst lengths as its basic input. The performance of the algorithm is evaluated through event-driven simulation using assumed workload distributions. The results of the simulation show that the algorithm yields a good improvement in response times over the scenario in which no load redistribution is done.
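
To make the idea of using predicted CPU burst lengths on nodes with different clock rates concrete, the following is a minimal illustrative sketch in Python. It is not the algorithm proposed in the thesis. The predictor shown is plain exponential averaging, used here only as a textbook stand-in for the assumed predictor, and the placement rule (send the burst to the node with the earliest predicted finish time) is likewise an assumption made for illustration. All names (`Node`, `predict_next_burst`, `place`, `alpha`, `queued_work`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A workstation. clock_rate is a relative speed (assumption: 1.0 = baseline)."""
    name: str
    clock_rate: float
    queued_work: float = 0.0  # predicted CPU work already assigned, in baseline seconds

    def predicted_finish(self, burst: float) -> float:
        # Time to drain the existing queue plus the new burst at this node's speed.
        return (self.queued_work + burst) / self.clock_rate


def predict_next_burst(previous_estimate: float, last_burst: float,
                       alpha: float = 0.5) -> float:
    """Exponential averaging: a stand-in for the assumed predictor, not the thesis's."""
    return alpha * last_burst + (1.0 - alpha) * previous_estimate


def place(nodes: list[Node], burst: float) -> Node:
    """Assign a predicted burst to the node with the earliest predicted finish time."""
    best = min(nodes, key=lambda n: n.predicted_finish(burst))
    best.queued_work += burst
    return best
```

With two nodes whose clock rates are 1.0 and 2.0, a predicted burst of 6.0 baseline seconds would go to the faster node under this rule, since it would finish there in 3.0 seconds rather than 6.0. A rough predictor suffices for such a comparison, which is consistent with the observation above that high predictor accuracy is not essential.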