Algorithms For Stochastic Games And Service Systems

Prasad, H L

dc.contributor.advisor	Bhatnagar, Shalabh
dc.contributor.author	Prasad, H L
dc.date.accessioned	2014-04-23T06:02:50Z
dc.date.accessioned	2018-07-31T04:38:26Z
dc.date.available	2014-04-23T06:02:50Z
dc.date.available	2018-07-31T04:38:26Z
dc.date.issued	2014-04-23
dc.date.submitted	2012
dc.identifier.uri	https://etd.iisc.ac.in/handle/2005/2301
dc.identifier.abstract	https://etd.iisc.ac.in/static/etd/abstracts/2961/G25466-Abs.pdf	en_US
dc.description.abstract	This thesis is organized into two parts, one for my main area of research in the field of stochastic games, and the other for my contributions in the area of service systems. We first provide an abstract for my work in stochastic games. The field of stochastic games has been actively pursued over the last seven decades because of several of its important applications in oligopolistic economics. In the past, zero-sum stochastic games have been modelled and solved for Nash equilibria using the standard techniques of Markov decision processes. General-sum stochastic games on the contrary have posed difficulty as they cannot be reduced to Markov decision processes. Over the past few decades the quest for algorithms to compute Nash equilibria in general-sum stochastic games has intensified and several important algorithms such as stochastic tracing procedure [Herings and Peeters, 2004], NashQ [Hu and Wellman, 2003], FFQ [Littman, 2001], etc., and their generalised representations such as the optimization problem formulations for various reward structures [Filar and Vrieze, 1997] have been proposed. However, they suffer from either lack of generality or are intractable for even medium sized problems or both. In our venture towards algorithms for stochastic games, we start with a non-linear optimization problem and then design a simple gradient descent procedure for the same. Though this procedure gives the Nash equilibrium for a sample problem of terrain exploration, we observe that, in general, it need not be true. We characterize the necessary conditions and define KKT-N point. KKT-N points are those Karush-Kuhn-Tucker (KKT) points which corresponding to Nash equilibria. Thus, for a simple gradient based algorithm to guarantee convergence to Nash equilibrium, all KKT points of the optimization problem need to be KKT-N points, which restricts the applicability of such algorithms. We then take a step back and start looking at better characterization of those points of the optimization problem which correspond to Nash equilibria of the underlying game. As a result of this exploration, we derive two sets of necessary and sufficient conditions. The first set, KKT-SP conditions, is inspired from KKT conditions itself and is obtained by breaking down the main optimization problem into several sub-problems and then applying KKT conditions to each one of those sub-problems. The second set, SG-SP conditions, is a simplified set of conditions which characterize those Nash points more compactly. Using both KKT-SP and SG-SP conditions, we propose three algorithms, OFF-SGSP, ON-SGSP and DON-SGSP, respectively, which we show provide Nash equilibrium strategies for general-sum discounted stochastic games. Here OFF-SGSP is an off-line algorithm while ONSGSP and DON-SGSP are on-line algorithms. In particular, we believe that DON-SGSP is the first decentralized on-line algorithm for general-sum discounted stochastic games. We show that both our on-line algorithms are computationally efficient. In fact, we show that DON-SGSP is not only applicable for multi-agent scenarios but is also directly applicable for the single-agent case, i.e., MDPs (Markov Decision Processes). The second part of the thesis focuses on formulating and solving the problem of minimizing the labour-cost in service systems. We define the setting of service systems and then model the labour-cost problem as a constrained discrete parameter Markov-cost process. This Markov process is parametrized by the number of workers in various shifts and with various skill levels. With the number of workers as optimization variables, we provide a detailed formulation of a constrained optimization problem where the objective is the expected long-run averages of the single-stage labour-costs, and the main set of constraints are the expected long-run average of aggregate SLAs (Service Level Agreements). For this constrained optimization problem, we provide two stochastic optimization algorithms, SASOC-SF-N and SASOC-SF-C, which use smoothed functional approaches to estimate gradient and perform gradient descent in the aforementioned constrained optimization problem. SASOC-SF-N uses Gaussian distribution for smoothing while SASOC-SF-C uses Cauchy distribution for the same. SASOC-SF-C is the first Cauchy based smoothing algorithm which requires a fixed number (two) of simulations independent of the number of optimization variables. We show that these algorithms provide an order of magnitude better performance than existing industrial standard tool, OptQuest. We also show that SASOC-SF-C gives overall better performance.	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	G25466	en_US
dc.subject	Algorithms	en_US
dc.subject	Stochastic Games	en_US
dc.subject	Stochastic Games - Algorithms	en_US
dc.subject	Nash Equilibrium Computation	en_US
dc.subject	Gradient Descent Schemes	en_US
dc.subject	Markov Decision Processes	en_US
dc.subject	Service Systems - Labour Costs - Modelling	en_US
dc.subject	Labour Staffing Optimization - Algorithms	en_US
dc.subject	Markov Cost Process	en_US
dc.subject	Labour Costs - Modelling	en_US
dc.subject	Labor Cost Optimization	en_US
dc.subject	Nash Equilibria	en_US
dc.subject.classification	Game Theory	en_US
dc.title	Algorithms For Stochastic Games And Service Systems	en_US
dc.type	Thesis	en_US
dc.degree.name	PhD	en_US
dc.degree.level	Doctoral	en_US
dc.degree.discipline	Faculty of Engineering	en_US

Files in this item

Name:: G25466.pdf
Size:: 1.551Mb
Format:: PDF

View/Open

This item appears in the following Collection(s)

Computer Science and Automation (CSA) [545]

Show simple item record