Single and Multi-player Stochastic Dynamic Optimization

Saha, Subhamay

View/Open

G25755.pdf (563.1Kb)

Date

2018-04-06

Abstract

In this thesis we investigate single and multi-player stochastic dynamic optimization problems. We consider both discrete and continuous time processes. In the multi-player setup we investigate zero-sum games with both complete and partial information. We study partially observable stochastic games under the average cost criterion, where the state process is a discrete time controlled Markov chain. The idea involved in studying this problem is to replace the original unobservable state variable with a suitable completely observable state variable. We establish the existence of the value of the game and also obtain optimal strategies for both players. We also study a continuous time zero-sum stochastic game with complete observation. In this case the state is a pure jump Markov process. We investigate the finite horizon total cost criterion. We characterise the value function via appropriate Isaacs equations. This also yields optimal Markov strategies for both players. In the single player setup we investigate risk-sensitive control of continuous time Markov chains. We consider both finite and infinite horizon problems. For the finite horizon total cost problem and the infinite horizon discounted cost problem we characterise the value function as the unique solution of appropriate Hamilton-Jacobi-Bellman equations. We also derive optimal Markov controls in both cases. For the infinite horizon average cost case we show the existence of an optimal stationary control. We also give a value iteration scheme for computing the optimal control in the case of finite state and action spaces.
Further, we introduce a new class of stochastic processes which we call stochastic processes with "age-dependent transition rates". We give a rigorous construction of the process. We prove that under certain assumptions the process is Feller. We also compute the limiting probabilities for our process. We then study the controlled version of the above process. In this case we take the risk-neutral cost criterion. We solve the infinite horizon discounted cost problem and the average cost problem for this process. The crucial step in analysing these problems is to prove that the original control problem is equivalent to an appropriate semi-Markov decision problem. The value functions and optimal controls are then characterised using this equivalence and the theory of semi-Markov decision processes (SMDP). The analysis of finite horizon problems differs from that of infinite horizon problems because, in this case, the idea of converting to an equivalent SMDP does not seem to work. So we deal with the finite horizon total cost problem by showing that our problem is equivalent to another appropriately defined discrete time Markov decision problem. This allows us to characterise the value function and to find an optimal Markov control.
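For readers unfamiliar with value iteration on continuous time Markov chains, the following is a minimal sketch of the standard risk-neutral, discounted-cost version on finite state and action spaces, computed by uniformization. It is not the risk-sensitive average cost scheme developed in the thesis, which the abstract does not specify. The function name, the array layout (`Q` of shape actions x states x states, `c` of shape states x actions) and the tolerance are assumptions made for illustration only.

```python
import numpy as np


def discounted_value_iteration(Q, c, alpha, tol=1e-10, max_iter=100_000):
    """Illustrative sketch (not the thesis's scheme): risk-neutral discounted
    cost value iteration for a finite controlled continuous time Markov chain.

    Q     : array (A, S, S); Q[a] is a generator matrix (rows sum to zero).
    c     : array (S, A); running cost rate in state i under action a.
    alpha : discount rate, alpha > 0.
    """
    A, S, _ = Q.shape
    idx = np.arange(S)
    # Uniformization rate: an upper bound on all exit rates -Q[a, i, i].
    lam = max(np.max(-Q[:, idx, idx]), 1e-12)
    # Transition matrices of the uniformized discrete time chain.
    P = np.eye(S)[None, :, :] + Q / lam
    V = np.zeros(S)
    for _ in range(max_iter):
        # cand[i, a] = (c(i, a) + lam * sum_j P_a(i, j) V(j)) / (alpha + lam)
        cand = (c + lam * np.einsum("asj,j->sa", P, V)) / (alpha + lam)
        V_new = cand.min(axis=1)
        done = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if done:
            break
    policy = cand.argmin(axis=1)  # minimizing action in each state
    return V, policy
```

The update is a contraction with modulus lam / (alpha + lam), and its fixed point satisfies alpha V(i) = min over a of [c(i, a) + sum over j of Q[a, i, j] V(j)], the dynamic programming equation for this simplified risk-neutral setting.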

URI

https://etd.iisc.ac.in/handle/2005/3357

Collections

Mathematics (MA) [153]