
    Novel Reinforcement Learning Algorithms and Applications to Hybrid Control Design Problems

    Author
    Gandhi, Meet
    Abstract
    The thesis is a compilation of two independent works.

    In the first work, we develop a novel weight assignment procedure that leads to several schedule-based algorithms. Learning the value function of a given policy from data samples is an important problem in Reinforcement Learning (RL). TD(λ) is a popular class of algorithms for solving this problem. However, in TD(λ) the weight assigned to the n-step returns decreases exponentially with increasing n. Here, we present a λ-schedule procedure that allows flexibility in assigning weights to the different n-step returns. Based on this procedure, we propose an on-policy algorithm, TD(λ)-schedule, and an off-policy algorithm, TDC(λ)-schedule. We provide proofs of almost sure convergence for both algorithms under a general Markov noise framework and present experimental results in which these algorithms show improved performance.

    In the second work, we design hybrid control policies for hybrid systems whose mathematical models are unknown. Our contributions here are threefold. First, we propose a framework for modelling the hybrid control design problem as a single Markov Decision Process (MDP). This result facilitates the application of off-the-shelf algorithms from the RL literature to the design of optimal control policies. Second, we model a set of benchmark examples of the hybrid control design problem in the proposed MDP framework. Third, we adapt the recently proposed Proximal Policy Optimisation (PPO) algorithm to the hybrid action space and apply it to the above set of problems. In each case, the algorithm is observed to converge and find the optimal policy.
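
The abstract's remark that TD(λ) weights the n-step returns with exponentially decreasing weight can be illustrated with a small sketch. It computes the weights of the standard λ-return for an episode that ends after a given number of steps; the function name `lambda_return_weights` and the parameter `horizon` are assumptions made for this illustration and do not come from the thesis. The λ-schedule procedure itself is not specified in the abstract and is not reproduced here.

```python
def lambda_return_weights(lam, horizon):
    """Weights that the standard lambda-return places on the n-step returns.

    For n = 1..horizon-1 the weight is (1 - lam) * lam**(n - 1); the final
    (full-episode) return absorbs the remaining mass lam**(horizon - 1),
    so the weights always sum to 1.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    # Geometric weights on the truncated n-step returns.
    weights = [(1.0 - lam) * lam ** (n - 1) for n in range(1, horizon)]
    # Leftover probability mass goes to the full return.
    weights.append(lam ** (horizon - 1))
    return weights


if __name__ == "__main__":
    # Hypothetical values chosen only to display the decay pattern.
    for n, w in enumerate(lambda_return_weights(0.8, 6), start=1):
        print(f"n={n}: weight={w:.4f}")
```

Each weight before the last is the previous one multiplied by λ, so a single parameter fixes the whole profile. A schedule that assigns weights more flexibly, as the abstract describes, would replace this fixed geometric pattern.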
    URI
    https://etd.iisc.ac.in/handle/2005/5183
    Collections
    • Computer Science and Automation (CSA) [392]

    etd@IISc is a joint service of SERC & J R D Tata Memorial (JRDTML) Library || Powered by DSpace software || DuraSpace
    Contact Us | Send Feedback | Thesis Templates
    Theme by 
    Atmire NV
     

     
