dc.contributor.advisor | Bhatnagar, Shalabh | |
dc.contributor.author | Vivek, V P | |
dc.date.accessioned | 2024-10-07T04:43:11Z | |
dc.date.available | 2024-10-07T04:43:11Z | |
dc.date.submitted | 2024 | |
dc.identifier.uri | https://etd.iisc.ac.in/handle/2005/6644 | |
dc.description.abstract | In this thesis, we study sequential decision-making under uncertainty in the context of smart grids using reinforcement learning. The underlying mathematical model for reinforcement learning algorithms is the Markov Decision Process. A smart grid is, in essence, a framework for efficient electricity management using various technologies. We consider different models of smart grids involving single and multiple decision-making agents. We then develop reinforcement learning algorithms that can be applied to these models for efficient energy management, and we rigorously prove the convergence and stability of these algorithms. We demonstrate the efficiency of these algorithms on the smart grid models we consider. Additionally, we run these algorithms on different randomly generated Markov Decision Processes to establish their correctness and convergence.
We now give a brief description of the studies presented in this thesis.
1. Finite Horizon Q-learning algorithm for smart grids
In this study, we develop a smart grid model comprising different components: a main grid and a microgrid with a battery, renewable energy sources, and a microcontroller. We then formulate the energy management problem in this model as a finite horizon Markov Decision Process. To address the complex decision-making involved, we develop a finite horizon Q-learning algorithm, apply it to our model, and demonstrate its performance. Additionally, we give a rigorous mathematical proof establishing the stability and convergence of the algorithm; the analysis is based entirely on the ordinary differential equation (ODE) method. We also demonstrate the performance of the algorithm on different randomly generated Markov Decision Processes.
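To make the stage-wise structure concrete, here is a minimal tabular sketch of finite horizon Q-learning in Python. It is an illustration only: the environment interface (`env.reset()`, `env.step(h, s, a)`), the epsilon-greedy exploration, and the constant step size are assumptions for this sketch, not the interface used in the thesis.

```python
import numpy as np

# Minimal tabular finite horizon Q-learning sketch (illustrative; the
# simulator interface and exploration scheme are assumptions).
def finite_horizon_q_learning(env, n_states, n_actions, H,
                              episodes=10_000, alpha=0.1, eps=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # One Q-table per stage h = 0, ..., H-1; the value beyond stage H is zero.
    Q = np.zeros((H, n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()
        for h in range(H):
            # epsilon-greedy action selection at stage h
            a = (rng.integers(n_actions) if rng.random() < eps
                 else int(np.argmax(Q[h, s])))
            s_next, r = env.step(h, s, a)
            # Backup against the next stage's Q-table (terminal value 0).
            target = r + (np.max(Q[h + 1, s_next]) if h + 1 < H else 0.0)
            Q[h, s, a] += alpha * (target - Q[h, s, a])
            s = s_next
    return Q
```

Note that, unlike the infinite horizon case, a separate Q-table is learned for each stage, since the optimal policy of a finite horizon problem is in general non-stationary.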
2. Finite Horizon Minimax Q-learning algorithm for smart grids
In this work, we develop a comprehensive smart grid model that takes into account the competition between two microgrids, each with a battery, renewable energy sources, and a microcontroller. Stochastic games are an important framework for capturing such competitive environments: they extend the Markov Decision Process to multiple decision makers, and can equally be viewed as an extension of games with a state component. We model the competition between the two microgrids as a finite horizon stochastic game in which the interaction unfolds over a finite number of stages, and we aim to compute the equilibrium of this competitive interaction. To this end, the minimax concept is used to capture the stage-wise interaction, and we develop a finite horizon minimax Q-learning algorithm to compute the long-term equilibrium of the competition between the two microgrids. The performance of the algorithm is demonstrated on the smart grid setup, and its correctness and convergence are further demonstrated on randomly generated stochastic games. A rigorous mathematical proof of the stability and convergence of the algorithm is also given.
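As an illustration of the stage-wise minimax backup, the sketch below updates one entry of a stage-h Q-table for a two-player zero-sum stochastic game, computing the value of the induced matrix game by linear programming. The data layout and the `linprog` formulation are assumptions made for this sketch, not necessarily the thesis's implementation.

```python
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(M):
    """Value of the zero-sum matrix game M (row player maximizes) via LP."""
    m, n = M.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0                               # minimize -v, i.e. maximize v
    A_ub = np.hstack([-M.T, np.ones((n, 1))])  # v <= x^T M[:, j] for every column j
    A_eq = np.ones((1, m + 1))
    A_eq[0, -1] = 0.0                          # mixed strategy x sums to 1
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return res.x[-1]

def minimax_q_update(Q, h, s, a, b, r, s_next, H, alpha=0.1):
    """One asynchronous minimax Q-learning update at stage h.

    Q is a list of H arrays, each of shape (n_states, n_a, n_b).
    """
    # Beyond the horizon the value is zero; otherwise solve the one-shot
    # matrix game defined by the next stage's Q-values at s_next.
    v_next = matrix_game_value(Q[h + 1][s_next]) if h + 1 < H else 0.0
    Q[h][s, a, b] += alpha * (r + v_next - Q[h][s, a, b])
```

The linear program replaces the simple max of single-agent Q-learning with the minimax value of the stage game, which is what encodes the equilibrium behaviour of the two competing microgrids.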
3. Finite Horizon SOR Q-learning
In this final part of our study, we propose a generalization of the finite horizon problem that incorporates discounting, together with an improved finite horizon Q-learning algorithm for this setting. The rate of convergence of a reinforcement learning algorithm is an important measure of its performance, and several techniques have been used in the literature to improve it. One of them is successive over-relaxation (SOR), originally used in numerical linear algebra to speed up the Gauss-Seidel iterative scheme for solving linear systems of equations. We apply this technique to finite horizon Q-learning for discounted problems and obtain an algorithm with better asymptotic performance.
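The following sketch illustrates the flavour of an SOR-style update in a discounted finite horizon setting: the sampled Bellman backup is blended with the current stage's own greedy value through a relaxation factor w. This stage-wise form and the choice of w are assumptions made for illustration; the exact update rule and the admissible range of w are as derived in the thesis.

```python
import numpy as np

def sor_q_update(Q, h, s, a, r, s_next, H, gamma=0.9, alpha=0.1, w=1.2):
    """One SOR-style Q-learning update at stage h (illustrative sketch).

    Q is a list of H arrays, each of shape (n_states, n_actions).
    """
    v_next = np.max(Q[h + 1][s_next]) if h + 1 < H else 0.0
    # Relaxed target: blend the sampled Bellman backup with the current
    # state's own greedy value, as in successive over-relaxation.
    target = w * (r + gamma * v_next) + (1.0 - w) * np.max(Q[h][s])
    Q[h][s, a] += alpha * (target - Q[h][s, a])
```

With w = 1 this reduces to the standard finite horizon Q-learning update; a suitably chosen relaxation factor is what yields the improved asymptotic performance. | en_US |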
dc.language.iso | en_US | en_US |
dc.relation.ispartofseries | ;ET00654 | |
dc.rights | I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation | en_US |
dc.subject | Reinforcement Learning | en_US |
dc.subject | Smart grids | en_US |
dc.subject | Markov Decision Processes | en_US |
dc.subject | Q-learning algorithm | en_US |
dc.subject.classification | Research Subject Categories::TECHNOLOGY::Information technology::Computer science | en_US |
dc.title | Single and Multi-Agent Finite Horizon Reinforcement Learning Algorithms for Smart Grids | en_US |
dc.type | Thesis | en_US |
dc.degree.name | PhD | en_US |
dc.degree.level | Doctoral | en_US |
dc.degree.grantor | Indian Institute of Science | en_US |
dc.degree.discipline | Engineering | en_US |