Show simple item record

dc.contributor.advisor  Kolathaya, Shishir N Y
dc.contributor.author  Ranjan, Abhishek
dc.date.accessioned  2024-07-12T06:46:55Z
dc.date.available  2024-07-12T06:46:55Z
dc.date.submitted  2024
dc.identifier.uri  https://etd.iisc.ac.in/handle/2005/6558
dc.description.abstract  Reinforcement Learning (RL) has progressed from simple control tasks to complex real-world challenges with large state spaces. During the initial iterations of training, most RL algorithms have agents perform a significant number of random exploratory steps, which limits the practicality of these algorithms in the real world, as such exploration can lead to potentially dangerous behaviour. Safe exploration is therefore a critical issue when applying RL algorithms in the real world. Although RL excels at solving such challenging problems, the time required for convergence during training remains a significant limitation. Various techniques have been proposed to mitigate this issue, and reward shaping has emerged as a popular solution. However, most existing reward-shaping methods rely on value functions, which can pose scalability challenges as the environment's complexity grows. Our research proposes a novel framework for reward shaping inspired by Barrier Functions, which is safety-oriented, intuitive, and easy to implement for any environment or task. To evaluate the effectiveness of our proposed reward formulations, we present results on a challenging Safe Reinforcement Learning benchmark, the OpenAI Safety Gym. We have conducted experiments on various environments, including CartPole, Half-Cheetah, Ant, and Humanoid. Our results demonstrate that our method leads to 1.4-2.8 times faster convergence and as little as 50-60% of the actuation effort compared to the vanilla reward. Moreover, our formulation has a theoretical basis for safety, which is crucial for real-world applications. In a sim-to-real experiment with the Go1 robot, we demonstrated better control and dynamics of the robot with our reward framework.  en_US
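The record does not include the thesis's actual reward formulation, but a minimal sketch of one common barrier-function-inspired shaping term can illustrate the idea: a relaxed log barrier applied to a safety margin h(s) (h > 0 means the state is safe), subtracted from the environment's base reward. The function names and the `delta`/`weight` parameters below are illustrative assumptions, not the thesis's method.

```python
import math

def relaxed_log_barrier(h, delta=0.1):
    # Relaxed log barrier: -log(h) while the safety margin h is
    # comfortably positive, with a quadratic extension below delta
    # so the penalty stays finite as h approaches zero or goes negative.
    if h > delta:
        return -math.log(h)
    # Quadratic branch matching the value and slope of -log(h) at h = delta.
    return 0.5 * (((h - 2 * delta) / delta) ** 2 - 1) - math.log(delta)

def shaped_reward(base_reward, h, weight=0.1):
    # Penalise the agent increasingly as it nears the boundary of the
    # safe set (h -> 0), leaving the reward nearly untouched deep inside it.
    return base_reward - weight * relaxed_log_barrier(h)
```

The quadratic extension is what makes the barrier usable during early random exploration: unlike a pure log barrier, it assigns a finite (if large) penalty to unsafe states instead of an undefined one.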
dc.language.iso  en_US  en_US
dc.relation.ispartofseries  ;ET00568
dc.rights  I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.  en_US
dc.subject  Reinforcement Learning  en_US
dc.subject  Robotics  en_US
dc.subject  Barrier Function  en_US
dc.subject  Deep Learning  en_US
dc.subject  Safe RL  en_US
dc.subject.classification  Research Subject Categories::TECHNOLOGY::Information technology::Computer science  en_US
dc.title  Barrier Function Inspired Reward Shaping in Reinforcement Learning  en_US
dc.type  Thesis  en_US
dc.degree.name  MTech (Res)  en_US
dc.degree.level  Masters  en_US
dc.degree.grantor  Indian Institute of Science  en_US
dc.degree.discipline  Engineering  en_US

