Show simple item record

dc.contributor.advisorSubramani, Deepak N
dc.contributor.authorChowdhury, Rohit
dc.date.accessioned2024-08-07T04:36:58Z
dc.date.available2024-08-07T04:36:58Z
dc.date.submitted2024
dc.identifier.urihttps://etd.iisc.ac.in/handle/2005/6587
dc.description.abstractAutonomous marine agents play a pivotal role in diverse ocean applications. These agents serve as indispensable instruments for acquiring crucial environmental information. They are used to explore and monitor harsh environments, e.g., to map ocean topography, study coral reefs, search and rescue, structural monitoring of oil and gas installations, etc. In naval security, these agents are used for the surveillance and strategic monitoring of maritime activities. Consequently, the development of robust techniques that allow these agents to navigate uncertain environments optimally is essential to reduce operational costs. The path planning problem is as follows. An autonomous marine agent must optimally traverse from a given start location to a given target location in a stochastic dynamic velocity field, such as ocean currents, while avoiding obstacles or restricted regions in the flow. A key challenge is that the agent is heavily advected by the flow. The optimality may refer to minimizing expected travel time or energy consumption, data collection, or risk of failure. The goal of this thesis is to develop efficient, fast, and scalable algorithms for optimal planning and on-board routing for autonomous marine agents in these stochastic dynamic environments. The thesis comprises five works organized into two parts based on the solution approach. In the first part, we model the path planning problem as a Markov Decision Process (MDP) and compute an optimal policy. However, the key challenge here is that solving an MDP can be prohibitively expensive for large state and action spaces. To overcome this challenge, we either approximate the optimal policy or accelerate the computation using GPUs. Physics-driven Q-learning for onboard routing: We develop a physics-driven Q-learning method for approximating the optimal policy and update it in-mission for onboard routing. This involves learning an initial action-value function from a distribution of exact time-optimal paths predicted by Hamilton-Jacobi level set partial differential equations (HJLS PDEs). The flow data collected by onboard sensors is utilized to get a posterior estimate of the environment. The approximate initial optimal policy is refined in-mission by performing epsilon greedy Q-learning in simulated posterior environments. GPU-accelerated path planning: We introduce an efficient end-to-end GPU accelerated algorithm that builds the MDP model (computing transition probabilities and expected one-step rewards) and solves the MDP to compute an exact optimal policy. We develop methodical and algorithmic solutions to overcome the limited global memory of GPUs by using a dynamic reduced-order representation of the ocean flows, leveraging the sparse nature of the state transition probability matrix, and introducing a neighboring subgrid concept to save memory and reduce the computational effort. We achieve significant speedups compared to conventional sequential computation. Multi-objective GPU-accelerated path planning: We extend our GPU-accelerated plan- ner to a multi-objective path planner to solve multi-objective optimization problems in path planning, like minimizing both the expected mission completion time and energy consumption. MDPs are modeled with scalarized rewards for multiple objectives. The solver enables us to compute optimal operating curves in a fraction of the time compared to traditional solvers and analyze complex multi-objective missions. In the second part, we convert the optimal path planning problem into a supervised learning problem via sequence modeling. This approach involves learning optimal action sequences based on available environment information and expert trajectories. It also avoids the issue of re-computing optimal policies for onboard routing. Intelligent onboard routing using decision transformers: We develop a novel deep learning method based on the decision transformer (decoder-only model) for onboard routing of autonomous marine agents. Training data is obtained from aforementioned HJLS PDE or MDP solvers, which is further processed to sequences of states, actions, and returns. The model is autoregressively trained on these sequences and then tested in different environment settings. We demonstrate that (i) a trained agent learns to infer the surrounding flow and perform optimal onboard routing when the agent’s state estimation is accurate, and (ii) a trained agent is robust to limited noise in state transitions and is capable of reaching target locations in completely new flow scenarios. Path planning with environment encoders and action decoders: We propose a path plan- ning translation transformer model (encoder-decoder). We model the problem as a sequence- to-sequence translation task, where the source sequence is the agent’s knowledge representation of the uncertain environmental flow. The target sequence is the optimal sequence of actions the agent must execute. We demonstrate that a trained transformer model can predict near-optimal paths for unseen flow realizations and obstacle configurations in a fraction of the time required by traditional planners. Validation is performed to show generalization in unseen obstacle configurations. We also analyze the predictions of both transformer models, viz, decoder only and encoder-decoder, and explain the inner mechanics of learning through a novel visualization of self-attention of actions on the predicted trajectories.en_US
dc.description.sponsorshipPMRF; Department of Science and Technology for the INSPIRE Grant DST/INSPIRE/04/2018/003591en_US
dc.language.isoen_USen_US
dc.relation.ispartofseries;ET00597
dc.rightsI grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertationen_US
dc.subjectPath Planningen_US
dc.subjectOnboard Routingen_US
dc.subjectSequence Modelingen_US
dc.subjectTransformersen_US
dc.subjectGPU-accelerated algorithmsen_US
dc.subjectAutonomous Marine Agentsen_US
dc.subjectStochastic Dynamic Environmentsen_US
dc.subjectMarkov Decision Processesen_US
dc.subjectQ-Learningen_US
dc.subjectDynamically Orthogonal Decompositionen_US
dc.subject.classificationResearch Subject Categories::INTERDISCIPLINARY RESEARCH AREASen_US
dc.subject.classificationResearch Subject Categories::TECHNOLOGY::Information technologyen_US
dc.titleFast and Scalable Algorithms for Intelligent Routing of Autonomous Marine Agentsen_US
dc.typeThesisen_US
dc.degree.namePhDen_US
dc.degree.levelDoctoralen_US
dc.degree.grantorIndian Institute of Scienceen_US
dc.degree.disciplineEngineeringen_US


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record