Sequential Controlled Sensing to Detect an Anomalous Process

Narayanaprasad, Karthik Periyapattana

dc.contributor.advisor	Sundaresan, Rajesh
dc.contributor.author	Narayanaprasad, Karthik Periyapattana
dc.date.accessioned	2021-11-16T06:09:25Z
dc.date.available	2021-11-16T06:09:25Z
dc.date.submitted	2021
dc.identifier.uri	https://etd.iisc.ac.in/handle/2005/5514
dc.description.abstract	In this thesis, we study the problem of identifying an anomalous arm in a multi-armed bandit as quickly as possible, subject to an upper bound on the error probability. Also known as odd arm identification, this problem falls within the class of optimal stopping problems in decision theory and can be embedded within the framework of active sequential hypothesis testing. Prior works on odd arm identification dealt with independent and identically distributed observations from each arm. We provide the first known extension to the case of Markov observations from each arm. Our analysis and results are in the asymptotic regime of vanishing error probability. We associate with each arm an ergodic discrete time Markov process that evolves on a common, finite state space. The notion of anomaly is that the transition probability matrix (TPM) of the Markov process of one of the arms (the {\em odd arm}) is some $P_{1}$, and that of each non-odd arm is a different $P_{2}$, $P_{2}\neq P_{1}$. A learning agent whose goal it is to find the odd arm samples the arms sequentially, one at any given time, and observes the state of the Markov process of the sampled arm. The Markov processes of the unobserved arms may either remain frozen at their last observed states until their next sampling instant ({\em rested arms}) or continue to undergo state evolution ({\em restless arms}). The TPMs $P_{1}$ and $P_{2}$ may be known to the learning agent beforehand or unknown. We analyse the following cases: (a) rested arms with TPMs unknown, (b) restless arms with TPMs known, and (c) restless arms with TPMs unknown. For each of these cases, we first present an asymptotic lower bound on the {\color{black} growth rate of the }expected time required to find the odd arm, and subsequently devise a sequential arm selection policy which we show achieves the lower bound and is therefore asymptotically optimal. A key ingredient in our analysis of the setting of rested arms is the observation that for the Markov process of each arm, the long term fraction of entries into a state is equal to the long term fraction of exits from the state (global balance). When the arms are restless, it is necessary for the learning agent to keep track of the time since each arm was last sampled (arm's {\em delay}) and the state of each arm when it was last sampled (arm's {\em last observed state}). We show that the arm delays and the last observed states form a controlled Markov process which is ergodic under any stationary arm selection policy that picks each arm with a strictly positive probability. Our approach of considering the delays and the last observed states of all the arms jointly offers a global perspective of the arms and serves as a `lift' from the local perspective of dealing with the delay and the last observed state of each arm separately, one that is suggested by the prior works. Lastly, when the TPMs are unknown and have to be estimated along the way, it is important to ensure that the estimates converge almost surely to their true values asymptotically, i.e., the system is {\em identifiable}. We show identifiability follows from the ergodic theorem in the rested case, and provide sufficient conditions for it in the restless case.	en_US
dc.description.sponsorship	Department of Science and Technology, Ministry of Human Resource Development, Center for Networked Intelligence, Robert Bosch Center for Cyber-Physical Systems	en_US
dc.language.iso	en_US	en_US
dc.rights	I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation	en_US
dc.subject	Multi-armed bandits	en_US
dc.subject	Sequential controlled sensing	en_US
dc.subject	Odd arm identification	en_US
dc.subject	Anomalous process	en_US
dc.subject	Markov chain	en_US
dc.subject	Controlled Markov chain	en_US
dc.subject	Markov decision problem	en_US
dc.subject	Information theory	en_US
dc.subject	Statistical learning theory	en_US
dc.subject	Detection and estimation theory	en_US
dc.subject.classification	Research Subject Categories::TECHNOLOGY::Electrical engineering, electronics and photonics::Electrical engineering	en_US
dc.title	Sequential Controlled Sensing to Detect an Anomalous Process	en_US
dc.type	Thesis	en_US
dc.degree.name	PhD	en_US
dc.degree.level	Doctoral	en_US
dc.degree.grantor	Indian Institute of Science	en_US
dc.degree.discipline	Engineering	en_US

Files in this item

Name:: main.pdf
Size:: 1.296Mb
Format:: PDF
Description:: Thesis full text

View/Open

This item appears in the following Collection(s)

Electrical Communication Engineering (ECE) [402]

Show simple item record