• Login
    View Item 
    •   etd@IISc
    • Division of Electrical, Electronics, and Computer Science (EECS)
    • Computer Science and Automation (CSA)
    • View Item
    •   etd@IISc
    • Division of Electrical, Electronics, and Computer Science (EECS)
    • Computer Science and Automation (CSA)
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    A Study of Thompson Sampling Approach for the Sleeping Multi-Armed Bandit Problem

    View/Open
    G28478.pdf (2.053Mb)
    Date
    2018-05-29
    Author
    Chatterjee, Aritra
    Metadata
    Show full item record
    Abstract
    The multi-armed bandit (MAB) problem provides a convenient abstraction for many online decision problems arising in modern applications including Internet display advertising, crowdsourcing, online procurement, smart grids, etc. Several variants of the MAB problem have been proposed to extend the basic model to a variety of practical and general settings. The sleeping multi-armed bandit (SMAB) problem is one such variant where the set of available arms varies with time. This study is focused on analyzing the efficacy of the Thompson Sampling algorithm for solving the SMAB problem. Any algorithm for the classical MAB problem is expected to choose one of K available arms (actions) in each of T consecutive rounds. Each choice of an arm generates a stochastic reward from an unknown but fixed distribution. The goal of the algorithm is to maximize the expected sum of rewards over the T rounds (or equivalently minimize the expected total regret), relative to the best fixed action in hindsight. In many real-world settings, however, not all arms may be available in any given round. For example, in Internet display advertising, some advertisers might choose to stay away from the auction due to budget constraints; in crowdsourcing, some workers may not be available at a given time due to timezone difference, etc. Such situations give rise to the sleeping MAB abstraction. In the literature, several upper confidence bound (UCB)-based approaches have been proposed and investigated for the SMAB problem. Our contribution is to investigate the efficacy of a Thomp-son Sampling-based approach. Our key finding is to establish a logarithmic regret bound, which non-trivially generalizes a similar bound known for this approach in the classical MAB setting. Our bound also matches (up to constants) the best-known lower bound for the SMAB problem. Furthermore, we show via detailed simulations, that the Thompson Sampling approach in fact outperforms the known algorithms for the SMAB problem.
    URI
    https://etd.iisc.ac.in/handle/2005/3631
    Collections
    • Computer Science and Automation (CSA) [392]

    Related items

    Showing items related by title, author, creator and subject.

    • New Methods for Learning from Heterogeneous and Strategic Agents 

      Divya, Padmanabhan (2018-05-21)
      1 Introduction In this doctoral thesis, we address several representative problems that arise in the context of learning from multiple heterogeneous agents. These problems are relevant to many modern applications such as ...
    • Reinforcement Learning in Large and Structured Environments 

      Ray Chowdhury, Sayak
      In a reinforcement learning (RL) problem, a learner takes actions to control the state of an initially unknown environment so as to maximize the sum of the rewards he obtains. This has several applications in many practical ...
    • Algorithms for Social Good in Online Platforms with Guarantees on Honest Participation and Fairness 

      Ghalme, Ganesh Sambhaji
      Recent decades have seen a revolution in the way people communicate, buy products, learn new things, and share life experiences. This has spurred the growth of online platforms that enable users from all over the globe to ...

    etd@IISc is a joint service of SERC & J R D Tata Memorial (JRDTML) Library || Powered by DSpace software || DuraSpace
    Contact Us | Send Feedback | Thesis Templates
    Theme by 
    Atmire NV
     

     

    Browse

    All of etd@IIScCommunities & CollectionsTitlesAuthorsAdvisorsSubjectsBy Thesis Submission DateThis CollectionTitlesAuthorsAdvisorsSubjectsBy Thesis Submission Date

    My Account

    LoginRegister

    etd@IISc is a joint service of SERC & J R D Tata Memorial (JRDTML) Library || Powered by DSpace software || DuraSpace
    Contact Us | Send Feedback | Thesis Templates
    Theme by 
    Atmire NV