dc.contributor.advisor | Katewa, Vaibhav | |
dc.contributor.author | Rahul, N R | |
dc.date.accessioned | 2025-02-08T05:19:39Z | |
dc.date.available | 2025-02-08T05:19:39Z | |
dc.date.submitted | 2024 | |
dc.identifier.uri | https://etd.iisc.ac.in/handle/2005/6803 | |
dc.description.abstract | We consider a sequential multi-task problem, where each task is modeled as a stochastic multi-armed bandit with K arms. We study transfer learning in this setting and propose UCB-based algorithms that transfer reward samples from previous tasks to improve the total regret across all tasks. We consider two notions of similarity among tasks: (i) universal similarity and (ii) adjacent similarity. Under universal similarity, all tasks in the sequence are similar; under adjacent similarity, tasks close to one another in the sequence are more similar than those farther apart. We provide transfer algorithms and their regret upper bounds for both similarity notions and highlight the benefit of transfer. Our regret bounds show that performance improves as the sequential tasks become closer to each other. Finally, we provide empirical results for our algorithms, which show a performance improvement over the standard UCB algorithm without transfer. | en_US |
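Illustrative note (not part of the catalogued metadata): below is a minimal sketch of the idea summarized in the abstract, a UCB index policy whose per-arm statistics are warm-started with reward samples carried over from earlier tasks. The function name ucb_with_transfer, the pull(arm) interface, and the way transferred samples are pooled are assumptions made for illustration only; the thesis's own algorithms and analysis may differ.

    import numpy as np

    def ucb_with_transfer(pull, K, horizon, prior_rewards=None, c=2.0):
        # UCB1-style index policy warm-started with transferred reward samples
        # from earlier tasks. pull(arm) returns a stochastic reward in [0, 1];
        # prior_rewards[arm] is a list of rewards carried over for that arm.
        counts = np.zeros(K)   # samples per arm, including transferred ones
        sums = np.zeros(K)     # running reward sums per arm
        if prior_rewards is not None:
            for arm in range(K):
                counts[arm] += len(prior_rewards[arm])
                sums[arm] += sum(prior_rewards[arm])
        total = 0.0
        for t in range(1, horizon + 1):
            if np.any(counts == 0):
                arm = int(np.argmin(counts))      # sample unseen arms first
            else:
                ucb = sums / counts + np.sqrt(c * np.log(t) / counts)
                arm = int(np.argmax(ucb))         # optimistic (UCB) choice
            r = pull(arm)
            counts[arm] += 1
            sums[arm] += r
            total += r
        return total

    # Toy usage: three similar Bernoulli tasks; every reward observed so far is
    # transferred to the next task (loosely the "universal similarity" setting).
    rng = np.random.default_rng(0)
    K, horizon = 5, 2000
    base = rng.uniform(0.2, 0.8, size=K)
    history = [[] for _ in range(K)]
    for task in range(3):
        means = np.clip(base + rng.normal(0, 0.02, size=K), 0.0, 1.0)
        def pull(arm, means=means):
            r = float(rng.random() < means[arm])
            history[arm].append(r)
            return r
        snapshot = [list(h) for h in history]   # samples from earlier tasks only
        print(f"task {task}: total reward = {ucb_with_transfer(pull, K, horizon, snapshot):.0f}")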
dc.language.iso | en_US | en_US |
dc.relation.ispartofseries | ;ET00813 | |
dc.rights | I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. | en_US |
dc.subject | multi-armed bandit | en_US |
dc.subject | algorithms | en_US |
dc.subject | universal similarity | en_US |
dc.subject | adjacent similarity | en_US |
dc.subject | UCB algorithm | en_US |
dc.subject.classification | Research Subject Categories::TECHNOLOGY::Electrical engineering, electronics and photonics::Electronics | en_US |
dc.title | Sequential Transfer in Multi-Armed Bandits using Reward Samples | en_US |
dc.type | Thesis | en_US |
dc.degree.name | MTech (Res) | en_US |
dc.degree.level | Masters | en_US |
dc.degree.grantor | Indian Institute of Science | en_US |
dc.degree.discipline | Engineering | en_US |