
    Towards Learning Adversarially Robust Deep Learning Models

    View/Open
    Thesis full text (1.781 MB)
    Author
    Vivek, B S
    Abstract
    Deep learning models have shown impressive performance across a wide spectrum of computer vision applications, including medical diagnosis and autonomous driving. A major concern with these models is their susceptibility to adversarial samples: inputs with small, crafted perturbations designed to manipulate the model’s prediction. A defense mechanism named Adversarial Training (AT), which augments mini-batches with adversarial samples, shows promising results against these attacks. However, scaling this training to large networks and datasets requires fast and simple methods for generating adversaries, such as the single-step Fast Gradient Sign Method (FGSM). Unfortunately, single-step adversarial training (e.g., FGSM adversarial training) converges to a degenerate minimum where the model merely appears to be robust, leaving it vulnerable even to simple black-box attacks. This thesis explores the following aspects of adversarial training.

    Failure of Single-step Adversarial Training: In the first part of the thesis, we demonstrate that the pseudo robustness of an adversarially trained model is due to limitations in the existing evaluation procedure. We introduce novel variants of white-box and black-box attacks, dubbed “gray-box adversarial attacks”, and based on these we propose a new evaluation method to assess the robustness of learned models. We further propose a novel variant of adversarial training, “Gray-box Adversarial Training”, which uses intermediate versions of the model to seed the adversaries and thereby improves the model’s robustness.

    Regularizers for Single-step Adversarial Training: In this part of the thesis, we discuss regularizers that help learn robust models using single-step adversarial training methods (a minimal sketch of one such scheme follows this abstract): (i) a regularizer that enforces the logits for the FGSM and I-FGSM (iterative FGSM) versions of a clean sample to be similar (imposed on only one adversarial pair per mini-batch); (ii) a regularizer that enforces the logits for the FGSM and R-FGSM (Random+FGSM) versions of a clean sample to be similar; (iii) a monotonic loss constraint, which enforces that the loss increases monotonically with the perturbation size of the FGSM attack; and (iv) dropout with decaying dropout probability, which introduces a dropout layer with decaying dropout probability after each nonlinear layer of the network.

    Incorporating Domain Knowledge to Improve the Model’s Adversarial Robustness: In the final part of the thesis, we show that the existing normal training method fails to incorporate domain knowledge into the network’s learned feature representation. We further show that incorporating domain knowledge into the learned feature representation yields a significant improvement in the network’s robustness against adversarial attacks within the normal training regime.
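    The single-step adversarial training scheme and the FGSM/I-FGSM logit-pairing regularizer outlined above can be illustrated with a short PyTorch sketch. This is a minimal reconstruction under stated assumptions, not the thesis implementation: the names (model, optimizer, epsilon, lam, ifgsm_steps) are illustrative, the pairing term is applied batch-wide rather than to a single pair per mini-batch, and the I-FGSM epsilon-ball projection is omitted for brevity.

        import torch
        import torch.nn.functional as F

        def fgsm(model, x, y, step_size):
            # One gradient-sign step: x' = clamp(x + step_size * sign(grad_x loss)).
            x = x.clone().detach().requires_grad_(True)
            loss = F.cross_entropy(model(x), y)
            grad = torch.autograd.grad(loss, x)[0]
            return (x + step_size * grad.sign()).clamp(0.0, 1.0).detach()  # assumes inputs in [0, 1]

        def train_step(model, optimizer, x, y, epsilon, lam=1.0, ifgsm_steps=2):
            # Single-step adversarial training with a logit-pairing regularizer:
            # the logits for the FGSM and I-FGSM versions of the clean batch are
            # pushed together (hypothetical batch-wide variant of the thesis scheme).
            x_fgsm = fgsm(model, x, y, epsilon)            # single-step adversary
            x_ifgsm = x
            for _ in range(ifgsm_steps):                   # crude iterative FGSM
                x_ifgsm = fgsm(model, x_ifgsm, y, epsilon / ifgsm_steps)

            logits_fgsm = model(x_fgsm)
            logits_ifgsm = model(x_ifgsm)
            loss = F.cross_entropy(logits_fgsm, y) + lam * F.mse_loss(logits_fgsm, logits_ifgsm)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            return loss.item()

    Training on the FGSM logits alone reproduces plain single-step adversarial training; the pairing term penalizes the gap between single-step and multi-step responses, which is one way to discourage the degenerate minimum described above.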
    URI
    https://etd.iisc.ac.in/handle/2005/4488
    Collections
    • Department of Computational and Data Sciences (CDS) [102]

    etd@IISc is a joint service of SERC & J R D Tata Memorial (JRDTML) Library || Powered by DSpace software || DuraSpace