    •   etd@IISc
    • Division of Electrical, Electronics, and Computer Science (EECS)
    • Computer Science and Automation (CSA)

    Neural Approaches for Natural Language Query Answering over Source Code

    Author
    Mandal, Madhurima
    Abstract
    During software development, developers need to ensure that their code is bug-free and that best coding practices are followed. To do so, they need answers to queries about specific aspects of the code. Powerful code-query languages such as CodeQL have been developed for this purpose. Using such languages, however, requires expertise in writing formal queries, and each separate query takes several lines of code-query language to express. To remedy these problems, we propose to represent each query by a natural language phrase and to answer such queries using neural networks. We aim to train a single model that can answer multiple queries, as opposed to writing a separate formal query for each task, and that can answer these queries against unseen code. With this motivation, we introduce the novel NlCodeQA dataset. It includes 171,346 labeled examples, where each input consists of a natural language query and a code snippet. The labels are answer spans in the input code snippet with respect to the input query. State-of-the-art BERT-style neural architectures were trained on the NlCodeQA data. Preliminary experimental results show that the proposed model achieves an exact match accuracy of 86.30%. The proposed use of natural language queries and neural models for query understanding will help increase the productivity of software developers. It also paves the way for machine-learning-based code analysis tools that complement existing code analysis systems on complex code queries that are either hard or expensive to represent in a formal query language.
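The abstract describes each example as a natural language query paired with a code snippet, labeled with answer spans, and reports exact match accuracy. The sketch below illustrates one way such an example and metric could be represented. It is not the thesis's implementation: the class and function names, the use of character offsets for spans, the sample query phrase, and the definition of exact match as "predicted span set equals gold span set" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

# Assumption: a span is a (start, end) pair of character offsets, end exclusive.
Span = Tuple[int, int]


@dataclass(frozen=True)
class QAExample:
    """Hypothetical container for one query/code/answer-span example."""
    query: str                      # natural language query phrase
    code: str                       # code snippet to search
    answer_spans: Tuple[Span, ...]  # gold answer spans inside `code`


def exact_match(predicted: Sequence[Span], gold: Sequence[Span]) -> bool:
    """True when the predicted spans equal the gold spans as sets (assumed definition)."""
    return set(predicted) == set(gold)


def exact_match_accuracy(
    predictions: Sequence[Sequence[Span]],
    examples: Sequence[QAExample],
) -> float:
    """Fraction of examples whose predicted spans exactly match the gold spans."""
    if len(predictions) != len(examples):
        raise ValueError("predictions and examples must have the same length")
    if not examples:
        return 0.0
    hits = sum(
        exact_match(pred, ex.answer_spans)
        for pred, ex in zip(predictions, examples)
    )
    return hits / len(examples)


# Illustrative data only; the query phrase and snippet are made up.
snippet = "import os\nimport sys\nprint(sys.argv)\n"
example = QAExample(
    query="unused import",
    code=snippet,
    answer_spans=((0, 9),),  # covers the text "import os"
)

correct_prediction = [(0, 9)]
wrong_prediction = [(10, 20)]
accuracy = exact_match_accuracy(
    [correct_prediction, wrong_prediction], [example, example]
)
```

Under this assumed definition a prediction counts only if every span boundary agrees with the gold label, which makes the metric strict: a span that is off by one character scores the same as a completely wrong answer.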
    URI
    https://etd.iisc.ac.in/handle/2005/5834
    Collections
    • Computer Science and Automation (CSA) [392]

    etd@IISc is a joint service of SERC & J R D Tata Memorial (JRDTML) Library || Powered by DSpace software || DuraSpace