Novel Neural Architecture for Multi-Hop Question Answering

Bhargav, G P Shrivatsa

Abstract

Natural language understanding has been one of the key drivers responsible for advancing the field of AI. To this end, automated Question Answering (QA) has served as an effective way of measuring the language understanding capabilities of AI systems. Our focus in this thesis is on the Reading Comprehension style Question Answering (RCQA) task. Reading comprehension is the ability to understand natural language text and answer questions over it. Specifically, we focus on complex questions that require multi-hop reasoning over facts spread across multiple passages. Recently, there has been a surge in research activity surrounding the RCQA task, primarily due to the emergence of large-scale public datasets. For single-hop RCQA datasets, the majority of the proposed solutions are based on massively pre-trained Transformer-style models such as BERT. Some of these solutions have exhibited human-level performance. Similar solutions have also been proposed for multi-hop RCQA datasets, and they too have improved the state of the art. However, we believe that the core challenges involved in the multi-hop RCQA task have not been addressed effectively by existing solutions, and hence there is an opportunity to advance the state of the art.

We present a novel deep neural architecture, called TAP (Translucent Answer Prediction), to identify answers and evidence (in the form of supporting facts) in an RCQA task requiring multi-hop reasoning. TAP comprises two loosely coupled networks: Local and Global Interaction eXtractor (LoGIX) and Answer Predictor (AP). LoGIX predicts supporting facts, whereas AP consumes these predicted supporting facts to predict the answer span. The design of LoGIX is inspired by two key design desiderata, local context and global interaction, that we identified by analyzing examples of the multi-hop RCQA task. The loose coupling between LoGIX and AP reveals the set of sentences used by AP in predicting an answer. Therefore, the answer predictions of TAP can be interpreted in a translucent manner. We conduct extensive evaluations and analyses on the HotpotQA dataset to understand the characteristics of TAP. TAP achieved state-of-the-art accuracy on the distractor setting of the HotpotQA dataset.

URI

https://etd.iisc.ac.in/handle/2005/4421

Collections

Computer Science and Automation (CSA) [394]