A Syntactic Neural Model For Question Decomposition

Gupta, Suman

Abstract

Question decomposition and single-hop Question Answering (QA) systems serve as useful modules in developing multi-hop QA systems, mainly because the resulting QA system is interpretable and has been demonstrated to exhibit better performance. The problem of question decomposition can be posed as a machine translation problem and solved using any sequence-to-sequence neural architecture. With this approach, however, it is difficult to capture the innate hierarchical structure of the decomposition. Inspired by database query languages, a pseudo-formalism for capturing the meaning of questions, called Question Decomposition Meaning Representation (QDMR), was recently introduced. In this approach, a complex question is decomposed into simple queries, which are mapped to a small set of formal operations. This method, however, does not use the underlying syntax of QDMR when generating the decomposition. In programming language code generation, methods that use syntax information as prior knowledge have been demonstrated to perform better. Moreover, syntax-aware models are usually interpretable. Motivated by the success of syntax-aware models, we propose a new syntactic neural model for question decomposition in this thesis. In particular, we encode the underlying syntax of QDMR structures into a grammar model as a sequence of actions. This is done using a deterministic framework that uses Abstract Syntax Trees (ASTs) and parse trees. The proposed approach can be thought of as an encoder-decoder method for QDMR structures in which a sequence of possible actions is a latent representation of the QDMR structure. The advantage of this latent representation is that it is interpretable. Experimental results on a real-world dataset demonstrate that the proposed approach outperforms the state-of-the-art approach, especially in scenarios where training data is limited.
Some heuristics to further improve the performance of the proposed approach are also suggested in this work.
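
As a rough illustration of what encoding a decomposition as a sequence of actions can look like, the Python sketch below turns a small QDMR-style tree into an action sequence by a pre-order traversal and then rebuilds the tree from that sequence. It is not the grammar or framework used in the thesis: the `Node` class, the `APPLY` and `TOKEN` action names, the helper functions and the example question are all assumptions made for illustration, with operator names chosen only to resemble QDMR's style.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# Hypothetical AST node: one QDMR-style operation and its arguments.
@dataclass
class Node:
    op: str
    args: List[Union[str, "Node"]]

# Hypothetical action format: ("APPLY", op, arity) or ("TOKEN", text).
Action = Tuple[Union[str, int], ...]

def to_actions(node: Node) -> List[Action]:
    """Linearize the tree in pre-order; the same tree always gives the same actions."""
    actions: List[Action] = [("APPLY", node.op, len(node.args))]
    for arg in node.args:
        if isinstance(arg, Node):
            actions.extend(to_actions(arg))
        else:
            actions.append(("TOKEN", arg))
    return actions

def from_actions(actions: List[Action], pos: int = 0) -> Tuple[Node, int]:
    """Rebuild the tree from an action sequence; returns (node, next position)."""
    kind, op, arity = actions[pos]
    if kind != "APPLY":
        raise ValueError("a subtree must start with an APPLY action")
    pos += 1
    args: List[Union[str, Node]] = []
    for _ in range(arity):
        if actions[pos][0] == "APPLY":
            child, pos = from_actions(actions, pos)
            args.append(child)
        else:
            args.append(actions[pos][1])
            pos += 1
    return Node(op, args), pos

# Illustrative question (assumed): "How many flights are from Denver?"
tree = Node("AGGREGATE", ["count",
            Node("FILTER", [Node("SELECT", ["flights"]), "from Denver"])])

actions = to_actions(tree)
rebuilt, end = from_actions(actions)

for action in actions:
    print(action)
```

Because each `APPLY` action records how many arguments follow, the sequence can be mapped back to exactly one tree, which is what makes such an intermediate representation easy to inspect.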

URI

https://etd.iisc.ac.in/handle/2005/5529

Collections

Computer Science and Automation (CSA) [547]