Show simple item record

dc.contributor.advisor  Venkatesh Babu, R
dc.contributor.author  Srinivas, Suraj
dc.date.accessioned  2018-05-22T15:04:55Z
dc.date.accessioned  2018-07-31T06:40:20Z
dc.date.available  2018-05-22T15:04:55Z
dc.date.available  2018-07-31T06:40:20Z
dc.date.issued  2018-05-22
dc.date.submitted  2017
dc.identifier.uri  https://etd.iisc.ac.in/handle/2005/3581
dc.identifier.abstract  http://etd.iisc.ac.in/static/etd/abstracts/4449/G28168-Abs.pdf  en_US
dc.description.abstract  Deep neural networks with millions of parameters are at the heart of many state-of-the-art computer vision models. However, recent works have shown that models with a much smaller number of parameters can often perform just as well. A smaller model has the advantage of being faster to evaluate and easier to store - both of which are crucial for real-time and embedded applications. While prior work on compressing neural networks has looked at methods based on sparsity, quantization and factorization of neural network layers, we look at the alternative approach of pruning neurons. Training neural networks is often described as a kind of 'black magic', as successful training requires setting the right hyper-parameter values (such as the number of neurons in a layer, the depth of the network, etc.). It is often not clear what these values should be, and these decisions often end up being either ad hoc or driven through extensive experimentation. It would be desirable to automatically set some of these hyper-parameters for the user so as to minimize trial and error. Combining this objective with our earlier preference for smaller models, we ask the following question: for a given task, is it possible to come up with small neural network architectures automatically? In this thesis, we propose methods to achieve this. The work is divided into four parts. First, given a neural network, we look at the problem of identifying important and unimportant neurons. We look at this problem in a data-free setting, i.e., assuming that the data the neural network was trained on is not available. We propose two rules for identifying wasteful neurons and show that these suffice in such a data-free setting. By removing neurons based on these rules, we are able to reduce model size without significantly affecting accuracy. Second, we propose an automated learning procedure to remove neurons during the process of training. We call this procedure 'Architecture-Learning', as it automatically discovers the optimal width and depth of neural networks. We empirically show that this procedure is preferable to trial-and-error based Bayesian Optimization procedures for selecting neural network architectures. Third, we connect 'Architecture-Learning' to a popular regularizer called 'Dropout', and propose a novel regularizer which we call 'Generalized Dropout'. From a Bayesian viewpoint, this method corresponds to a hierarchical extension of the Dropout algorithm. Empirically, we observe that Generalized Dropout corresponds to a more flexible version of Dropout, and works in scenarios where Dropout fails. Finally, we apply our procedure for removing neurons to the problem of removing weights in a neural network, and achieve state-of-the-art results in sparsifying neural networks.  en_US
dc.language.iso  en_US  en_US
dc.relation.ispartofseries  G28168  en_US
dc.subject  Deep Neural Networks  en_US
dc.subject  Learning Compact Architectures  en_US
dc.subject  Machine Learning  en_US
dc.subject  Binary Neural Nets  en_US
dc.subject  Architecture Learning  en_US
dc.subject  Sparse Neural Networks  en_US
dc.subject  Bayesian Neural Networks  en_US
dc.subject  Neural Network Architectures  en_US
dc.subject.classification  Computational and Data Sciences  en_US
dc.title  Learning Compact Architectures for Deep Neural Networks  en_US
dc.type  Thesis  en_US
dc.degree.name  MSc Engg  en_US
dc.degree.level  Masters  en_US
dc.degree.discipline  Faculty of Engineering  en_US

