Novel Neural Architectures based on Recurrent Connections and Symmetric Filters for Visual Processing
Abstract
Artificial Neural Networks (ANNs) have been very successful because they can extract meaningful information from raw data without pre-processing. The first ANNs were created primarily to understand how the human brain works. The expectation was that they would give a deeper understanding of brain function and human cognition, which cannot be explained by biological experiments or intuition alone. The field has since grown well beyond that original purpose: ANNs are now exploited for their pattern-matching and learning capabilities to address many complex problems that are difficult or impossible to solve with standard computational and statistical methods. Research has moved from using ANNs only to understand brain function to creating new types of ANN based on the neuronal pathways present in the brain. This thesis proposes two novel neural network layers based on studies of the human brain. The first is a type of Recurrent Convolutional Neural Network layer called a Long-Short-Term-Convolutional-Neural-Network (LST_CNN), and the second is a Symmetric Convolutional Neural Network layer based on Symmetric Filters.
Current feedforward neural network models have been successful in visual processing, and as a result lateral and feedback processing has been under-explored. Existing visual processing networks (Convolutional Neural Networks) lack the recurrent neuronal dynamics present in the ventral visual pathways of human and non-human primate brains, which contain similar densities of feedforward and feedback connections. Furthermore, current convolutional models are limited to learning spatial information, whereas the world is dynamic rather than static, so temporal visual information should be learned as well. This motivates us to incorporate recurrence into convolutional neural networks. The layer we propose (LST_CNN) is not limited to spatial learning; it can also exploit temporal knowledge in the data because recurrence is built into its structure. The spatiotemporal learning capability of LST_CNN is examined by testing it on Object Detection and Tracking. Because LST_CNN is based on LSTM, we also explicitly evaluate its spatial learning capabilities through experiments.
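As a rough illustration of how recurrence and convolution can be combined, the sketch below shows one time step of a generic convolutional LSTM cell on a one-dimensional, single-channel signal. It is not the LST_CNN layer itself, whose exact formulation is not given in this abstract. The function names (`conv1d_same`, `conv_lstm_step`), the gate layout, and the structure of the `weights` dictionary are assumptions made purely for illustration.

```python
import math


def conv1d_same(signal, kernel):
    """Cross-correlate signal with an odd-length kernel, zero-padded to keep length."""
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def conv_lstm_step(x, h, c, weights):
    """One time step of a generic convolutional LSTM cell (assumed layout).

    weights maps each gate name ("i", "f", "o", "g") to a tuple
    (input_kernel, hidden_kernel, bias). This layout is hypothetical.
    """
    pre = {}
    for gate in ("i", "f", "o", "g"):
        w_x, w_h, b = weights[gate]
        from_x = conv1d_same(x, w_x)  # spatial filtering of the current input
        from_h = conv1d_same(h, w_h)  # spatial filtering of the previous hidden state
        pre[gate] = [a + r + b for a, r in zip(from_x, from_h)]

    i = [sigmoid(v) for v in pre["i"]]    # input gate
    f = [sigmoid(v) for v in pre["f"]]    # forget gate
    o = [sigmoid(v) for v in pre["o"]]    # output gate
    g = [math.tanh(v) for v in pre["g"]]  # candidate cell update

    # The cell state carries information across time steps.
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new
```

In this sketch the convolutions capture local spatial structure within each frame, while the cell state `c` and hidden state `h` carry information from one frame to the next.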
The visual cortex in the human brain has evolved to detect patterns and has therefore specialized in detecting the symmetry that pervades Nature. When filter weights from deep state-of-the-art networks are visualized, several of them are symmetric, like the features they represent. This observation inspires the idea of constraining standard convolutional filter weights to be symmetric. Given that the computational requirements for training deep neural networks (DNNs) have doubled every few months, researchers have been seeking architectural changes to counter this growth. In that light, symmetric filters reduce both the computational resources required and the memory footprint, which is beneficial during training as well as inference. Despite the reduction in trainable parameters, the accuracy is comparable to that of the standard version, which lets us infer that symmetric filters help prevent over-fitting. We thereby establish the value of symmetric filters in neural network models.
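To make the parameter saving concrete, the following sketch expands a small set of free weights into a square filter that is mirrored left-right and top-bottom. The particular symmetry chosen here, and the names `symmetric_kernel` and `free_parameter_count`, are illustrative assumptions and not necessarily the constraint used in this thesis.

```python
def free_parameter_count(k):
    """Number of trainable weights in a k x k filter with two mirror symmetries."""
    half = (k + 1) // 2
    return half * half


def symmetric_kernel(free, k):
    """Expand free weights into a k x k filter mirrored left-right and top-bottom.

    free is a flat, row-major list holding the top-left quadrant
    (including the middle row and column when k is odd).
    """
    half = (k + 1) // 2
    if len(free) != half * half:
        raise ValueError("expected %d free weights" % (half * half))
    quadrant = [free[r * half:(r + 1) * half] for r in range(half)]
    kernel = []
    for r in range(k):
        qr = min(r, k - 1 - r)  # fold the row index onto the top half
        row = []
        for c in range(k):
            qc = min(c, k - 1 - c)  # fold the column index onto the left half
            row.append(quadrant[qr][qc])
        kernel.append(row)
    return kernel
```

Under this assumed symmetry a 3x3 filter needs 4 trainable weights instead of 9, and a 5x5 filter needs 9 instead of 25. Only the free weights would need to be stored and updated during training.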