Learning Non-linear Mappings from Data with Applications to Priority-based Clustering, Prediction, and Detection
Abstract
With the growing volume of data generated by today's Internet of Things, learning algorithms that extract and explain the underlying relations among the various attributes of data have gained momentum. This thesis focuses on extracting meaningful relations from data using both unsupervised and supervised learning algorithms.
Vector quantization techniques are widely used for contextual data clustering, data visualization, and high-dimensional data exploration. Existing vector quantization techniques, such as K-means and its variants and those derived from self-organizing maps, treat the input data vector as a whole, without prioritizing individual coordinates. Motivated by applications that require priorities over the data coordinates, we develop a theory for clustering data with different priorities over the coordinates, called data-dependent priority-based soft vector quantization. The priorities over the data coordinates are learnt from the input data distribution by estimating the marginal distribution of each coordinate. The number of neurons approximating each coordinate, based on its priority, is determined through a reinforcement learning algorithm. We present an analysis of the convergence of the proposed algorithm and of the probability of misclassification, along with simulation results on various data sets.
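To make the idea of priority-driven neuron allocation concrete, the sketch below estimates per-coordinate priorities from the spread of the marginal distributions and splits a fixed neuron budget accordingly. The spread-based priority proxy, the function names, and the per-coordinate 1-D k-means are illustrative assumptions for this example only; they stand in for, and are not, the reinforcement-learning allocation and soft vector quantization developed in the thesis.

```python
import numpy as np

def coordinate_priorities(X):
    # Priority proxy: normalised per-coordinate spread of the marginals
    # (assumption for illustration; the thesis learns priorities from the
    # estimated marginal distributions).
    spread = X.std(axis=0)
    return spread / spread.sum()

def allocate_neurons(priorities, total_neurons):
    # Split the neuron budget across coordinates in proportion to priority.
    return np.maximum(1, np.round(priorities * total_neurons).astype(int))

def quantize_per_coordinate(X, counts, n_iter=20, seed=0):
    # Independent 1-D k-means per coordinate as a simple stand-in for the
    # priority-based soft vector quantizer.
    rng = np.random.default_rng(seed)
    codebooks = []
    for d, k in enumerate(counts):
        x = X[:, d]
        centres = rng.choice(x, size=k, replace=False)
        for _ in range(n_iter):
            labels = np.abs(x[:, None] - centres[None, :]).argmin(axis=1)
            for j in range(k):
                if np.any(labels == j):
                    centres[j] = x[labels == j].mean()
        codebooks.append(np.sort(centres))
    return codebooks

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Coordinate 0 has a much wider marginal, so it receives more neurons.
    X = np.column_stack([rng.normal(0, 5.0, 1000), rng.normal(0, 0.5, 1000)])
    priorities = coordinate_priorities(X)
    counts = allocate_neurons(priorities, total_neurons=16)
    print("priorities:", priorities, "neurons per coordinate:", counts)
    codebooks = quantize_per_coordinate(X, counts)
```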
Self-organizing maps (SOMs) are widely used for feature learning, vector quantization, and recalling spatial input patterns. The adaptation rule in SOMs is based on the Euclidean distance between the input vector and the neuronal weight vector, together with a neighborhood function that induces a topological arrangement of the neurons in the output space. A SOM can learn the spatial correlations in the data but fails to capture the temporal correlations present in a sequence of inputs. We formulate a potential function based on a spatio-temporal metric and create hierarchical vector quantization feature maps by embedding memory structures, similar to long short-term memories, across the feature maps to learn the spatio-temporal correlations in the data across clusters.
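As a rough illustration of driving a SOM with a spatio-temporal distance, the sketch below augments the usual Euclidean term with a leaky running context of past inputs. The grid size, the context weight alpha, and the leaky memory itself are assumptions made for this example; they are only a stand-in for the LSTM-like memory structures and hierarchical feature maps proposed in the thesis.

```python
import numpy as np

def train_temporal_som(sequence, grid=(8, 8), alpha=0.3, lr=0.2,
                       sigma=1.5, epochs=5, seed=0):
    # Each neuron keeps a spatial weight W and a temporal-context weight C.
    # The winner is chosen by a combined distance to the current input and
    # to a leaky running context of past inputs (illustrative stand-in for
    # the memory-embedded hierarchical maps of the thesis).
    rng = np.random.default_rng(seed)
    n_rows, n_cols = grid
    dim = sequence.shape[1]
    W = rng.normal(size=(n_rows, n_cols, dim))   # spatial weights
    C = np.zeros((n_rows, n_cols, dim))          # temporal-context weights
    coords = np.stack(np.meshgrid(np.arange(n_rows), np.arange(n_cols),
                                  indexing="ij"), axis=-1)
    for _ in range(epochs):
        context = np.zeros(dim)
        for x in sequence:
            # Spatio-temporal distance: current input plus decayed context.
            d = ((W - x) ** 2).sum(-1) + alpha * ((C - context) ** 2).sum(-1)
            winner = np.unravel_index(d.argmin(), d.shape)
            h = np.exp(-((coords - np.array(winner)) ** 2).sum(-1)
                       / (2 * sigma ** 2))
            W += lr * h[..., None] * (x - W)
            C += lr * h[..., None] * (context - C)
            context = (1 - alpha) * context + alpha * x  # leaky memory update
    return W, C

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    t = np.linspace(0, 8 * np.pi, 2000)
    seq = np.column_stack([np.sin(t), np.cos(t)]) + 0.05 * rng.normal(size=(2000, 2))
    W, C = train_temporal_som(seq)
    print("trained map shape:", W.shape)
```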
Error correction codes such as low-density parity-check (LDPC) codes are widely used to enhance the performance of digital communication systems. The standard decoding framework relies on exchanging beliefs over a Tanner graph that both the encoder and the decoder know. However, this information may not be readily available, for example, in covert communication. Our main idea is to build a neural network that learns the encoder mappings without knowledge of the Tanner graph. We propose a scheme that learns these mappings using the back-propagation algorithm, and we investigate the choice of cost function and the number of hidden neurons needed to learn the encoding function. The proposed scheme is capable of learning the parity check equations over the binary field and can thereby identify whether a received word is a valid codeword. Simulation results on synthetic data show that the algorithm indeed learns the encoder mappings. We also propose an approach that identifies noisy codewords using uncertainty estimation and decodes them using autoencoders.
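A minimal illustration of learning codeword validity by back-propagation is given below, assuming a toy (7,4) Hamming code in place of an LDPC code: the network sees only labelled words, never the Tanner graph or the parity-check matrix. The dataset construction, network size, and training schedule are hypothetical choices for the example and do not reproduce the scheme of the thesis.

```python
import numpy as np

def generate_code_data(n_samples=4000, seed=0):
    # Toy dataset: valid (7,4) Hamming codewords vs uniformly random words.
    # (A random 7-bit word is valid only with probability 1/8, so the
    # "invalid" labels are slightly noisy.)
    rng = np.random.default_rng(seed)
    G = np.array([[1, 0, 0, 0, 1, 1, 0],
                  [0, 1, 0, 0, 1, 0, 1],
                  [0, 0, 1, 0, 0, 1, 1],
                  [0, 0, 0, 1, 1, 1, 1]])
    msgs = rng.integers(0, 2, size=(n_samples // 2, 4))
    valid = msgs @ G % 2
    noise = rng.integers(0, 2, size=(n_samples // 2, 7))
    X = np.vstack([valid, noise]).astype(float)
    y = np.concatenate([np.ones(len(valid)), np.zeros(len(noise))])
    return X, y

def train_mlp(X, y, hidden=16, lr=0.5, epochs=300, seed=0):
    # One-hidden-layer network trained by back-propagation with a
    # cross-entropy cost to flag valid codewords.
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1)); b2 = np.zeros(1)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(epochs):
        h = sig(X @ W1 + b1)
        p = sig(h @ W2 + b2).ravel()
        grad_out = (p - y)[:, None] / len(y)      # dL/dz for sigmoid + CE
        grad_h = grad_out @ W2.T * h * (1 - h)
        W2 -= lr * h.T @ grad_out; b2 -= lr * grad_out.sum(0)
        W1 -= lr * X.T @ grad_h;   b1 -= lr * grad_h.sum(0)
    return lambda Xq: sig(sig(Xq @ W1 + b1) @ W2 + b2).ravel()

if __name__ == "__main__":
    X, y = generate_code_data()
    predict = train_mlp(X, y)
    print(f"training accuracy: {((predict(X) > 0.5) == y).mean():.3f}")
```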
In the next work, we consider convolutional neural networks, which are widely used in natural language processing, video analysis, and image recognition. The commonly used max-pooling layer, however, discards most of the data, which is a drawback in applications such as the prediction of video frames. We propose an adaptive prediction and classification network based on a data-dependent pooling architecture, and formulate a combined cost function that minimizes the prediction and classification errors jointly. We also detect the presence of an unseen class during testing for digit prediction in videos.
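A sketch of such a combined prediction-and-classification cost is shown below, assuming a small PyTorch network in which ordinary average pooling stands in for the data-dependent pooling layer; the architecture, the weighting factor lam, and the dummy data are hypothetical and serve only to illustrate the joint cost.

```python
import torch
import torch.nn as nn

class PredictClassifyNet(nn.Module):
    # Shared encoder with two heads: next-frame prediction and digit
    # classification (illustrative architecture, not the thesis network).
    def __init__(self, n_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AvgPool2d(2),                   # stand-in for adaptive pooling
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.predict_head = nn.Sequential(     # next-frame prediction
            nn.Upsample(scale_factor=2),
            nn.Conv2d(32, 1, 3, padding=1),
        )
        self.classify_head = nn.Sequential(    # digit classification
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes),
        )

    def forward(self, frame):
        z = self.encoder(frame)
        return self.predict_head(z), self.classify_head(z)

def combined_loss(pred_frame, next_frame, logits, labels, lam=0.5):
    # Combined cost: prediction error plus weighted classification error.
    return nn.functional.mse_loss(pred_frame, next_frame) + \
           lam * nn.functional.cross_entropy(logits, labels)

if __name__ == "__main__":
    net = PredictClassifyNet()
    frames = torch.randn(8, 1, 28, 28)        # current frames (dummy data)
    next_frames = torch.randn(8, 1, 28, 28)   # frames to be predicted
    labels = torch.randint(0, 10, (8,))
    pred, logits = net(frames)
    loss = combined_loss(pred, next_frames, logits, labels)
    loss.backward()
    print("combined loss:", loss.item())
```

At test time, an unseen digit class could, for instance, be flagged by thresholding the maximum softmax probability of the classification head; this is one simple possibility and not necessarily the detection method developed in the thesis.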