Some classification techniques require class labels to be specified. One such technique is Fisher's Linear Discriminant (FLD), which we use to build an algorithm that seeks bimodal projection directions. Although FLD is a supervised classification method, our approach to feature extraction is unsupervised: an artificial two-class problem is created by randomly assigning labels to the data points, and FLD then attempts to align the classifier so that it separates these classes, thereby helping to detect bimodal projection directions.
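To make the procedure concrete, the following minimal NumPy sketch shows one plausible form of the search. The function names and the median-split relabelling rule are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def fld_direction(X, y):
    # Fisher's Linear Discriminant for two classes:
    # w ∝ S_W^{-1} (m1 - m0), with S_W the within-class scatter matrix.
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
          + np.cov(X1, rowvar=False) * (len(X1) - 1))
    # Small ridge term keeps the solve well-conditioned.
    w = np.linalg.solve(Sw + 1e-6 * np.eye(X.shape[1]), m1 - m0)
    return w / np.linalg.norm(w)

def bimodal_direction(X, n_iter=50, seed=0):
    # Create an artificial two-class problem with random labels, then
    # alternate FLD fitting with relabelling at the projection median
    # until the labels stabilise (assumed stopping rule).
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=len(X))
    for _ in range(n_iter):
        w = fld_direction(X, y)
        z = X @ w
        y_new = (z > np.median(z)).astype(int)
        if np.array_equal(y_new, y):
            break
        y = y_new
    return w
```

In such a sketch, the returned direction can be screened by inspecting the histogram of the projections `X @ w` for bimodality before it is accepted as a feature.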
This method achieves effective dimensionality reduction and naturally imposes a significant degree of weight sharing on the neural network performing the feature extraction.
The objectives of accurate encoding and of producing discriminative representations for pattern classification are not always consistent. The feature vectors obtained from bimodal projection directions are strongly discriminative but are not designed for encoding. To address this, the feature map is extended to define an embedding in the feature space, so that a smooth inverse map exists from the feature space back to the input space; the transformation thus becomes lossless. Although the feature vector has a higher dimension than the input vector, the number of weight vectors required remains the same as in the unextended feature map.
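One way to picture such a lossless extension is sketched below. The tanh nonlinearity, the pairing of nonlinear and linear outputs, and the assumption that the weight matrix W is square and invertible are our illustrative choices, not details taken from the paper.

```python
import numpy as np

def extended_features(X, W):
    # Original (lossy) features: nonlinear responses h = tanh(X W^T).
    # Extension: append the raw projections z = X W^T. The same weight
    # vectors W serve both halves, so the weight count is unchanged
    # while the feature dimension doubles.
    Z = X @ W.T
    return np.hstack([np.tanh(Z), Z])

def invert_features(F, W):
    # Smooth inverse: read off the linear half and undo the projection,
    # assuming W is square and its rows span the input space.
    d = W.shape[0]
    Z = F[:, d:]
    return Z @ np.linalg.inv(W).T
```

Under these assumptions the map x → (tanh(Wx), Wx) is injective, and the input is recovered exactly from the linear half, so no information is lost even though the feature dimension exceeds the input dimension.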
The NIST dataset of handwritten digits was used to evaluate the applicability and efficiency of the bimodal projection-based features on large, high-dimensional data. The proposed method constrains the number of free parameters (weights) in the network, which improves the generalization performance of the resulting neural networks. This has been convincingly demonstrated by the algorithm's performance on the NIST dataset, which comprises 344,307 training images and 58,646 test images, each of size 32 × 32. Despite limited computational resources, reasonable results were achieved.
The embedding-based extension of the feature set provides a unified approach to both data classification and data representation. Although it yields feature vectors whose dimension exceeds that of the input space, experiments on popular benchmarks such as the Wine and Vowel datasets show that simple classifiers can still achieve good performance. Thus, the increased complexity due to higher dimensionality is offset by the simplicity and efficiency of the classifiers used.