Enhanced neural network architectures for pattern classification
Abstract
The thesis deals with the problem of pattern classification, for which new
algorithms using three different neural network models have been proposed:
(i) recursive; (ii) feedforward; and (iii) self-organizing networks. These
models are ‘enhanced’ to provide a more efficient solution to the
classification problem, in terms of speed, robustness and network size.
The first of these models (namely, the recursive network) is well known to
achieve pattern classification through a process called ‘pattern association’.
To improve the performance of recursive networks for pattern association, we
consider the following problems in their design:
— Given a set of patterns in the form of bipolar vectors, how to design a
recursive network that acts as an associative memory for these patterns,
with basins of attraction as large as possible?
— When all the given patterns cannot be accommodated in the network,
how to design the network such that a maximal number of patterns is
stored, again with basins of attraction as large as possible?
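For concreteness, the recall mechanism underlying both problems can be
illustrated with the classical Hebbian (outer-product) construction. The
sketch below is this textbook baseline, not the design method developed in
the thesis; the pattern sizes and tie-breaking rule are illustrative choices.

    import numpy as np

    def hebbian_weights(patterns):
        """Outer-product (Hebbian) weight matrix for bipolar patterns.
        `patterns` is an (m, n) array of +/-1 vectors; the diagonal is
        zeroed so that no neuron feeds back onto itself."""
        P = np.asarray(patterns, dtype=float)
        W = P.T @ P
        np.fill_diagonal(W, 0.0)
        return W

    def recall(W, x, max_iters=100):
        """Iterate x <- sign(W x) until a fixed point is reached. Probes
        that start inside a stored pattern's basin of attraction converge
        to that pattern. (Synchronous updates, for brevity; the modified
        dynamics proposed in Chapter 2 instead use a finite window of
        previous states at each node.)"""
        x = np.asarray(x, dtype=float)
        for _ in range(max_iters):
            x_new = np.sign(W @ x)
            x_new[x_new == 0] = 1.0    # break ties toward +1
            if np.array_equal(x_new, x):
                break
            x = x_new
        return x

    # Store two orthogonal 8-bit patterns; recall the first from a probe
    # corrupted in one bit.
    p1 = np.array([1, -1, 1, -1, 1, -1, 1, -1])
    p2 = np.array([1, 1, 1, 1, -1, -1, -1, -1])
    W = hebbian_weights([p1, p2])
    probe = p1.copy()
    probe[0] = -probe[0]
    assert np.array_equal(recall(W, probe), p1)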
We propose two methods to address these problems:
— Development of optimal learning algorithms, by showing the equivalence
between learning in recursive networks and learning in the (2-state)
perceptron. With this equivalence, optimal learning algorithms for the
perceptron can be applied directly to the problem of optimal learning in
recursive neural network models. In addition to the existing learning
algorithms for the perceptron, we propose two further algorithms for
optimal learning in the perceptron, based on the Thermal Perceptron
Learning Rule (TPLR) [1] (sketched after this list), and compare them as
applied to the problem of recursive network design.
— Modification of the dynamics of the network, using a finite number of
previous state values at each node. We show empirically that the new
network has larger regions of attraction around its equilibrium points,
and we establish convergence conditions for the network.
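A minimal single-unit sketch of a thermal-perceptron-style trainer, in the
spirit of [1]: the bias input is omitted, and the constants and the linear
annealing schedule are illustrative assumptions rather than the thesis's
settings.

    import numpy as np

    def thermal_perceptron(X, y, epochs=100, T0=1.0, eta=1.0, rng=None):
        """Train one threshold unit with a thermal-perceptron-style rule.
        X is an (m, n) array of inputs and y an (m,) array of +/-1
        targets. Corrections to misclassified patterns are damped by
        exp(-|phi|/T), so patterns that are deeply wrong (and probably
        unlearnable) disturb the weights least, which favours storing as
        many patterns as possible. T is annealed linearly to zero."""
        rng = np.random.default_rng(rng)
        m, n = X.shape
        w = rng.normal(scale=0.1, size=n)
        for epoch in range(epochs):
            T = T0 * (1.0 - epoch / epochs)      # linear annealing
            for i in rng.permutation(m):
                phi = w @ X[i]
                if y[i] * phi <= 0:              # misclassified
                    w += eta * y[i] * X[i] * np.exp(-abs(phi) / max(T, 1e-9))
        return w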
The second model (namely, the feedforward network) is the most widely used
network architecture for pattern classification. The input to these networks
is in the form of real-valued feature vectors. In the design of such networks,
the basic issue is to determine the appropriate network size for a given
problem: the network should be sufficiently large to realize the required
input-output mapping, but small enough to give good generalization results.
In the literature, constructive learning algorithms offer a solution to this
problem. These algorithms, instead of training a network of fixed size, start
with a small network and add neurons as and when required.
We develop new constructive learning algorithms for the construction of
feedforward neural networks whose size is near-optimal:
— Modified Upstart and Tower algorithms, which use the TPLR for training
the hidden neurons (a structural sketch of the Tower construction follows
this list); and
— A modification of the TPLR, called the Biased TPLR (BTPLR), for training
the hidden neurons in the design of networks using the paradigm of
Sequential Learning [2].
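To illustrate the constructive paradigm, the sketch below grows a Tower-style
cascade: each new threshold unit receives the original inputs plus the output
of the previous unit, and is trained with the thermal rule sketched earlier.
The stopping criterion and all details here are assumptions for illustration,
not the thesis's Modified Tower algorithm.

    import numpy as np

    def train_tower(X, y, max_units=10, **tplr_kwargs):
        """Grow a cascade of threshold units. Unit k is trained (here by
        thermal_perceptron, defined above) on the original inputs plus
        unit k-1's output; growth stops when a new unit no longer reduces
        the training error. Bias inputs are omitted for brevity."""
        units, prev_out = [], None
        best_err = np.inf
        for _ in range(max_units):
            Xk = X if prev_out is None else np.column_stack([X, prev_out])
            w = thermal_perceptron(Xk, y, **tplr_kwargs)
            out = np.sign(Xk @ w)
            out[out == 0] = 1.0
            err = np.mean(out != y)
            if err >= best_err:        # no improvement: discard unit, stop
                break
            units.append(w)
            best_err, prev_out = err, out
            if err == 0:
                break
        return units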
An advantage of the latter (namely, the BTPLR) is that it can be applied to
the construction of networks whose hidden neurons have activation functions
other than the threshold function, such as the window and cluster activation
functions. For window neurons, the BTPLR gives results marginally superior to
those of [3].
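The abstract does not define these activation functions; a common form of the
window activation, assumed here, fires when the net input falls inside an
interval, so a single unit responds to a slab of patterns between two parallel
hyperplanes rather than to a half-space.

    def window_activation(phi, center=0.0, radius=1.0):
        """Window (interval) activation: +1 when the net input phi lies
        within `radius` of `center`, -1 otherwise. Parameter names and
        defaults are illustrative."""
        return 1.0 if abs(phi - center) <= radius else -1.0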
We demonstrate, by simulation studies, the superiority of the proposed
algorithms (in terms of network complexity, conceptual simplicity, and ease
of implementation) over similar algorithms in the literature. Amongst the
algorithms for networks with threshold neurons, the Sequential Learning
Algorithm (SLA) with the BTPLR gives networks with the smallest number of
hidden neurons, and is faster than the SLA of [2]. The networks generated by
the Tower algorithm also give performance comparable to the SLA-based
algorithms, except when applied to the problem of random Boolean mappings.
With minor modifications, all the proposed algorithms can also be used for
problems with real-valued data and multiple classes.
Finally, based on the third model (namely, self-organization), a novel method
has been proposed for classifying patterns given in the form of 2-D binary
images. For each of the exemplar patterns, a network of neurons is
constructed, with its neurons arranged in exactly the same way as the
corresponding exemplar. Each network is mapped onto the given test pattern
using a self-organization scheme, and measures of this mapping are proposed
on the basis of which classification is carried out. The technique has been
applied to the problem of object recognition, and the results obtained show
its efficacy and robustness.
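The abstract leaves the self-organization scheme and the mapping measures
unspecified; purely as an illustration of the idea, the sketch below pulls a
copy of each exemplar's neuron grid toward the test image's ‘on’ pixels and
uses the mean residual distance as a hypothetical mapping measure. The scheme
and measures of Chapter 4 may differ.

    import numpy as np

    def map_exemplar(exemplar_pts, test_pts, iters=50, eta0=0.5):
        """Pull neurons placed at the exemplar's pixel coordinates toward
        their nearest test-image pixels with a decaying step size, then
        return the mean residual distance as a mapping measure."""
        neurons = np.asarray(exemplar_pts, dtype=float).copy()
        test = np.asarray(test_pts, dtype=float)
        for t in range(iters):
            eta = eta0 * (1.0 - t / iters)
            d = np.linalg.norm(neurons[:, None, :] - test[None, :, :], axis=2)
            neurons += eta * (test[d.argmin(axis=1)] - neurons)
        d = np.linalg.norm(neurons[:, None, :] - test[None, :, :], axis=2)
        return d.min(axis=1).mean()

    def classify(test_pts, exemplars):
        """Assign the test pattern to the exemplar (a dict of name ->
        pixel-coordinate array) with the smallest mapping measure."""
        scores = {name: map_exemplar(pts, test_pts)
                  for name, pts in exemplars.items()}
        return min(scores, key=scores.get)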
The thesis is organized as follows: Chapter 1 gives an introduction to the
three neural network models discussed in the thesis. Techniques for improving
the performance of the recursive networks for pattern association are discussed
in Chapter 2. In Chapter 3, we propose new constructive learning algorithms
for the design of feedforward neural networks for pattern classification. A new
self-organizing architecture for pattern recognition is proposed in Chapter 4. The
thesis ends with conclusions in Chapter 5.