Neural networks for invariant recognition of 2-D objects
Abstract
This thesis studies the problem of translation-, rotation- and size-invariant recognition of two-dimensional objects. The problem can be framed as follows: given (i) a set of object descriptions in terms of either global or local features of model objects, each description specifying one object independently and in isolation, and (ii) an image of a scene, locate all the objects present in the image. The objects in the scene may be translated, rotated and scaled versions of the model objects, and may be only partially visible because they are overlapped by other objects in the scene. Such invariant object recognition systems are useful in industrial automation and in many military and biomedical applications.
The goal of the thesis is to design an invariant object recognition system which, when provided with the closed contours present in an image of a scene, identifies them as objects that are (possibly transformed) instances of the stored model objects. The emphasis of this study is on the application of neural networks to this problem. The thesis proposes a new neural network-based invariant object recognition system. A further contribution of the thesis is to highlight the importance of a principled approach to the design of neural networks for object recognition.
The thesis deals with two-dimensional (2-D) objects only. For the purpose of this thesis, 2-D objects are defined to be those three-dimensional objects whose shape information can be entirely provided by their bounding contours alone. Information regarding curvature of the surfaces of the objects is not required for their recognition. Naturally, all flat or almost flat objects fall into this category (e.g., industrial parts such as pliers, cutters and spanners; keys, etc.).
The main advantages of using neural networks for object recognition include massive parallelism and graceful degradation of performance in the presence of noise. Thus, neural networks can recognize objects quickly even with a large collection of model objects, and can also provide robust performance.
This thesis proposes a two-stage network-based solution for invariant recognition of 2-D shapes. The first stage computes an invariant representation of the shape. Since only 2-D shapes are considered in this thesis, a shape can be represented by a piecewise linear approximation of the object boundary. The angles at the vertices of the polygonal approximation of a planar shape remain invariant under rotation, translation and uniform scaling, and the sequence of these internal angles therefore provides the required invariant representation of an object. The thesis presents new parallel algorithms for obtaining the polygonal approximation of a closed contour and for computing the internal angles of the polygonal approximation. The proposed algorithms are implementable on a simple neural network. The input to this stage is obtained by first applying an edge detection algorithm to the intensity image and then rejecting spurious edges to retain only the closed contours; this preprocessing uses only existing standard techniques. The proposed algorithms have been implemented and tested on synthetic and natural images.
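The invariance of the internal-angle sequence can be checked numerically. The following is a minimal serial sketch in Python, not the parallel algorithm proposed in the thesis; the function names `interior_angles` and `similarity_transform` are hypothetical, and the polygon is assumed to be simple, with vertices listed counter-clockwise.

```python
import math

def interior_angles(vertices):
    """Interior angle in degrees at each vertex of a simple polygon.

    Assumption: vertices are (x, y) pairs listed counter-clockwise.
    """
    n = len(vertices)
    angles = []
    for i in range(n):
        px, py = vertices[i - 1]          # previous vertex (wraps around)
        cx, cy = vertices[i]              # current vertex
        nx, ny = vertices[(i + 1) % n]    # next vertex (wraps around)
        ax, ay = px - cx, py - cy         # vector towards the previous vertex
        bx, by = nx - cx, ny - cy         # vector towards the next vertex
        # Counter-clockwise angle from b to a; for a counter-clockwise
        # polygon this is the interior angle, including reflex angles.
        cross = bx * ay - by * ax
        dot = bx * ax + by * ay
        angles.append(math.degrees(math.atan2(cross, dot)) % 360.0)
    return angles

def similarity_transform(vertices, scale, theta_deg, tx, ty):
    """Apply uniform scaling, then rotation, then translation."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [(scale * (c * x - s * y) + tx, scale * (s * x + c * y) + ty)
            for x, y in vertices]

# A square with a V-shaped notch in its top edge (hypothetical test shape).
notched = [(0, 0), (2, 0), (2, 2), (1, 1), (0, 2)]
moved = similarity_transform(notched, scale=3.5, theta_deg=37.0, tx=-4.0, ty=9.0)
```

Up to floating-point rounding, `interior_angles(notched)` and `interior_angles(moved)` give the same sequence, which is what makes the sequence usable as a representation that does not depend on pose or size. A rotation of the starting vertex only shifts the sequence cyclically, so a matcher has to compare sequences up to cyclic shift.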
The second stage of the network matches the computed representation of the composite object in the image against object descriptions stored in a database. A parallel matching scheme based on a new locus-specific associative memory is proposed. This system associatively recalls one of the models when provided with a noisy, corrupted or transformed version of a stored model as input. The amount of noise or corruption that the memory tolerates is controlled by a modifiable vigilance parameter. The parallel matching scheme also handles partial occlusion. The system can recognize all objects present in the input, one by one, provided that a sufficient portion of each object (as controlled by the vigilance parameter) is available at the input. Since shape matching is accomplished as an associative recall from the memory, the matching time is of constant order regardless of the size of the database of model objects. The matching algorithm proposed for this stage of the network has been implemented and tested on synthetic images.
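The role of the vigilance parameter can be illustrated with a small Python sketch. This is not the locus-specific associative memory of the thesis, whose details are not given here; it is a plain serial stand-in that scans the models one after another, so it does not have the constant matching time claimed for the associative recall. The names `longest_common_run` and `recall`, the angle tolerance `tol`, the scoring rule (longest run of consecutive matching angles divided by the model length) and the model data are all assumptions made for illustration.

```python
def longest_common_run(observed, model, tol):
    """Longest run of consecutive angles (cyclic in both sequences)
    that agree to within tol degrees."""
    no, nm = len(observed), len(model)
    limit = min(no, nm)
    best = 0
    for i in range(no):
        for j in range(nm):
            k = 0
            while (k < limit and
                   abs(observed[(i + k) % no] - model[(j + k) % nm]) <= tol):
                k += 1
            best = max(best, k)
    return best

def recall(observed, models, vigilance, tol=5.0):
    """Return the name of the best-matching model, or None when the best
    score falls below the vigilance threshold (a fraction in [0, 1])."""
    best_name, best_score = None, 0.0
    for name, angles in models.items():
        score = longest_common_run(observed, angles, tol) / len(angles)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= vigilance else None

# Hypothetical model database: interior-angle sequences in degrees.
models = {
    "wedge": [30.0, 60.0, 90.0],
    "kite": [120.0, 70.0, 100.0, 70.0],
    "notched": [90.0, 90.0, 45.0, 270.0, 45.0],
}

# Noisy, cyclically shifted view of the whole "notched" model.
noisy = [44.0, 271.0, 46.0, 91.0, 89.0]
# Only three consecutive "notched" vertices are visible; the rest is clutter.
occluded = [45.0, 270.0, 45.0, 120.0, 33.0, 150.0]
```

With these inputs, `recall(noisy, models, vigilance=0.8)` selects "notched". For the occluded input the best score is three matched vertices out of five, so a vigilance of 0.5 still accepts "notched" while a vigilance of 0.8 rejects every model. Lowering the vigilance therefore trades tolerance of occlusion against the risk of false matches.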
The thesis is organized as follows: Chapter 1 is an introduction to the invariant shape recognition problem and neural networks. Chapter 2 presents a survey of traditional shape recognition techniques and current neural network-based approaches. The proposed model for shape recognition is then presented. Chapter 3 describes a parallel algorithm for corner detection on digital curves and also a procedure to determine the angles at the corner points in parallel. A survey of associative memories and a description of the parallel matching procedure based on locus-specific associative memories appear in Chapter 4. This chapter also discusses the implementation and the performance of the recognition system. Chapter 5 concludes the thesis by summarizing the salient features of the proposed model.

