Hybrid evolutionary algorithms for supervised and unsupervised learning
Abstract
The main focus of this work is the development of hybrid evolutionary algorithms. Evolutionary Algorithms (EAs) are stochastic optimization algorithms inspired by natural evolution. EAs include Genetic Algorithms (GAs), Evolution Strategies (ES), and Evolutionary Programming (EP). EAs are capable of finding global optima due to their stochastic nature; however, their slow convergence makes them inefficient for most practical problems.
The hybrid evolutionary algorithms developed here utilize operators derived from traditional locally convergent algorithms to increase speed while retaining global convergence properties. Algorithms for both supervised and unsupervised learning are developed and their convergence analyzed.
Unsupervised Learning
The problem of partitioning a given set of data points into a specified number of clusters such that the Sum-of-Within-Cluster Variation is minimized forms an important component of unsupervised learning. This problem is addressed here.
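In symbols, the objective above can be stated as follows (notation is assumed here, not taken from the thesis: K clusters C_k with centroids mu_k):

```latex
\min_{C_1,\dots,C_K} \; \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2,
\qquad
\mu_k = \frac{1}{\lvert C_k \rvert} \sum_{x_i \in C_k} x_i
```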
Typically, K-Means algorithms used to solve this problem converge to a local optimum. Earlier approaches based on EAs and other stochastic algorithms do converge to the global optimum but are highly computation-intensive.
A hybrid genetic algorithm, named the Genetic K-Means Algorithm (GKA), has been proposed. GKA uses one step of K-Means as its main operator in place of the crossover operator. A mutation operator, called Distance-Based Mutation, is introduced and used in GKA. GKA has been proved to converge to the global optimum with probability 1. Its performance is compared with various evolutionary clustering algorithms and is shown to be faster.
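The two operators can be sketched as follows. This is an illustrative simplification, not the thesis's exact formulation; the function names and the particular mutation weighting are assumptions.

```python
import numpy as np

def kmeans_step(X, labels, k):
    """One K-Means step (GKA's replacement for crossover): recompute
    cluster centers from the current labels, then reassign each point
    to its nearest center."""
    centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else X[np.random.randint(len(X))]  # re-seed empty cluster
                        for j in range(k)])
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)

def distance_based_mutation(X, labels, k, pm=0.05, rng=None):
    """A distance-based mutation sketch: each selected point is moved to a
    random cluster, with clusters whose centers are closer being
    proportionally more likely (assumes every cluster is non-empty)."""
    rng = rng or np.random.default_rng()
    centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    new = labels.copy()
    for i in range(len(X)):
        if rng.random() < pm:
            w = d[i].max() - d[i] + 1e-9   # closer center -> larger weight
            new[i] = rng.choice(k, p=w / w.sum())
    return new
```

In a GKA-style loop, `kmeans_step` drives each chromosome toward a local optimum while the mutation keeps the population able to escape it.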
One significant application of the above clustering problem is in generating codebooks for Vector Quantization (VQ), widely used in digital signal compression. The performance of GKA has been compared with the Linde–Buzo–Gray (LBG) algorithm with respect to reconstructed signal quality. A major advantage of GKA is its convergence to the global optimum.
Two image compression techniques—block coding and discrete wavelet transform (DWT) coding—have been considered. A database of facial images is used to derive and test the codebooks obtained by GKA and LBG. Simulation studies show that the percentage improvement in Mean Square Error (MSE) is as high as 23% when GKA-derived codebooks are used instead of LBG-derived ones.
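The MSE comparison above can be reproduced in outline: given a codebook (however derived), each block is replaced by its nearest codeword and the reconstruction error is measured. A minimal sketch, with a hypothetical function name:

```python
import numpy as np

def vq_encode_decode(blocks, codebook):
    """Vector quantization of signal blocks: map each block (a row vector)
    to its nearest codeword under Euclidean distance, and return the
    reconstruction together with the resulting MSE."""
    d = np.linalg.norm(blocks[:, None, :] - codebook[None, :, :], axis=2)
    idx = d.argmin(axis=1)          # index of nearest codeword per block
    recon = codebook[idx]
    mse = np.mean((blocks - recon) ** 2)
    return recon, mse
```

Running this with a GKA-derived codebook and an LBG-derived codebook on the same test blocks yields the MSE figures being compared.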
In speech coding, several parameter sets require quantization. One such important set is excitation gains. A novel VQ method for excitation gains is proposed, using the error in reconstructed excitation rather than the Euclidean distance as the error measure. Both Generalized Lloyd’s Algorithm (GLA) and a corresponding GKA version have been developed. Results show that the performance of the resulting codec—based on Algebraic Code-Excited Linear Prediction (ACELP)—is comparable to the G.729 standard (8 kbps), which is based on the sophisticated Conjugate Structure ACELP.
Supervised Learning
Pattern classification, a supervised learning problem, is considered in this thesis. The objective is to learn, from a labeled sample set, a model capable of predicting class labels for unseen samples.
The Nearest Neighbor Classifier (NNC) is a widely used model. Although it has no training phase, it requires:
Large memory to store all training samples
High computation time per test sample, since distances to all stored samples must be computed
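Both limitations are visible in a direct implementation of the 1-nearest-neighbor rule (an illustrative sketch; the function name is hypothetical):

```python
import numpy as np

def nnc_predict(X_train, y_train, X_test):
    """1-Nearest-Neighbor Classifier: every test sample is compared against
    the entire training set, so memory grows with the number of stored
    samples and each query costs a full pass over them."""
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    return y_train[d.argmin(axis=1)]
```

Prototype-based methods replace `X_train` with a much smaller set of representative vectors, which is exactly the reduction the Voronoi Network formalizes.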
Several approaches attempt to reduce these limitations by finding a smaller set of prototype vectors for classification. The underlying model, the Voronoi Network (Vnet), discretizes the feature space into Voronoi regions and assigns each region to a class.
The complexity of learning in Vnets is analyzed before developing a hybrid learning algorithm.
Key theoretical questions addressed include:
What class of problems can the model solve efficiently?
How many training examples are sufficient for satisfactory performance?
Does the learning algorithm converge to the global optimum?
These questions are answered using the Probably Approximately Correct (PAC) framework.
The class of problems that can be efficiently solved by Vnets is characterized using the generalized fractal dimension of the set of boundary points of the preimage of the mapping. A bound on the average misclassification error is given in terms of:
number of Voronoi regions
number of training examples
accuracy and confidence parameters
A novel and fast learning algorithm, the Supervised K-Means Algorithm (SKA), has been developed. SKA computes intermediate centers as linear combinations of training examples (based on class labels) and moves the Voronoi centers toward them. SKA is shown to be much faster than LVQ3 and converges to a better solution.
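One plausible reading of this update (a simplified sketch under assumed details, not the thesis's exact algorithm): in each iteration, every training sample is assigned to its nearest Voronoi center, an intermediate target is formed from the same-class samples in each region, and the center is moved a fraction of the way toward it.

```python
import numpy as np

def ska_step(X, y, centers, center_labels, alpha=0.5):
    """One SKA-style step (hypothetical formulation): assign samples to
    Voronoi regions, build an intermediate center per region as a linear
    combination (here, the mean) of the same-class training examples in
    that region, and move the center a fraction alpha toward it."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    region = d.argmin(axis=1)
    new_centers = centers.copy()
    for j in range(len(centers)):
        mask = (region == j) & (y == center_labels[j])
        if mask.any():
            target = X[mask].mean(axis=0)   # intermediate center
            new_centers[j] += alpha * (target - new_centers[j])
    return new_centers
```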
The Evolutionary Supervised K-Means Algorithm (ESKA) uses one step of SKA as its basic operator, augmented with a mutation operator that perturbs the sets of Voronoi centers. ESKA:
retains all advantages of SKA
additionally converges to a global optimum
performs significantly better than both LVQ3 and SKA on multiple synthetic and real-world datasets
Organization of the Thesis
Since the thesis deals with both unsupervised and supervised learning, it is organized into two parts:
Chapters 2 and 3: Unsupervised learning (development and application of GKA)
Chapters 4 and 5: Supervised learning (Vnet theory, SKA, ESKA, and applications)
Chapter 6: Summary of contributions and discussion of new research directions