Show simple item record

dc.contributor.advisor: Narasimha Murty, M
dc.contributor.author: Asharaf, S
dc.date.accessioned: 2011-02-22T05:30:06Z
dc.date.accessioned: 2018-07-31T04:40:05Z
dc.date.available: 2011-02-22T05:30:06Z
dc.date.available: 2018-07-31T04:40:05Z
dc.date.issued: 2011-02-22
dc.date.submitted: 2007
dc.identifier.uri: https://etd.iisc.ac.in/handle/2005/1076
dc.description.abstract: Classification algorithms have been widely used in many application domains. Most of these domains deal with massive collections of data and hence demand classification algorithms that scale well with the size of the data sets involved. A classification algorithm is said to be scalable if there is no significant increase in its time and space requirements (without compromising generalization performance) when the training set size increases. The Support Vector Machine (SVM) is one of the most celebrated kernel-based classification methods in Machine Learning. An SVM capable of handling large scale classification problems would be an ideal candidate in many real world applications. The training process of an SVM classifier is usually formulated as a Quadratic Programming (QP) problem. Existing solution strategies for this problem have an associated time and space complexity that is (at least) quadratic in the number of training points. This makes SVM training very expensive even on classification problems having a few thousand training examples. This thesis addresses the scalability of the training algorithms involved in both two-class and multiclass Support Vector Machines. Efficient training schemes reducing the space and time requirements of the SVM training process are proposed as possible solutions. The classification schemes discussed in the thesis for handling large scale two-class classification problems are (a) two selective sampling based training schemes for scaling the non-linear SVM and (b) clustering based approaches for handling unbalanced data sets with the Core Vector Machine. To handle large scale multiclass classification problems, the thesis proposes the Multiclass Core Vector Machine (MCVM), a scalable SVM based multiclass classifier.
In MCVM, the multiclass SVM problem is shown to be equivalent to a Minimum Enclosing Ball (MEB) problem, which is then solved using a fast approximate MEB finding algorithm. Experimental studies were done with several large real world data sets, such as the IJCNN1 and Acoustic data sets from the LIBSVM page, the Extended USPS data set from the CVM page, and the network intrusion detection data sets of DARPA, US Defense, used in the KDD 99 contest. The empirical results show that the proposed classification schemes achieve good generalization performance at low time and space requirements. Further, the scalability experiments done with large training data sets demonstrate that the proposed schemes scale well. Another contribution discussed in this thesis is Rough Support Vector Clustering (RSVC), a novel soft clustering scheme employing the idea of the Soft Minimum Enclosing Ball (SMEB) problem. Experiments done with a synthetic data set and the real world IRIS data set show that RSVC finds meaningful soft cluster abstractions.
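The "fast approximate MEB finding algorithm" in the Core Vector Machine literature is the Bădoiu–Clarkson core-set iteration, which finds a (1 + ε)-approximate minimum enclosing ball in O(1/ε²) passes over the farthest point. As a rough illustration only, a minimal sketch of that iteration in plain input space might look like the following (the function name `approx_meb` is mine; the thesis works with the kernel-induced feature space and core-set bookkeeping, which this sketch omits):

```python
import numpy as np

def approx_meb(points, eps=0.05):
    """Badoiu-Clarkson sketch of a (1 + eps)-approximate
    Minimum Enclosing Ball for an (n, d) point array.
    Returns (center, radius)."""
    pts = np.asarray(points, dtype=float)
    c = pts[0].copy()  # start the center at an arbitrary point
    # O(1/eps^2) iterations suffice for a (1 + eps)-approximation
    for i in range(1, int(np.ceil(1.0 / eps ** 2)) + 1):
        # find the point farthest from the current center
        dists = np.linalg.norm(pts - c, axis=1)
        p = pts[np.argmax(dists)]
        # move the center a 1/(i + 1) fraction toward that point
        c += (p - c) / (i + 1)
    radius = np.linalg.norm(pts - c, axis=1).max()
    return c, radius
```

The appeal of this scheme for scalability is that each iteration touches the data only through a farthest-point query, and the number of iterations depends on ε rather than on the training set size.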
dc.language.iso: en_US
dc.relation.ispartofseries: G21537
dc.subject: Machine Learning
dc.subject: Automatic Classification
dc.subject: Kernel Method
dc.subject: Classification Algorithms
dc.subject: Support Vector Machine (SVM)
dc.subject: Core Vector Machine (CVM)
dc.subject: Rough Support Vector Clustering (RSVC)
dc.subject: Multiclass Core Vector Machine (MCVM)
dc.subject.classification: Computer Science
dc.title: Efficient Kernel Methods For Large Scale Classification
dc.type: Thesis
dc.degree.name: PhD
dc.degree.level: Doctoral
dc.degree.discipline: Faculty of Engineering

