Outlier Detection with Applications in Graph Data Mining

Ranga Suri, N N R

dc.contributor.advisor	Narasimha Murty, M
dc.contributor.author	Ranga Suri, N N R
dc.date.accessioned	2018-04-24T04:34:33Z
dc.date.accessioned	2018-07-31T04:39:05Z
dc.date.available	2018-04-24T04:34:33Z
dc.date.available	2018-07-31T04:39:05Z
dc.date.issued	2018-04-24
dc.date.submitted	2013
dc.identifier.uri	https://etd.iisc.ac.in/handle/2005/3447
dc.identifier.abstract	https://etd.iisc.ac.in/static/etd/abstracts/4314/G25969-Abs.pdf	en_US
dc.description.abstract	Outlier detection is an important data mining task due to its applicability in many contemporary applications such as fraud detection and anomaly detection in networks. It assumes significance due to the general perception that outliers represent evolving novel patterns in data that are critical to many discovery tasks. Extensive use of various data mining techniques in different application domains has given rise to a rapid proliferation of research work on the outlier detection problem. This has led to the development of numerous methods for detecting outliers in various problem settings. However, most of these methods deal primarily with numeric data. Therefore, the problem of outlier detection in categorical data has been considered in this work, with the aim of developing novel methods that address various research issues. Firstly, a ranking-based algorithm for detecting a likely set of outliers in a given categorical data set has been developed, employing two independent ranking schemes. Subsequently, the issue of data dimensionality has been addressed by proposing a novel unsupervised feature selection algorithm for categorical data. Similarly, the uncertainty associated with the outlier detection task has been dealt with by developing a novel rough-sets-based categorical clustering algorithm. Due to the networked nature of the data pertaining to many real-life applications, such as computer communication networks, social networks of friends, citation networks of documents, and hyper-linked networks of web pages, outlier detection (also known as anomaly detection) in the graph representation of network data turns out to be an important pattern discovery activity. Accordingly, a novel graph mining method based on the concept of community detection in graphs has been envisaged in this thesis.
In addition to finding anomalous nodes and anomalous edges, this method is capable of detecting various higher-level anomalies that are arbitrary sub-graphs of the input graph. Subsequently, these ideas have been further extended in this thesis to characterize the time-varying behavior of outliers (anomalies) in dynamic network data by defining various categories of temporal outliers (anomalies). Characterizing the behavior of such outliers during the evolution of the network over time is critical for discovering different anomalous connectivity patterns with potential adverse effects, such as intrusions into a computer network. In order to deal with temporal outlier detection in single-instance network/graph data, the link prediction task has been leveraged in this thesis to produce multiple instances of the input graph. Thus, various outlier detection principles have been successfully applied for mining various categories of temporal outliers (anomalies) in the graph representation of network data.	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	G25969	en_US
dc.subject	Data Mining	en_US
dc.subject	Graph Data Mining	en_US
dc.subject	Outlier Detection	en_US
dc.subject	Categorical Data - Outlier Detection	en_US
dc.subject	Network/Graph Data - Outlier Detection	en_US
dc.subject	Graph Data Mining - Outlier Detection	en_US
dc.subject	Outliers	en_US
dc.subject	Rough Clustering Algorithm	en_US
dc.subject.classification	Computer Science	en_US
dc.title	Outlier Detection with Applications in Graph Data Mining	en_US
dc.type	Thesis	en_US
dc.degree.name	PhD	en_US
dc.degree.level	Doctoral	en_US
dc.degree.discipline	Faculty of Engineering	en_US