dc.description.abstract | Making recommendations or predicting links which are likely to exist in the future is one
of the central problems in network science and graph mining. In spite of modern state-of-
the-art approaches for link prediction, the traditional approaches like Resource Allocation
Index, Adamic Adar still fi nd heavy use, because of their simplistic nature yet competitive
performance. Our preliminary investigation reveals that a major fraction of missing links
which are observed in the near future, are the links which are between one-hop distant
nodes. Current \friend of friend is a friend" based approaches, provide a mechanism for
aggregating the effect of triangles getting closed by inclusion of an edge. In this work we
look beyond these triangles, and hunt for higher order structures in common neighborhood.
However, with an idealistic choice of cliques as higher order structures, the problem easily
gets out of hand, even for common ego-networks involving 50-100 nodes. In wake of this,
we de ne a dense structure k-support, as an approximation to the clique. Further, for a
given k value, we propose a goodness measure to capture the contribution of the common
neighborhood w.r.t. k-support, towards the link strength. The proposed goodness measure
with different k values is then exploited in learning a supervised model.
We next take a multi-class classifi cation view to the recommendation problem. This
is suited for the cases where cardinality of the set of target objects is reasonably small.
Classi fication of entities based on the underlying network structure is an important problem
and has been studied extensively in literature. Networks encountered in practice are sparse
and have many missing and noisy links. Statistical learning techniques have been used in
intra-network classi fication; however, they typically exploit only the local neighborhood, so
may not perform well. We propose a novel structural neighborhood-based classi fier learning
using a random walk, where we label a node based on how nodes in the respective kth-level
neighborhood are labeled. We observe that random walks of short length are helpful in
classi fication, while emphasizing role of longer random walks may cause the underlying
Markov chain to converge to a stationary distribution. Considering this, we take a lazy
random walk based approach with variable termination probability for each node, based on
the node's structural properties including its degree.
Further, we observe that conventional loss functions penalize a node based on a function
of its predicted label and target label. Such loss functions under-perform while learning on
a network having overlapping classes. In relational setting, even though the ground truth
is not known for the unlabeled nodes, some evidence is present in the form of labeling acquired by the nodes in their neighborhood. Considering this, we propose a structural
loss function for learning in networks based on the hypothesis that loss is induced when a
node fails to acquire a label that is consistent with the labels of the majority of the nodes
in its neighborhood. We further combine this with a novel semantic regularizer, which we
call homophily regularizer, to capture the smooth transition of discriminatory power and
behavior of semantically similar nodes.
Hybrid recommender systems have shown the power of exploiting relationships amongst
objects which directly or indirectly effect the recommendation task. However, the effect of
all relations is not equal, and choosing their right balance for a recommendation problem
at hand is non-trivial. We model these interactions using a Heterogeneous Information
Network, and propose a systematic framework for learning their inluence weights for a
given recommendation task. We address the issue of redundant results, which is very much
prevalent in recommender systems. To alleviate redundancy in recommendations we use
Vertex Reinforced Random Walk (a non-Markovian random walk) over a heterogeneous
graph. It works by boosting the transitions to the influential nodes, while simultaneously
shrinking the weights of others. This helps in discouraging recommendation of multiple
influential nodes which lie in close proximity of each other, thus ensuring diversity. | en_US |