Recommendations in Complex Networks: Unifying Structure into Random Walk
MetadataShow full item record
Making recommendations or predicting links which are likely to exist in the future is one of the central problems in network science and graph mining. In spite of modern state-of- the-art approaches for link prediction, the traditional approaches like Resource Allocation Index, Adamic Adar still fi nd heavy use, because of their simplistic nature yet competitive performance. Our preliminary investigation reveals that a major fraction of missing links which are observed in the near future, are the links which are between one-hop distant nodes. Current \friend of friend is a friend" based approaches, provide a mechanism for aggregating the effect of triangles getting closed by inclusion of an edge. In this work we look beyond these triangles, and hunt for higher order structures in common neighborhood. However, with an idealistic choice of cliques as higher order structures, the problem easily gets out of hand, even for common ego-networks involving 50-100 nodes. In wake of this, we de ne a dense structure k-support, as an approximation to the clique. Further, for a given k value, we propose a goodness measure to capture the contribution of the common neighborhood w.r.t. k-support, towards the link strength. The proposed goodness measure with different k values is then exploited in learning a supervised model. We next take a multi-class classifi cation view to the recommendation problem. This is suited for the cases where cardinality of the set of target objects is reasonably small. Classi fication of entities based on the underlying network structure is an important problem and has been studied extensively in literature. Networks encountered in practice are sparse and have many missing and noisy links. Statistical learning techniques have been used in intra-network classi fication; however, they typically exploit only the local neighborhood, so may not perform well. We propose a novel structural neighborhood-based classi fier learning using a random walk, where we label a node based on how nodes in the respective kth-level neighborhood are labeled. We observe that random walks of short length are helpful in classi fication, while emphasizing role of longer random walks may cause the underlying Markov chain to converge to a stationary distribution. Considering this, we take a lazy random walk based approach with variable termination probability for each node, based on the node's structural properties including its degree. Further, we observe that conventional loss functions penalize a node based on a function of its predicted label and target label. Such loss functions under-perform while learning on a network having overlapping classes. In relational setting, even though the ground truth is not known for the unlabeled nodes, some evidence is present in the form of labeling acquired by the nodes in their neighborhood. Considering this, we propose a structural loss function for learning in networks based on the hypothesis that loss is induced when a node fails to acquire a label that is consistent with the labels of the majority of the nodes in its neighborhood. We further combine this with a novel semantic regularizer, which we call homophily regularizer, to capture the smooth transition of discriminatory power and behavior of semantically similar nodes. Hybrid recommender systems have shown the power of exploiting relationships amongst objects which directly or indirectly effect the recommendation task. However, the effect of all relations is not equal, and choosing their right balance for a recommendation problem at hand is non-trivial. We model these interactions using a Heterogeneous Information Network, and propose a systematic framework for learning their inluence weights for a given recommendation task. We address the issue of redundant results, which is very much prevalent in recommender systems. To alleviate redundancy in recommendations we use Vertex Reinforced Random Walk (a non-Markovian random walk) over a heterogeneous graph. It works by boosting the transitions to the influential nodes, while simultaneously shrinking the weights of others. This helps in discouraging recommendation of multiple influential nodes which lie in close proximity of each other, thus ensuring diversity.