Retweet Profiling - Study Dissemination of Twitter Messages
Abstract
Social media has become an important means of everyday communication and a mechanism for “sharing” and “resharing” information. Although social network platforms give users the means to reshare or reblog content (on Twitter, to retweet), it remains unclear what motivates users to share. Predicting the spread of content matters for several purposes, such as viral marketing, popular news detection, personalized message recommendation, and online advertising. Social content systems store all the information produced in the interactions between users. However, to turn this data into information from which patterns can be extracted, it is important to consider the different phenomena involved in these interactions. In this work, we study two phenomena that influence the evolution of networks on Twitter: diffusion of information and communication among users.
Previous studies have shown that the history of interaction among users and the properties of the message are good attributes for understanding users’ retweet behavior. Factors such as message content and time have been less investigated. We propose a prediction model for users’ retweet actions. It formulates a function that ranks users according to how receptive they are to a particular message. The function generates a confidence score for each edge joining the initiator of the message to a follower. Two different pieces of information propagate through different users in the network. We therefore divide the task of calculating the confidence score into two parts. The first part is independent of the test tweet: it models the transmission rate of the tie between the initiator and the follower. We call this ‘Pairwise Influence Estimation’. The second part incorporates the tweet properties and the user’s time-dependent activity into the ranking function. The proposed model exploits all the dimensions of the information diffusion process: influence, content, and temporal properties. It captures local aspects of diffusion.
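The two-part confidence score described above can be illustrated with a minimal sketch. It is not the model proposed in this work: the field names, the smoothed retweet rate used as the pairwise influence, and the choice to combine the two parts by multiplication are all assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Follower:
    # All fields are hypothetical stand-ins for the attributes named in the text.
    name: str
    retweets_of_initiator: int        # past retweets of the initiator's tweets
    tweets_seen_from_initiator: int   # past tweets of the initiator shown to this follower
    topic_affinity: dict              # topic -> interest weight in [0, 1]
    hourly_activity: list             # 24 values: share of activity per hour of day


def pairwise_influence(f: Follower) -> float:
    """Part 1 (tweet-independent): assumed here to be a smoothed retweet rate."""
    return (f.retweets_of_initiator + 1) / (f.tweets_seen_from_initiator + 2)


def tweet_factor(f: Follower, topic: str, hour: int) -> float:
    """Part 2 (tweet-dependent): content match times activity at posting hour."""
    return f.topic_affinity.get(topic, 0.0) * f.hourly_activity[hour]


def confidence(f: Follower, topic: str, hour: int) -> float:
    """Score for the initiator-follower edge; the product is an assumption."""
    return pairwise_influence(f) * tweet_factor(f, topic, hour)


def rank_followers(followers, topic: str, hour: int):
    """Order followers from most to least receptive to this tweet."""
    return sorted(followers, key=lambda f: confidence(f, topic, hour), reverse=True)
```

In this sketch the first factor can be computed once per initiator-follower pair, while the second must be recomputed for every new tweet, which mirrors the split between the tweet-independent and tweet-dependent parts.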
It has been observed that users do not read all the messages they receive, which is a shortcoming of the above models. Considering this, we first study the temporal behavior of users’ activities, which directly reflects their availability for an incoming post. Moreover, because predicting retweet behavior is a continuous task, we design a user-centric, temporally localized incremental classification model that accounts for the fact that users do not read all their tweets. We tested the effectiveness of this model on real data from Twitter. We demonstrate that the proposed model describes information propagation in microblogs more accurately than existing methods. The model works well when we consider different classes of users defined by their activity patterns. In addition, we investigate the parameters of the model for these classes of users and report some interesting distinguishing patterns in users’ retweeting behavior.
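One way to picture a user-centric, temporally localized incremental classifier is sketched below. This is a hypothetical illustration rather than the model evaluated in this work: it assumes one online logistic regression per user, and it assumes a tweet counts as "seen" only if the user was active within a fixed window after it was posted. The class name, the `likely_seen` helper, and the 600-second window are all invented for the example.

```python
import math


class IncrementalRetweetClassifier:
    """Per-user online logistic regression (an assumed choice of learner)."""

    def __init__(self, n_features: int, lr: float = 0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x) -> float:
        """Estimated probability that the user retweets a tweet with features x."""
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, retweeted: bool) -> None:
        """One stochastic gradient step on a single newly observed tweet."""
        err = self.predict_proba(x) - (1.0 if retweeted else 0.0)
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def likely_seen(tweet_time: float, activity_times, window: float = 600.0) -> bool:
    """Assumed proxy for 'read': user was active within `window` seconds after the tweet."""
    return any(0 <= t - tweet_time <= window for t in activity_times)


def train_on_stream(model, stream, activity_times):
    """Update only on tweets the user plausibly saw; skip the rest."""
    for tweet_time, features, retweeted in stream:
        if likely_seen(tweet_time, activity_times):
            model.update(features, retweeted)
    return model
```

Filtering the stream before each update is what keeps unread tweets from being counted as negative examples, which is the concern raised in the paragraph above.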