Social networks represent the links between a set of entities connected to each other with different types of relationships. For example, papers are linked by citations in a citation network, bloggers are linked by comments or blogrolls in a blog network, while cell phones are connected via phone calls in a cell phone network.
In the literature, social networks have been extensively studied from a graph theory perspective (e.g., power laws, small world phenomenon, coverage, etc.). Also, properties of different types of complex networks have been compared.
Recently, research on social networks from a behavioral perspective has received considerable attention. This work, which addresses problems such as community identification, spam detection, and modeling of information flows, has many applications in recommender systems, social search, economics, and advertising.
A fundamental issue in analyzing information flow or propagation patterns within communication-oriented social networks is how to represent the communication data so that it captures every piece of useful information. In the literature, a few alternatives have been proposed and used to model the interactions among people. Typically, each user is represented as a node in the graph and each interaction as an edge; alternatively, the interactions between a pair of users are aggregated into a single edge whose weight reflects them.
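The two representations described above can be sketched as follows. This is a minimal illustration only: the call records, the user names, and the helper names `edge_list` and `weighted_edges` are hypothetical, and the count of interactions is assumed as the edge weight, although other weights (e.g., total duration) are equally possible.

```python
from collections import Counter

# Hypothetical call records (caller, callee); names are illustrative only.
calls = [("alice", "bob"), ("alice", "bob"), ("bob", "carol"), ("alice", "bob")]

def edge_list(records):
    """One directed edge per interaction, i.e., a multigraph."""
    return [(src, dst) for src, dst in records]

def weighted_edges(records):
    """Aggregate repeated interactions into one edge weighted by their count."""
    return Counter((src, dst) for src, dst in records)

multigraph = edge_list(calls)    # keeps every individual call
weighted = weighted_edges(calls) # keeps one entry per ordered pair of users
```

Note that the aggregated form discards the order and timing of the individual calls, and the plain edge list keeps order but still carries no timestamps.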
These representations are meaningful and valid in certain social networks, such as friendship or citation networks, where the nature of the relationship is embedded in, or may be easily derived from, the records. However, in the case of social networks derived from communication logs (e.g., phone calls), it is difficult to properly infer the nature of the relationships because of the multiplicity of reasons for making a call (e.g., business, personal, service, etc.) and the role that the temporal context plays in the communication. In other words, once one paper cites another, the relationship between the two papers always holds true.
However, phone calls are made for different reasons, and hence the nature of a relationship between two nodes in the network may also depend on the temporal context of the calls; for example, a call made during working hours is probably of a different nature than a call made at night. The same applies to other temporal attributes, such as the duration and frequency of the interaction, or the temporal distance between two calls (inter-call time delay). As a result, the representations that are used in existing information propagation studies are not valid in the context of phone communications. Furthermore, many studies on information propagation assume that consecutive interactions transmit the same piece of information within the inferred networks, which is not necessarily true in phone communications.
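As a rough illustration of the temporal attributes mentioned above, the sketch below derives a working-hours flag and the inter-call time delays from a small call log. The log entries, the function names, and the definition of working hours (Monday to Friday, 09:00 to 17:00) are assumptions made for the example, not definitions taken from the literature.

```python
from datetime import datetime

# Hypothetical call log: (caller, callee, start time, duration in seconds).
calls = [
    ("alice", "bob", datetime(2009, 3, 2, 10, 15), 120),
    ("alice", "bob", datetime(2009, 3, 2, 22, 40), 900),
    ("alice", "bob", datetime(2009, 3, 3, 9, 5), 45),
]

def in_working_hours(ts, start_hour=9, end_hour=17):
    """Assumed definition: Monday to Friday, from 09:00 up to (not including) 17:00."""
    return ts.weekday() < 5 and start_hour <= ts.hour < end_hour

def inter_call_delays(records):
    """Seconds between the start times of consecutive calls, in chronological order."""
    starts = sorted(r[2] for r in records)
    return [(b - a).total_seconds() for a, b in zip(starts, starts[1:])]

flags = [in_working_hours(c[2]) for c in calls]
delays = inter_call_delays(calls)
```

A static weighted graph would store these three calls as a single edge of weight 3, losing both the flags and the delays.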
The recent availability of large amounts of data from a variety of networks (e.g., online social networks, social media and user generated content networks, proteins, etc.) has enabled the analysis of information propagation in such networks. Research work has been done to analyze cell phone, instant messaging, blog, Flickr, email, and protein interaction networks.
More recently, the dynamic properties of large-scale social networks have been studied extensively: the temporal annotations of each communication are used to partition the entire dataset into a time series of snapshots, which is then studied in terms of its temporal patterns. Along these lines, research has been carried out on information cascades triggered by specific events. Other related work has addressed influence maximization within social networks, using historical behavior patterns and probabilistic models, to solve the viral marketing problem of maximizing influence over the network.
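The snapshot partitioning described above can be sketched as follows, assuming that each record carries an integer timestamp in seconds and that snapshots are fixed-width, non-overlapping windows. The function name `snapshots` and the `window` parameter are hypothetical; the studies referred to may use different windowing schemes.

```python
from collections import defaultdict

def snapshots(records, window):
    """Split (src, dst, timestamp) records into consecutive windows of `window` seconds.

    Returns a list of edge lists, one per window, starting at the earliest
    timestamp. Windows with no communication are kept as empty lists.
    """
    if not records:
        return []
    t0 = min(t for _, _, t in records)
    buckets = defaultdict(list)
    for src, dst, t in records:
        buckets[(t - t0) // window].append((src, dst))
    return [buckets[i] for i in range(max(buckets) + 1)]

# Hypothetical records with timestamps in seconds.
series = snapshots([("a", "b", 0), ("b", "c", 30), ("a", "c", 130)], window=60)
```

Each element of `series` can then be analyzed as a static graph, and the sequence as a whole as a time series.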
Kossinets, Kleinberg, and Watts, in “The structure of information pathways in a social communication network” (In KDD, 2008), propose the concept of temporal distance to find the information pathways in the network, called the backbone structure, along which information has the highest probability of flowing based on temporal communication habits.
The concept of a motif originated in biology, where motifs have been defined as patterns that recur within a (transcription regulation) network much more often than expected at random. Within transcription regulation networks, research has shown experimentally that these network motifs are the building blocks of the network and play functional roles such as auto-regulation, pulse generation, and response acceleration.
Individual users may make and receive a lot of phone calls. Traditional approaches create a link between adjacent calls that share at least one user, which may not be accurate enough to reflect how users communicate collectively and how information is propagated over the network. For example, two adjacent calls that are about two different pieces of information should not be used to create a path of information propagation. Note that identifying the piece of information that is propagated in each phone call, or in any other social interaction, is still an open problem, since the associated content is usually either unavailable or too privacy-sensitive to be made public.
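The traditional linking rule criticized above can be sketched as follows. Here "adjacent" is assumed to mean consecutive in the chronological call sequence of a shared user; the source does not fix a precise definition, and the function name `link_adjacent_calls` is hypothetical.

```python
from collections import defaultdict

def link_adjacent_calls(calls):
    """Link each call to the next call that involves one of the same users.

    `calls` is a list of (caller, callee, timestamp) tuples. Returns the
    time-sorted calls and a set of (earlier_index, later_index) links into
    that sorted list.
    """
    ordered = sorted(calls, key=lambda c: c[2])
    per_user = defaultdict(list)
    for idx, (caller, callee, _) in enumerate(ordered):
        per_user[caller].append(idx)
        per_user[callee].append(idx)
    links = set()
    for indices in per_user.values():
        # Consecutive calls of the same user become a presumed propagation step.
        links.update((a, b) for a, b in zip(indices, indices[1:]) if a != b)
    return ordered, links

# Hypothetical calls with integer timestamps.
ordered, links = link_adjacent_calls(
    [("a", "b", 1), ("b", "c", 2), ("c", "d", 3), ("a", "b", 4)]
)
```

The rule links calls purely on shared participants and order, so two calls about unrelated topics are chained just as readily as two calls that actually relay the same information.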
The studies mentioned above have mainly focused on analyzing the structural or topological properties of different types of networks and building models that explain the data. However, there has been no work to date that combines the global topological properties of the network with local behavior patterns in order to shed light on the core principles of collective structural patterns. In addition, previous approaches have typically ignored the temporal attributes and strength of each individual communication.