This disclosure relates to a process for deriving a composite tie metric for an edge between nodes of a telecommunication call graph based on multiple attributes of the edge. For example, the disclosure describes an exemplary embodiment of a method for deriving a composite tie metric for the edge that takes into account values of the multiple attributes for the edge, distribution of values for the multiple attributes of edges in the telecommunication call graph, conversion of the multiple attributes to a common scale, and weighting the multiple attributes in relation to sensitivity of the composite tie metric to the relative contribution of each attribute. The disclosure also describes an exemplary embodiment of an apparatus for deriving a composite tie metric for the edge based on historical data records for communications via a telecommunication network associated with the telecommunication call graph. Various embodiments of the methods and apparatus described herein may be used in conjunction with providing advice to a service provider regarding churn prediction. However, the methods and apparatus described herein may also be used for other purposes, such as prediction of telecommunication service adoption, targeted advertisement, targeted marketing, anomaly detection, and other uses that can benefit from prediction of user behavior.
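The steps named above (attribute values, their distribution across edges, conversion to a common scale, and weighting) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the disclosure: the attribute names (`calls`, `minutes`, `offpeak_calls`), the three sample edges, the use of an empirical percentile rank as the common scale, and the equal weights. The function names `percentile_scale` and `composite_tie_metric` are likewise hypothetical.

```python
from bisect import bisect_right


def percentile_scale(values):
    """Map each value to its empirical percentile rank in (0, 1].

    Assumption: the percentile rank over all edges serves as the common scale.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in values]


def composite_tie_metric(edges, weights):
    """Return a weighted average of the scaled attributes for every edge.

    edges:   dict mapping an edge (pair of node ids) to {attribute: raw value}
    weights: dict mapping an attribute name to its (hypothetical) weight
    """
    names = list(edges)
    total_weight = sum(weights.values())
    scaled = {}
    for attr in weights:
        # Scale each attribute against its distribution over all edges.
        column = [edges[e][attr] for e in names]
        scaled[attr] = dict(zip(names, percentile_scale(column)))
    return {
        e: sum(weights[a] * scaled[a][e] for a in weights) / total_weight
        for e in names
    }


# Hypothetical sample data: edge A-B has the most calls, but edge A-C has
# far more talk time and more off-peak (evening and weekend) calls.
edges = {
    ("A", "B"): {"calls": 40, "minutes": 30, "offpeak_calls": 2},
    ("A", "C"): {"calls": 25, "minutes": 300, "offpeak_calls": 20},
    ("B", "C"): {"calls": 5, "minutes": 10, "offpeak_calls": 1},
}
weights = {"calls": 1.0, "minutes": 1.0, "offpeak_calls": 1.0}  # assumed equal

scores = composite_tie_metric(edges, weights)
for edge, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(edge, round(score, 3))
```

In this sketch, edge A-C ranks above edge A-B even though A-B carries more calls, because the other two attributes also contribute to the score. A weighting that reflects the sensitivity of the metric to each attribute, as the disclosure describes, would replace the equal weights assumed here.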
Social Network Analysis (SNA) is a powerful approach used to better understand the behaviors and relationships of users. SNA is traditionally applied in the context of online social networks (OSNs) such as Facebook, Flickr, and Twitter, where users can directly establish ties, share information, and join groups to connect to users with similar interests. In these networks, SNA operates over attributes that directly imply a social connection between users. For example, the fact that two users are friends on an OSN, that they belong to the same groups, or that they share information with each other can each be used individually to infer that a social tie exists.
Such OSNs contain causal information, that is, data attributes which imply the existence of a social tie. There exist other social networks, however, where only the effects of a social tie are observable. Each effect, taken alone, does not directly suggest social tie strength. Mobile call graphs are an example of such a social network. In a mobile call graph, the effects of a strong social tie may include a large number of calls placed, a long time spent talking, and many calls during weekend and evening hours. By itself, however, none of these attributes directly implies tie strength. For example, a user may call a bank to check balances and pay bills more often than the user calls a friend, even though friendship is a stronger social tie.
Mobile call graphs represent the way in which a large number of users communicate with each other, and these patterns of communication are related to the social ties between people. Thus, studies that apply SNA to mobile call graphs are rising in popularity. Such studies, however, pick only a single feature of the calls made between two users to define a social relationship. As a result, the conclusions drawn by these studies are based on only a single effect of a possible social relationship. In order to make observations about a call graph that more faithfully reflect the social relationship between users, an improved measure of tie strength is needed.
For additional information on studies that apply SNA to mobile call graphs, see, for example: i) Dasgupta et al., Social Ties and their Relevance to Churn in Mobile Telecom Networks, Proceedings of 11th ACM International Conference on Extending Database Technology, Mar. 25-30, 2008, pp. 668-677; ii) Onnela et al., Structure and tie strengths in mobile communication networks, Proceedings of the National Academy of Sciences of the United States of America, vol. 104, no. 18, May 1, 2007, pp. 7332-7336; iii) Richter et al., Predicting customer churn in mobile networks through analysis of social groups, Proceedings of SIAM International Conference on Data Mining, Apr. 29-May 1, 2010, pp. 732-741; iv) Seshadri et al., Mobile Call Graphs: Beyond Power-Law and Lognormal Distributions, Proceedings of 14th ACM Conference on Knowledge Discovery and Data Mining, Aug. 24-27, 2008, pp. 596-604; and v) Nanavati et al., On the Structural Properties of Massive Telecom Call Graphs: Findings and Implications, Proceedings of 15th ACM Conference on Information and Knowledge Management, Nov. 5-11, 2006, pp. 435-444. The contents of these five documents are fully incorporated herein by reference.
Existing solutions to the problem of calculating social tie strength are applicable to online social networks (OSNs) where causal information exists, that is, where data attributes exist that imply the existence of a social tie. For example, the fact that two users are friends on an OSN, that they belong to the same groups, or that they share information with each other can each be used individually to infer that a social tie exists. For phone networks, however, only the effects of a social tie are observed. Each effect, taken alone, does not directly suggest tie strength. In a mobile call graph, for example, the effects of a strong social tie may include a large number of calls placed, a long time spent talking, and many calls during weekend and evening hours. By itself, however, none of these attributes directly implies tie strength. For example, a user may call a bank to check balances and pay bills more often than the user calls a friend, even though friendship is a stronger social tie.
There are a number of studies that apply SNA to mobile call graphs. Such studies, however, pick only a single feature of the calls made between two users to define a social relationship. As a result, the conclusions drawn by these studies are based on only a single effect of a possible social relationship. In order to make observations about a call graph that more faithfully reflect the social relationship between users, an improved measure of tie strength is needed.
Detecting anomalous behavior on mobile call graphs has several advantages. Links where usage drops significantly can be early indicators of nodes that are likely to churn in the future. Conversely, significantly increased activity can indicate the formation of new ties and likely links for influence propagation. Anomaly detection across call graphs also has applications for law enforcement agencies. Overseas calling, calls placed during unusual hours of the day, increased calling activity between certain nodes, and even the underuse or overuse of a service (associated with ‘throw-away’ phones) may be detected as anomalous signals that law enforcement agencies utilize when investigating a case. Use of the composite tie variation metric facilitates anomaly detection by incorporating abnormal activity across many attributes into a single metric.
Anomaly detection is also applicable to monitoring traffic on a link in a telecommunications network. A telecommunications network comprises network nodes (i.e., telecom equipment) and links that connect the different network nodes and transport traffic. In order to obtain the best performance from the network, it is beneficial to keep the links at a certain level of activity and to be able to detect when the links deviate from their normal expected behavior. For example, an overloaded link may degrade performance and bring down the network, while an under-utilized link represents a lost revenue opportunity. Detection of anomalous behavior of links enables an operator to take action and rectify the situation. Accurate anomaly detection requires taking into account multiple factors of link performance, not just a single measurement.
For these and other reasons, there is a need to define a composite metric representative of edges between nodes of a telecommunication call graph based on multiple characteristics of the edges.