Communication networks are widely used across many industries and sections of society. Such networks may include, for example, telecommunications networks, social media networks, office networks, academic networks and community networks. The use of communication networks is growing, with continual expansion of customer bases driving business growth in this sector.
A competitive business environment ensures that network operators and network service providers are under continuing and significant pressure to improve network performance and increase profit margins. The extraction of information from communication networks is an important factor in enabling active management of network resources in order to meet these pressures. A wide range of information is collected and analysed by network operators and service providers in order to assist in managing the network and in making management decisions concerning the network as a whole and individual users within the network. The scale of some commercial communication networks means that a large number of these decisions must be automated, and many such decisions are made using machine learning models.
The identification of particular customers is a key task often accomplished using machine learning models. For example, with appropriate training data a machine learning model may be trained to identify customers at risk of leaving the network or ceasing to be a customer of the service (churn prediction), customers likely to be most responsive to a particular advertising campaign (targeted advertising) or content (targeted content delivery) or customers most likely to take up particular services (targeted service recommendation).
Many machine learning models currently employed use the similarity between a pair of examples in a dataset as an underlying measure, for example as part of a clustering process or as part of a classification process. In the telecommunications domain, such examples may be network users, and the dataset may be the entire network or a particular subset of the network in question. The “k-means” clustering and “k-nearest neighbour” classifier are examples of clustering and classification algorithms respectively, which use similarity between two examples as a key computational measure.
Computation of similarity between users in a network is thus a key process facilitating the taking of network management decisions through a wide range of machine learning models. Similarity between two network users can be established based on various different factors, and the computation of similarity between users is often specific to a particular application domain. In many application domains, global network features are the key user attributes, and it is most efficient to represent user data numerically, as a vector of attribute values known as a feature vector. Feature vectors can be extracted from network data and any suitable distance function can be used to compute the similarity between two users as represented by their feature vectors. Examples of such distance functions, or metrics, include the Euclidean distance function and the Taxicab distance function.
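The two distance functions named above can be sketched as follows. This is a minimal illustration only: the function names, the example vectors and the mapping from distance to similarity (1 / (1 + d)) are assumptions chosen for the example and are not taken from any particular system.

```python
import math


def euclidean_distance(u, v):
    """Straight-line (L2) distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def taxicab_distance(u, v):
    """Taxicab (L1) distance: sum of absolute attribute differences."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(u, v))


def distance_to_similarity(distance):
    """Map a distance in [0, inf) to a similarity in (0, 1].

    Assumption for illustration: many other mappings are possible.
    """
    return 1.0 / (1.0 + distance)


# Hypothetical two-attribute feature vectors for users x and y.
user_x = [3.0, 4.0]
user_y = [0.0, 0.0]

d_euclid = euclidean_distance(user_x, user_y)
d_taxi = taxicab_distance(user_x, user_y)
s_euclid = distance_to_similarity(d_euclid)
```

In practice the attributes would usually be scaled to comparable ranges first, since otherwise an attribute with a large numeric range dominates either distance.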
In other application domains, including for example social network analysis, local interaction data is of greater importance in establishing similarity between users. In such applications, data is represented as a graph, in which each node represents a user and the links or edges between the nodes represent the relationships between the users. Various methods exist for computing the similarity between users from such graph-structured relationship data. One example is the random walk based method, which can be used to measure the similarity between two nodes in a graph, and hence between the two users represented by those nodes.
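One common variant of a random walk based measure is the random walk with restart, sketched below. The choice of this variant, the restart probability, the iteration count and the toy graph are all assumptions made for illustration; other random walk formulations exist. The score that a walk starting at one node assigns to another node is taken as the similarity between the two.

```python
def random_walk_with_restart(graph, source, restart=0.15, iterations=100):
    """Approximate visiting probabilities of a walk that restarts at `source`.

    `graph` maps each node to a list of neighbour nodes; every neighbour
    must itself be a key of `graph`. Returns a dict of node -> score.
    """
    scores = {node: 0.0 for node in graph}
    scores[source] = 1.0
    for _ in range(iterations):
        updated = {node: 0.0 for node in graph}
        for node, mass in scores.items():
            neighbours = graph[node]
            if not neighbours:
                # A node with no edges sends its mass back to the source.
                updated[source] += (1.0 - restart) * mass
                continue
            share = (1.0 - restart) * mass / len(neighbours)
            for neighbour in neighbours:
                updated[neighbour] += share
        # Total mass is 1, so the restart step returns `restart` to the source.
        updated[source] += restart
        scores = updated
    return scores


# Hypothetical undirected path graph: a - b - c - d.
toy_graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
}

scores_from_a = random_walk_with_restart(toy_graph, "a")
```

On this toy graph, nodes closer to the starting node receive higher scores than nodes further away, which is the behaviour expected of a similarity measure derived from graph proximity.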
In the telecommunications domain, either of the above discussed approaches to calculating similarity can be employed. In a first approach, user feature vectors can be derived from call and/or customer relations management data and an appropriate distance metric can be applied to calculate similarity between users. In a second approach, the social interaction between various users within the network can be extracted from call records data and represented on a call graph, in which each node represents a user and each edge between nodes represents interaction between users. An appropriate analysis technique can then be employed to calculate similarity between users.
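As a rough illustration of the second approach, a call graph can be assembled from call records as sketched below. The record format (caller, callee pairs), the use of call counts as edge weights and the treatment of calls as undirected are assumptions made for this example; real call records carry further fields such as duration and timestamps.

```python
from collections import defaultdict


def build_call_graph(call_records):
    """Build an undirected, weighted call graph from (caller, callee) pairs.

    Returns a dict mapping each user to a dict of neighbour -> call count.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for caller, callee in call_records:
        if caller == callee:
            continue  # ignore self-calls, which carry no interaction data
        graph[caller][callee] += 1
        graph[callee][caller] += 1
    return {user: dict(neighbours) for user, neighbours in graph.items()}


# Hypothetical call records.
records = [("alice", "bob"), ("bob", "alice"), ("bob", "carol")]
call_graph = build_call_graph(records)
```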
In many practical cases, it is desirable to use both of the above representations to compute similarity between customers. Considering for example the case of churn prediction in a telecommunications network, user spending and top up behaviour would be important attributes on which to base a similarity calculation. A feature vector representing spending and top up behaviour can therefore be extracted from the appropriate call and customer relations data for each user and a suitable distance metric used to calculate similarity between users. However, studies have indicated that Quality of Service (QoS) is also a key factor in user churn behaviour. QoS perception is highly individual and subjective, but users having a high degree of psychological similarity may be expected to have similar QoS requirements, and hence similar churn behaviour. It would therefore be useful to calculate user similarity based on psychological factors such as perception, personality, behaviour, attitude, values etc. Such psychological similarity is very difficult to model explicitly, but it can be approximated through calculation of social similarity. It has been shown that in general terms, a higher social similarity between two persons is associated with a higher psychological similarity. Social similarity between two network users can be computed by representing user social interaction on a call graph and calculating similarity using graph analysis methods as discussed above.
Churn prediction is merely one of many examples in which both feature vector and call graph representation would provide useful insight into the computation of similarity between users.
One way of computing user similarity through both feature vector representation and call graph representation is to compute similarity using both representations separately and then take a weighted average of the two similarity values. Considering the case of users x and y, similarity between x and y based on feature vector representation Sfv(x, y) and similarity between x and y based on call graph representation Sgr(x, y), the combined value of similarity S for users x and y is given by the following formula:

S(x, y) = α * Sfv(x, y) + (1 − α) * Sgr(x, y)

where α is a weighting factor having any value in the interval (0, 1).
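The weighted average can be written directly as a small function. The function and argument names are illustrative, and the example similarity values are made up; the two input similarities are assumed to be on comparable scales.

```python
def combined_similarity(s_fv, s_gr, alpha):
    """Weighted average of feature vector and call graph similarity.

    `alpha` weights the feature vector similarity and (1 - alpha) weights
    the call graph similarity; it must lie in the open interval (0, 1).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in the interval (0, 1)")
    return alpha * s_fv + (1.0 - alpha) * s_gr


# Hypothetical similarity values for a pair of users x and y.
s = combined_similarity(s_fv=0.8, s_gr=0.4, alpha=0.25)
```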
While the above described method is relatively simple, establishing the correct value for the weighting factor α is more complicated. Taking the above discussed example of churn prediction, it is not immediately evident to a network operator or service provider whether more weight should be given to the feature vector based similarity (relating to spending and top up behaviour) or the call graph based similarity (relating to social interaction). One quantitative approach to establishing a weighting factor comprises:
    1) For various possible values of α:
        a) Compute similarity using the value of α;
        b) Train a machine learning model (e.g. churn prediction) using the computed similarity;
        c) Test the performance of the trained model on a test data set.
    2) Select the value of α resulting in the machine learning model which had the best performance on the test data set.
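The steps above can be sketched as a simple search over candidate values of α. Everything concrete here is an assumption for illustration: the model is a one-nearest-neighbour classifier (for which "training" amounts to storing the labelled users), performance is measured as accuracy, and the similarity tables and labels are invented toy data.

```python
def combined_similarity(s_fv, s_gr, alpha):
    """Weighted average of the two similarity values."""
    return alpha * s_fv + (1.0 - alpha) * s_gr


def predict_nearest(user, train_labels, s_fv, s_gr, alpha):
    """Predict the label of the most similar training user (1-NN)."""
    nearest = max(
        train_labels,
        key=lambda t: combined_similarity(s_fv[user][t], s_gr[user][t], alpha),
    )
    return train_labels[nearest]


def select_alpha(candidates, train_labels, test_labels, s_fv, s_gr):
    """Return the candidate alpha with the best test accuracy.

    Ties are resolved in favour of the earliest candidate.
    """
    best_alpha, best_accuracy = None, -1.0
    for alpha in candidates:
        correct = sum(
            predict_nearest(user, train_labels, s_fv, s_gr, alpha) == label
            for user, label in test_labels.items()
        )
        accuracy = correct / len(test_labels)
        if accuracy > best_accuracy:
            best_alpha, best_accuracy = alpha, accuracy
    return best_alpha, best_accuracy


# Toy data: similarities from each test user to each training user.
train_labels = {"t1": "churn", "t2": "stay"}
test_labels = {"u1": "churn", "u2": "stay"}
s_fv = {"u1": {"t1": 0.2, "t2": 0.9}, "u2": {"t1": 0.3, "t2": 0.8}}
s_gr = {"u1": {"t1": 0.9, "t2": 0.1}, "u2": {"t1": 0.4, "t2": 0.6}}

best_alpha, best_accuracy = select_alpha(
    [0.1, 0.5, 0.9], train_labels, test_labels, s_fv, s_gr
)
```

Each candidate value requires a full pass of similarity computation, model fitting and evaluation, so the cost of the search grows linearly with the number of candidates on top of the cost of a single training run.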
This approach to determining α is reasonable for domains in which the size of the data set does not exceed a certain limit. However, in the telecommunications domain, where a single network operator or service provider may have hundreds of millions of users, the scale of the data set is measured in terabytes, and the above trial-and-error approach, which involves training and testing multiple models in order to arrive at a value for α, is simply not feasible.