As communication networks grow, so does the need to correctly identify their influential members or users. Finding influential users in various communication networks, including social networks, telecommunication networks, social media, office networks, academic networks, community networks, etc., is an important task. In particular, the list of influential users identified in a telecommunication network may be used by the relevant network service provider to target advertisement or service campaigns, and could potentially have a viral effect on the adoption of the provider's value-added services. Each operator might define influential users differently: for instance, an influential user may be a person who has been with the network for a long time, someone who uses all the advanced services of the operator, or someone who is well connected to many others in the network.
Typically, a single network measure, such as an authority, centrality or prestige measure, is used to capture the influence of a user. Examples of such measures include eigenvector centrality, degree centrality and closeness centrality. However, a single network measure may not be predictive in nature, i.e., it may not precisely identify whether a user is influential, or how influential the user is (say, very high, high, medium, low or very low). Also, a single network measure may not be sufficient to capture the influence of a user in a given network accurately.
For example, in the telecommunication domain, degree centrality captures the number of calls or SMS (Short Message Service) messages a user makes, all of which may go to a set of people who call among themselves, say a calling group. Betweenness centrality, by contrast, is based on how many shortest paths between other users pass through the concerned user/node, i.e., on whether the user's connectivity facilitates information diffusion throughout the network. Other factors in the telecom domain may include call usage count, number of refills made, amount of money spent on calls, SMS or data traffic, etc.
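The contrast between the two measures can be illustrated with a minimal sketch. The call graph below is hypothetical (two calling groups joined through a single user "e"), the function names are illustrative only, and the betweenness routine is a plain implementation of Brandes' algorithm for undirected, unweighted graphs rather than any particular operator's method.

```python
from collections import deque

def build_adjacency(edges):
    """Build an undirected adjacency map from (user, user) contact pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_centrality(adj):
    """Fraction of the other users each user is directly connected to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def betweenness_centrality(adj):
    """Unnormalized betweenness (Brandes' algorithm, unweighted graph)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical data: calling groups {a,b,c,d} and {f,g,h,i}, bridged by e.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("f", "g"), ("f", "h"), ("f", "i"), ("g", "h"), ("g", "i"), ("h", "i"),
         ("d", "e"), ("e", "f")]
adj = build_adjacency(edges)
deg = degree_centrality(adj)
btw = betweenness_centrality(adj)

for user in sorted(adj):
    print(user, round(deg[user], 3), btw[user])
```

In this toy graph, user "b" has more direct contacts than user "e", yet no shortest path between other users passes through "b", whereas every path between the two calling groups passes through "e". A ranking by degree alone and a ranking by betweenness alone would therefore disagree about which of the two is more influential.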
The definition of information diffusion, or of the influence of a user, depends on the context of the network. For example, influence in a Twitter dataset is based on retweets, number of followers, etc. In citation graphs, information diffusion would be based on citations, number of downloads and other extrinsic measures that can be captured. The definition of ‘influential users’ hence also varies with each end-user or requirement. For example, for a telecommunication operator, the influential users would vary with the objective of the particular ad campaign or promotion activity for which they are identified.
Another method to identify influential users is to build a predictive model that combines different centrality/social network influence measures, known as a supervised learning based method. The supervised learning based method uses labeled data to learn the weight of each measure in advance, and then uses the learned weights for combining scores or aggregating ranks on test data. Although the supervised learning based method is effective, obtaining sufficient labeled data (supervision) for domains that consist of large-scale data is practically infeasible. Further, some domains naturally have a “cold-start” problem: for example, in the telecommunication domain, there is no single definition of who the influential users are, and there is no labeled data from which to learn the weights. Thus, it is not possible to build a supervised learning based prediction model that takes various user measures (both network and attribute based) as input and gives ‘influential or not’ as output. Further, unsupervised methods that either use a single measure or combine measures by maximizing consensus between models might lead to spurious results.
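The score-combination step described above can be sketched as follows. This is only an illustration under stated assumptions: the user scores, the measure names and both weight vectors are hypothetical, and the weights are supplied by hand here, whereas a supervised method would learn them from labeled data.

```python
def min_max(scores):
    """Rescale one measure to the range [0, 1] so measures are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {u: (s - lo) / span if span else 0.0 for u, s in scores.items()}

def combine(measures, weights):
    """Weighted sum of the normalized measures for every user."""
    normalized = {name: min_max(scores) for name, scores in measures.items()}
    users = next(iter(measures.values())).keys()
    return {u: sum(weights[name] * normalized[name][u] for name in measures)
            for u in users}

def rank(scores):
    """Users ordered from most to least influential under the given scores."""
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical per-user measures (network based and attribute based).
measures = {
    "degree":      {"u1": 10, "u2": 4,  "u3": 2},
    "betweenness": {"u1": 0,  "u2": 16, "u3": 4},
    "spend":       {"u1": 50, "u2": 20, "u3": 80},
}

# Two hypothetical weight vectors; neither is learned from labeled data.
weights_a = {"degree": 0.2, "betweenness": 0.5, "spend": 0.3}
weights_b = {"degree": 0.8, "betweenness": 0.1, "spend": 0.1}

ranking_a = rank(combine(measures, weights_a))
ranking_b = rank(combine(measures, weights_b))
print(ranking_a)
print(ranking_b)
```

The two weight vectors produce different rankings of the same three users, which illustrates why the choice of weights matters and why, absent labeled data, there is no principled way to select them.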
Hence, there is a well-felt need for an effective method and system for identifying influential users in a communication network.