The Internet has given rise to two disparate phenomena: the increase in the presence of social networks, with their corresponding member profiles visible to large numbers of people, and the increase in online business transactions. These online business transactions include not just consumer purchases (business to consumer), but also purchases by businesses (business-to-business). With regard to business-to-business transactions, performing accurate predictions of events involving those companies can be challenging. These problems are known as company-level prediction problems. For example, a first business may be a repeat customer of a second business, but it can be difficult to predict if the first business may elect to stop purchasing from the second business (known as “chum”). This is despite the fact that these company-level prediction problems are often binary in nature (i.e., the result of the prediction is either a yes or a no).
Traditionally company-level prediction problems are solved using machine-learning techniques. Traditional classifiers may be applied to a decision tree using company-level features. The problem with this approach is that the result is hugely dependent on the quantities of the training data. If a prediction is attempted for a million businesses, training on data of only a thousand businesses we know have churned (and can be labeled as “churners”) leads to inaccurate predictions. Thus, the machine learning techniques are not scalable, at least insofar as the training data also is not scalable. Additionally, the features used in such techniques are independent to the company being examined. This ignores relationships to other companies and relationships between employees.