1. Field of the Invention
The present invention generally relates to techniques for statistical relational learning, and more particularly to techniques for making relational classifications on a single connected network.
2. Background Description
Given the prevalence of large connected relational graphs across diverse domains, single-network (or within-network) classification has been a popular area of statistical relational learning (SRL) research. From social networking websites to movie databases to citation networks, large connected relational graphs are commonplace. In single-network classification, we are given a partially labeled data graph, and the goal is to extend this labeling, as accurately as possible, to the unlabeled nodes. The nodes themselves may or may not have associated attributes. One example where within-network classification could be useful is in forming common interest groups on social networking websites. For instance, a group of people in the same geographic area may be interested in playing soccer, and they would want to find more people who are likely to share that interest. In a different domain, such as entertainment, one might be interested in estimating which new movies are likely to make a splash at the box office. Based on the success of other movies that had some of the same actors and/or the same director, one could provide a reasonable estimate of which movies are most likely to be successful.
Many methods that learn and infer over a data graph have been developed in the SRL literature. Some of the more effective methods perform collective classification; that is, besides using the attributes of the unlabeled node to infer its label, they also use the attributes and labels of related nodes/entities. These methods are thus a generalization of methods that assume the data are independent and identically distributed (i.i.d.). Examples of such methods are relational Markov networks (RMNs), relational dependency networks (RDNs), Markov logic networks (MLNs), and probabilistic relational models (PRMs). These all fall under the umbrella of Markov networks. Simpler models have also been suggested as baselines, ranging from relational neighbor (RN) classifiers, which simply assign a node the most numerous class label among its neighbors, to more involved variants such as those using relaxation labeling. Interestingly, these simple models perform quite well when the auto-correlation is high, even though the graph may be sparsely labeled. Recently, a pseudo-likelihood expectation maximization (PL-EM) method was introduced, which seems to compare favorably with other methods when a moderate fraction (around 20-30%) of the nodes in the graph are labeled.
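The relational neighbor baseline described above can be illustrated with a minimal sketch. It assumes an undirected graph stored as an adjacency dictionary and a single pass over the unlabeled nodes; the function name, the example graph, and the label values are hypothetical and are not taken from any particular RN implementation.

```python
from collections import Counter


def relational_neighbor_predict(adjacency, labels):
    """Give each unlabeled node the most numerous label among its labeled neighbors.

    adjacency: dict mapping a node to a list of its neighbors (hypothetical layout).
    labels: dict mapping the labeled nodes to their class labels.
    Nodes with no labeled neighbor are mapped to None.
    """
    predictions = {}
    for node, neighbors in adjacency.items():
        if node in labels:
            continue  # already labeled, nothing to predict
        votes = Counter(labels[n] for n in neighbors if n in labels)
        predictions[node] = votes.most_common(1)[0][0] if votes else None
    return predictions


# Illustrative graph: "a" has two "soccer" neighbors and one "chess" neighbor.
graph = {
    "a": ["b", "c", "d"],
    "b": ["a"],
    "c": ["a"],
    "d": ["a", "e"],
    "e": ["d"],
}
known = {"b": "soccer", "c": "soccer", "d": "chess"}
predicted = relational_neighbor_predict(graph, known)
```

In this sketch, node "a" receives the majority label of its neighbors and node "e" inherits the label of its only neighbor. Tie-breaking and iterative variants such as relaxation labeling are deliberately left out.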
A different class of methods that could potentially address the problem at hand is graph transduction methods, which belong to the family of semi-supervised learning methods and are, in some sense, the i.i.d. counterpart of relational methods. These methods typically perform well when we are given a weighted graph and linked nodes mostly share the same labels (unless a priori dissimilar nodes are explicitly specified), even if only a small fraction of the labels are known. If a weighted graph is not readily available, it is constructed from the (explanatory) attributes of the nodes. If an unweighted graph with no attributes is given, then the adjacency matrix is passed as input.
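As a rough illustration of graph transduction on an unweighted graph, the following sketch applies one common scheme, iterative label propagation with the known labels clamped, directly to an adjacency matrix. The text above does not name a specific transduction method, so the choice of scheme, the binary label encoding, the function name, and the iteration count are all assumptions made for illustration.

```python
def propagate_labels(adjacency, known, num_iters=100):
    """Iterative label propagation for binary labels on an unweighted graph.

    adjacency: square 0/1 matrix given as a list of lists.
    known: dict mapping a node index to its label, 0.0 or 1.0.
    Returns one score per node in [0, 1]; scores above 0.5 lean toward label 1.
    """
    n = len(adjacency)
    # Unlabeled nodes start at the neutral score 0.5.
    scores = [known.get(i, 0.5) for i in range(n)]
    for _ in range(num_iters):
        updated = []
        for i in range(n):
            if i in known:
                updated.append(float(known[i]))  # clamp the known labels
                continue
            neighbors = [j for j in range(n) if adjacency[i][j]]
            if neighbors:
                # An unlabeled node takes the mean score of its neighbors.
                updated.append(sum(scores[j] for j in neighbors) / len(neighbors))
            else:
                updated.append(scores[i])  # isolated node keeps its score
        scores = updated
    return scores


# Path graph 0-1-2-3-4 with only the two end nodes labeled.
chain = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
scores = propagate_labels(chain, {0: 1.0, 4: 0.0})
```

On this path graph the scores of the interior nodes decrease steadily from the node labeled 1 toward the node labeled 0, which reflects the assumption that linked nodes tend to share labels. Every edge carries the same weight here, since only the adjacency matrix is available.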
In relational learning, the graphs are typically unweighted and sometimes may not have attributes. In many cases, the attributes may not accurately predict the labels, and weighting the edges solely on the attributes may then not provide acceptable results. The links could be viewed as an additional source of information for determining labels among connected nodes. Thus, the weights should also be functions of the known labeling. Some of these intuitions are captured in the relational Gaussian process model, but that model is limited to undirected graphs, and the suggested kernel function is not easy to adapt to relational settings where we may have heterogeneous data.