The present invention relates to the field of nodal networks, and more specifically to suggesting a relationship for a node pair based upon shared connections versus total connections.
Many situations exist within computer science in which data can be represented as a set of nodes that are linked to other nodes through a definable relationship. Often this relationship is shown visually using a graph, where edges connecting nodes represent the relationship. These relationships can be unidirectional or bidirectional.
Often, it can be desirable to infer the existence of new direct associations between two nodes that otherwise lack a direct connection. In other words, this disclosure concerns the link prediction problem of inferring direct linkages between nodes using features intrinsic to a nodal network. An ability to effectively predict or infer new direct links would have broad applicability in many data-centric fields. Applications could include, for example, suggesting new links among Web pages, determining new interactions in protein-protein networks, ascertaining new citations for scientific or academic articles, automatically detecting existing work related to a reference set of scientific projects, inferring new direct connections (i.e., introducing a pair of previous strangers) in a social networking context, and the like. Another embodiment of this disclosure measures the strength of the relationship between nodes that are already connected.
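For illustration only, the link prediction problem described above can be sketched in a few lines of Python. The sketch assumes one plausible reading of "shared connections versus total connections": the number of neighbors two nodes have in common, divided by the number of distinct neighbors of either node. The function names, the threshold value, and the sample graph are hypothetical and are not taken from this disclosure.

```python
from itertools import combinations

def shared_connection_score(graph, a, b):
    """Ratio of shared neighbors to total distinct neighbors of a and b.

    Assumed scoring rule (a Jaccard-style ratio); the pair itself is
    excluded from both neighbor sets.
    """
    neighbors_a = graph[a] - {b}
    neighbors_b = graph[b] - {a}
    total = neighbors_a | neighbors_b
    if not total:
        return 0.0
    return len(neighbors_a & neighbors_b) / len(total)

def suggest_links(graph, threshold=0.5):
    """Return (a, b, score) for unconnected pairs scoring at or above threshold."""
    suggestions = []
    for a, b in combinations(sorted(graph), 2):
        if b in graph[a]:
            continue  # already directly connected; nothing to suggest
        score = shared_connection_score(graph, a, b)
        if score >= threshold:
            suggestions.append((a, b, score))
    return sorted(suggestions, key=lambda s: -s[2])

# Hypothetical undirected network stored as adjacency sets.
graph = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "carol", "dave"},
    "carol": {"alice", "bob", "dave"},
    "dave": {"bob", "carol"},
}

print(suggest_links(graph))
print(shared_connection_score(graph, "alice", "bob"))
```

In this sample network, "alice" and "dave" are the only pair without a direct link, and every neighbor of one is also a neighbor of the other, so that pair is suggested. The same score can be computed for a pair that is already connected, which corresponds to measuring the strength of an existing relationship.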
One known technique for inferring new relationships uses metadata about a node, such as textual information or background information, to build new associations. For instance, in a social networking context, a user can be required or encouraged to provide additional information, such as hobbies, hometown, attended schools, occupation, organizations in which the individual is involved, etc. This additional information is matched against the information provided by others, and the matches are used to suggest a set of people with whom the user may wish to interact (e.g., send an invitation to view a personal space, converse via email, etc.). In an electronic document comparison context (photo tourism, for example), tags included within documents can be used to determine a set of related documents sharing a common set of tags. Metadata, however, is often unavailable or untrustworthy, which causes metadata-based relationship suggestion techniques to fail.
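The metadata-based technique described above can be sketched as a simple overlap test, shown here only to make its limitation concrete. The profile format, the attribute strings, the function name, and the `min_shared` parameter are all hypothetical; actual systems of this kind may match metadata quite differently.

```python
def suggest_by_metadata(profiles, user, min_shared=2):
    """Suggest other users who share at least min_shared attributes with user."""
    mine = profiles[user]
    matches = []
    for other, attrs in profiles.items():
        if other == user:
            continue
        shared = mine & attrs
        if len(shared) >= min_shared:
            matches.append((other, sorted(shared)))
    # Most shared attributes first, then alphabetical by name.
    return sorted(matches, key=lambda m: (-len(m[1]), m[0]))

# Hypothetical user-supplied metadata; "dave" provided none.
profiles = {
    "alice": {"hobby:chess", "hometown:austin", "school:rice"},
    "bob": {"hobby:chess", "hometown:austin", "school:mit"},
    "carol": {"hobby:golf", "hometown:boston"},
    "dave": set(),
}

print(suggest_by_metadata(profiles, "alice"))
print(suggest_by_metadata(profiles, "dave"))
```

Because "dave" supplied no metadata, no suggestion can be produced for that user, regardless of how that user is connected within the network. This is the failure mode noted above.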