One or more specific embodiments of the present invention will be described below. In an effort to provide a concise description of these embodiments, not all features of an actual implementation are described in the specification. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it should be appreciated that such a development effort might be complex and time consuming, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure.
A community may be any cluster or group of nodes within a network or graph wherein the nodes are more connected to one another than to a different set of nodes within the network or graph. Further, a network or graph may be a structure such as a complex gene network, a social network, a business organization, interlinked data, or a computer network. More generally, a network or graph may be defined as any group of nodes interconnected by edges, wherein an edge may be a line representing a commonality between two or more nodes, such as a communication or a shared characteristic. For example, a network or graph may be an informal social network wherein nodes are individual persons connected by communication patterns and wherein smaller communities are embedded within the larger network. In another example, a network or graph may be an organization wherein the nodes are individuals within the organization who are linked together by e-mail communications.
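By way of non-limiting illustration, the definition above may be sketched in code. The following Python example is an assumption made for clarity and is not part of any described embodiment: it represents a small, hypothetical e-mail network as an adjacency map and counts the edges inside a candidate group of nodes and the edges leaving that group. The names `build_graph`, `edge_counts`, and `EMAILS`, and the individuals listed, are hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def edge_counts(adjacency, group):
    """Return (edges inside `group`, edges crossing its boundary)."""
    group = set(group)
    internal = 0
    external = 0
    for node in group:
        for neighbor in adjacency[node]:
            if neighbor in group:
                internal += 1  # each internal edge is seen from both endpoints
            else:
                external += 1
    return internal // 2, external


# Hypothetical e-mail network: two tightly linked groups joined by one link.
EMAILS = [
    ("ann", "bob"), ("ann", "cat"), ("bob", "cat"),
    ("dan", "eve"), ("dan", "fay"), ("eve", "fay"),
    ("cat", "dan"),
]

graph = build_graph(EMAILS)
internal, external = edge_counts(graph, {"ann", "bob", "cat"})
print(internal, external)  # prints: 3 1
```

In this sketch the group of three individuals has three internal edges and only one edge to the rest of the network, which is consistent with the notion of nodes being more connected to one another than to a different set of nodes.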
Information regarding these network/graph embedded communities may be extracted using techniques for defining and studying networks or graphs of linked nodes. Specifically, these techniques may provide the ability to define communities within the network and may even indicate certain node characteristics (e.g., determining which individual person in an organization is a group leader). In general, a community may be defined as a cluster of entities with commonalities forming a unit within a larger unit. Identifying communities, however, may be hampered because relationships between nodes may be difficult to identify in a large or complex network. As a result, it may take a relatively long time to identify and uncover the membership of communities in such a network.
Existing methods for discovering communities rely on algorithms that do not scale well with the size of the network or graph containing the communities. For example, with some methods, finding communities may require an amount of time on the order of the fourth power of the number of nodes in a network or graph. Thus, the existing methods may become very slow when operating on large networks or graphs, some of which may even have an undefined structure that is essentially infinite. While there are some heuristics whose running times are linear in the size of the graph or network, they may not allow for discovering the community around a single node without solving the problem for the entire graph or network.
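The practical effect of such scaling may be illustrated with simple arithmetic. The following Python sketch is offered only as an illustration under stated assumptions: it ignores constant factors and treats the step count as exactly the node count raised to a power, which no particular method is asserted to do. The function name `estimated_steps` and the node counts chosen are hypothetical.

```python
def estimated_steps(num_nodes, exponent):
    """Rough step count for a method whose cost grows as num_nodes ** exponent.

    Constant factors are ignored; this is an order-of-magnitude assumption only.
    """
    return num_nodes ** exponent


# Compare fourth-power growth with linear growth for a few network sizes.
for n in (100, 1_000, 10_000):
    quartic = estimated_steps(n, 4)
    linear = estimated_steps(n, 1)
    print(f"{n:>6} nodes: fourth-power steps = {quartic:.1e}, linear steps = {linear}")
```

Under these assumptions, a tenfold increase in the number of nodes multiplies the fourth-power estimate by ten thousand while multiplying the linear estimate by only ten.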