The invention relates to a method and system for classifying, ranking and relating information based on mathematical graphs and networks. In general, the topology of a network, composed of nodes and links between them, can be studied as a graph.
A graph G consists of a finite set V of vertices (or nodes) and a set E of edges (or links), where each edge is an unordered pair of vertices; that is, if e∈E, then e=(u, v) where u, v∈V.
Two vertices u, v∈V are neighbors, or adjacent, if there exists an edge e∈E that links them. The number of vertices in a graph is known as the order of the graph, |G|.
The degree of a vertex is the number of edges incident on it, and we define P(k) as the probability that a vertex has degree k. A regular graph of degree n is one in which every vertex is incident with exactly n edges.
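As an illustration only, the definitions above can be sketched in Python. The function names (`degrees`, `degree_distribution`, `is_regular`) and the four-vertex example graph are hypothetical and are not part of the described system; P(k) is estimated here as the fraction of vertices having degree k.

```python
from collections import Counter


def degrees(vertices, edges):
    """Map each vertex to its degree (the number of edges incident on it)."""
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def degree_distribution(deg):
    """Estimate P(k) as the fraction of vertices whose degree equals k."""
    counts = Counter(deg.values())
    n = len(deg)
    return {k: c / n for k, c in counts.items()}


def is_regular(deg):
    """A graph is regular when every vertex has the same degree."""
    return len(set(deg.values())) <= 1


# Hypothetical example: a cycle on four vertices, a regular graph of degree 2.
V = {"a", "b", "c", "d"}
E = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]

order = len(V)                 # the order |G| of the graph
deg = degrees(V, E)            # every vertex has degree 2
P = degree_distribution(deg)   # all probability mass at k = 2
```

In this sketch each edge is stored as a tuple, but the order of the two vertices within the tuple is ignored, which matches the unordered-pair definition of an edge.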
A path (a walk in which no vertex is repeated) between two vertices x0 and xn is a non-empty graph ρ=(V,E) such that
V={x0, x1, . . . , xn}
E={x0x1, x1x2, . . . , xn-1xn}
where xi≠xj for all i≠j, with i, j∈{0, . . . , n}.
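The distance between two vertices of the graph is defined as the length of the shortest path between them, and the average distance over all pairs of vertices is referred to here as the diameter of the graph (in standard graph-theoretic usage, the diameter is the maximum such distance).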
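By way of a non-limiting sketch, the distance between vertices of an unweighted graph can be computed with a breadth-first search. The helper names and the example graph below are hypothetical, and the graph is assumed to be connected. The sketch reports both the average and the maximum pairwise distance, since either quantity may be called the diameter depending on the convention adopted.

```python
from collections import deque
from itertools import combinations


def adjacency(vertices, edges):
    """Build an adjacency map for an undirected graph."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def bfs_distances(adj, source):
    """Return the shortest-path length from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def pairwise_distances(adj):
    """List the distance of every unordered vertex pair (assumes a connected graph)."""
    all_dist = {v: bfs_distances(adj, v) for v in adj}
    return [all_dist[u][v] for u, v in combinations(sorted(adj), 2)]


# Hypothetical example: the chain a - b - c - d.
V = ["a", "b", "c", "d"]
E = [("a", "b"), ("b", "c"), ("c", "d")]
adj = adjacency(V, E)

d = pairwise_distances(adj)
average_distance = sum(d) / len(d)   # mean over all vertex pairs
max_distance = max(d)                # greatest distance between any pair
```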
A directed graph, or digraph, is one in which the direction of each edge is significant; that is, each edge connects an initial vertex to a final vertex.
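A minimal sketch of this distinction, assuming each directed edge is stored as an ordered pair (initial vertex, final vertex). The terms in-degree and out-degree are standard graph-theory terms rather than terms taken from the text above, and the function name and example data are hypothetical.

```python
def in_out_degrees(vertices, arcs):
    """Count, for each vertex, the directed edges entering and leaving it."""
    indeg = {v: 0 for v in vertices}
    outdeg = {v: 0 for v in vertices}
    for initial, final in arcs:
        outdeg[initial] += 1   # the edge leaves its initial vertex
        indeg[final] += 1      # the edge enters its final vertex
    return indeg, outdeg


# Hypothetical example: a -> b, a -> c, b -> c.
V = ["a", "b", "c"]
A = [("a", "b"), ("a", "c"), ("b", "c")]
indeg, outdeg = in_out_degrees(V, A)

# Direction matters: (a, b) is an edge of this digraph, but (b, a) is not.
has_ab = ("a", "b") in A
has_ba = ("b", "a") in A
```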
Known network-based ranking systems are generally based on the topological structure of the network and are generally static. For example, one such system uses a link analysis algorithm that assigns a numerical weight to each element of a hyperlinked set of documents, interpreting each incoming link to a document as a vote for that document and defining a static weight measure for every document, stored in a large matrix. One major disadvantage of a static system is that each time the network changes, one generally needs to re-explore the network and recalculate all the weights. This requires expensive computing power and incurs delays caused, for example, by crawler systems re-exploring the network. It is therefore generally difficult to obtain a dynamic and individual rank measure between any elements of the network.
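For illustration, a static link-analysis ranking of this general kind might be sketched as a simple power iteration. This is an assumption-laden sketch and not the algorithm of any particular known system: the function name `link_rank`, the damping value of 0.85, the fixed iteration count, and the treatment of documents without outgoing links are all choices made for the example.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Assign each document a static weight from the links pointing to it.

    links maps each document to the list of documents it links to.
    The damping value and iteration count are illustrative assumptions.
    """
    docs = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(docs)
    rank = {d: 1.0 / n for d in docs}
    for _ in range(iterations):
        new = {d: (1.0 - damping) / n for d in docs}
        for d in docs:
            targets = links.get(d, [])
            if targets:
                # Each outgoing link passes on an equal share of the weight ("vote").
                share = damping * rank[d] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Assumption: a document with no outgoing links spreads its weight evenly.
                share = damping * rank[d] / n
                for t in docs:
                    new[t] += share
        rank = new
    return rank


# Hypothetical three-document network: A and B both link to C, and C links to A.
links = {"A": ["C"], "B": ["C"], "C": ["A"]}
ranks = link_rank(links)
```

In this sketch, C receives the most incoming links and ends with the largest weight. The example also shows the drawback discussed above: if a single link is added or removed, the whole computation over all documents has to be run again to obtain the updated weights.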
A self-organizing map (SOM), also known as a Kohonen map, is a type of artificial neural network; the general idea is to create a pattern recognition system that uses competitive learning in a training step. When a training sample is given to the network, its Euclidean distance to all weight vectors is computed. Here, a weight vector is a representation of the neural network links with their associated weight measures. The neuron whose weight vector has the smallest distance to the input is called the Best Matching Unit (BMU). While the SOM method generally works with neural network models, it does not work well with graphs in general. An SOM-based system is typically useful for training model systems but tends to be difficult to adapt to applications on real graphs.
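The selection of the Best Matching Unit can be sketched as follows. The function names, the learning rate of 0.5, and the example weight vectors are hypothetical. The update step is a simplified assumption about the training step: it moves only the BMU toward the sample and omits the neighborhood update that a full SOM implementation would normally include.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_matching_unit(weights, sample):
    """Return the index of the neuron whose weight vector is closest to the sample."""
    return min(range(len(weights)), key=lambda i: euclidean(weights[i], sample))


def update_bmu(weights, sample, bmu, learning_rate=0.5):
    """Simplified training step: move only the BMU's weight vector toward the sample."""
    weights[bmu] = [w + learning_rate * (s - w) for w, s in zip(weights[bmu], sample)]


# Hypothetical map with three neurons and two-dimensional weight vectors.
weights = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
sample = [0.9, 0.8]

bmu = best_matching_unit(weights, sample)   # neuron 1 is the closest
update_bmu(weights, sample, bmu)            # its weights move halfway toward the sample
```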
It is an object of the present invention to mitigate or obviate at least one of the above-mentioned disadvantages and to provide an improved system and method for classifying, qualifying and relating information.