1. Field of the Invention
The present invention relates generally to methods for analyzing relational systems where nodes have local interactions or links, and in particular to methods for analyzing linked databases.
2. Description of Related Art
Current search engines use indexing of terms as their foundation, on the presumption that the semantics of documents and of a user's information need can be naturally expressed through sets of index terms. The index is generated as an inverted index mapping terms to documents. This approach has many fundamental problems. People tend to use different wordings to describe a concept, and sometimes employ a whole paragraph or document to describe a new concept. Even with improved semantic processing, such as checking synonyms and variations of a keyword, indexing usually fails to extract relevant content associated with a non-simple or non-trivial concept. Documents retrieved in response to a user request expressed as a set of keywords are frequently irrelevant because users often do not know how to properly form queries as Boolean expressions, or cannot generate a suitable set of keywords describing the knowledge or concept being sought. Thus, predicting which documents are relevant, and which are not, is a central problem of information retrieval systems.
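The inverted index described above can be sketched as follows; the miniature document collection, whitespace tokenizer, and intersection-based query are illustrative simplifications and not part of any particular search engine:

```python
from collections import defaultdict

# Hypothetical miniature document collection (document id -> text).
documents = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "quick brown dogs and foxes",
}

# Build an inverted index mapping each term to the set of
# documents containing that term.
inverted_index = defaultdict(set)
for doc_id, text in documents.items():
    for term in text.lower().split():
        inverted_index[term].add(doc_id)

def search(*terms):
    """Answer a keyword query by intersecting the posting sets
    of the requested terms."""
    postings = [inverted_index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()
```

Note that the index matches only exact terms: a document describing the same concept in different words (e.g., "foxes" versus "fox") is not retrieved, which is the limitation discussed above.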
Existing search engines generally use keyword-based categorization, indexing, matching, presenting, navigating, ranking, and managing of a plurality of documents. Ranking algorithms generally establish a simple ordering of the documents retrieved in response to a query; documents appearing at the top of this ordering are considered more likely to be relevant. Different global and local ranking schemes have been used to establish the likelihood that documents such as web pages are relevant to a user's needs. These methods include variations of the PageRank algorithm, which establishes a global ranking based on links to and from web pages: it simulates random browsing of the Internet and estimates the likelihood that a user, by random navigation, will arrive at a particular page. While such methods can provide a global score of potential user viewing behavior, they do not take into account the cognitive aspect of the content and knowledge associated with web pages. Kleinberg's hub and authority method (HITS), including its variations, has been used to attribute local importance to pages. However, HITS-type approaches require an initial query result against which the relevance of related pages in the World Wide Web or in document databases can be measured. One of the primary drawbacks of this method is that it has to be carried out in real time; i.e., after a query has been submitted and a set of results obtained, the algorithm attempts to crawl the neighborhood of these results in real time to find hubs and important pages. Moreover, these methods do not detect cases where a node has exerted “undue influence” on the computation of hub scores, and documents in a community hub, i.e., the relevant documents, are not ranked.
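The random-surfer model underlying the PageRank family of algorithms can be illustrated with a minimal power-iteration sketch; the four-page link graph and the damping factor of 0.85 are illustrative assumptions, not taken from any deployed system:

```python
# Minimal power-iteration sketch of the PageRank random-surfer model.
# Hypothetical link graph: page -> pages it links to.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
damping = 0.85  # probability the surfer follows a link vs. jumping randomly
pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}

for _ in range(50):  # iterate until the ranks stabilize
    new_rank = {p: (1 - damping) / len(pages) for p in pages}
    for page, outlinks in links.items():
        share = damping * rank[page] / len(outlinks)
        for target in outlinks:
            new_rank[target] += share
    rank = new_rank
```

In this sketch, pages with many inbound links (such as "C") accumulate high rank while unlinked pages (such as "D") do not; the score is global and query-independent, which is why it cannot reflect the cognitive context of a particular user's need, as noted above.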
The most popular method of context creation is manual grouping of relevant content, which can be manifested in directories or other manually maintained link and content listings. However, such listings can quickly become obsolete due to the dynamic nature of the World Wide Web, and the scale of the data can render manual editing impracticable or inefficient.
Current search engine and information retrieval systems allow users to personalize and share their keyword searches within their online social/professional networks, and also enable users to add tags and additional information to search results. Often these results are summarized as single URLs or via access to a manually generated list of relevant links. This approach has many fundamental problems, namely that knowledge is generally a collection of related documents and content, not a single document. Moreover, the same content can appear in multiple knowledge bases with different relevancies, and a knowledge base may comprise a dynamic set of documents; i.e., a document may lose its importance to a knowledge context over time while new documents become more relevant. Thus, static lists of documents provide an inefficient means for sharing information. People would prefer to share knowledge rather than keywords or single documents.
In conventional systems, most documents publicly available through the World Wide Web are accessible without security constraints on sharing. Since knowledge can be characterized as a collection of documents relevant to a topic or concept, the documents and content in such a collection become information. Thus, when a specific set of documents is put to use in a specific context (i.e., knowledge in context), that knowledge may become sensitive and require the implementation of access security constraints.
Current Internet search engines use keyword-based advertisement subscriptions, auctioning, and click-through cost, where monetization is centered on the auctioning of keywords; advertisements for a relevant keyword are ranked and displayed according to associated bid amounts. However, the highest-bid advertisements are frequently not the most relevant to a user's intended interests, because keywords generally cannot properly represent the context of a user's intentions, nor can they represent the products and services provided by an advertiser. As an example, a user entering a query including the word “virus” may be seeking information on a type of biological virus (such as a flu virus) but will almost exclusively obtain advertisements from computer virus protection companies. In another example, a user looking for RFID will only get RFID hardware-related advertisements, since hardware vendors are the high bidders for the RFID keyword, although most users may be looking for RFID integrators and software solutions. Such keyword-based advertisements result in lower click-through rates and lower-quality advertisements and, consequently, advertisers may pay for click-throughs without receiving the benefit of corresponding sales.
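The bid-ranked advertisement placement described above can be sketched as follows; the advertisers, keyword, and bid amounts are hypothetical, and the point of the sketch is that ordering depends on bid alone, not on the user's intent:

```python
# Hypothetical keyword-auction table: keyword -> list of (advertiser, bid in dollars).
bids = {
    "virus": [
        ("AntivirusCo", 2.50),
        ("FluClinic", 0.40),
        ("SecuritySoft", 1.90),
    ],
}

def ranked_ads(keyword):
    """Return advertisements for a keyword ordered by bid amount alone,
    ignoring the context of the user's actual query intent."""
    return sorted(bids.get(keyword, []), key=lambda ad: ad[1], reverse=True)
```

Because the ordering considers only bid amount, a user seeking information on a biological virus still sees the highest-bidding computer-security advertisers first, illustrating the mismatch between keyword auctions and user intent discussed above.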