Information Retrieval (IR) is concerned with locating desired elements of information within a large corpus. A search engine is one example of an IR system that enables documents (usually, but not necessarily, text) to be retrieved from a large corpus on the basis of their degree of relevance to a compact query presented by a user. The order in which documents are retrieved or presented is the ranking created by the search engine: the documents ranked highest with respect to the query are returned or presented first. Search engine ranking may be affected by both query-dependent and query-independent criteria. Query-dependent criteria generally attempt to identify the degree to which a document is semantically related to the query. An example is the degree of correspondence between the word frequency distributions of the query and of the document. Query-independent criteria often seek to identify the degree to which a document is generally “good”, e.g., authoritative, intelligible, and neither fraudulent nor deceptive. An example of a query-independent criterion is the score computed by the PageRank algorithm, or by similar algorithms that examine the link structure of a corpus of documents.
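As an illustration of a query-dependent criterion, the following minimal sketch ranks documents by how closely their word frequency distributions match that of the query. The use of cosine similarity, the whitespace tokenization, and the function names (term_frequencies, cosine_similarity, rank_documents) are assumptions made for this example only; practical search engines typically use more elaborate measures.

```python
import math
from collections import Counter


def term_frequencies(text):
    """Count lowercase, whitespace-separated tokens (a deliberately naive tokenizer)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query, documents):
    """Return document indices ordered from most to least similar to the query."""
    query_tf = term_frequencies(query)
    scored = [
        (cosine_similarity(query_tf, term_frequencies(doc)), index)
        for index, doc in enumerate(documents)
    ]
    # Highest similarity first; ties are broken by original position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored]
```

Under these assumptions, a document that shares no terms with the query receives a score of zero and is ranked last.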
As mentioned above, query-independent criteria can provide a way of measuring the authoritativeness of a specific information source. For example, the more information sources that point to a specific information source, the higher the search rating score that information source receives, and the more authoritative it is judged to be. In some instances, the search rating algorithm is recursive, meaning that an information source's search rating score is based not only on the number of information sources that reference it, but also on the search rating scores of those referencing information sources. In other words, the search rating score of an information source is based on both the number and the quality of the referencing information sources.
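The recursive scoring described above can be sketched as a simple iterative computation over a link graph. This is a simplified, PageRank-style illustration and not any particular search engine's implementation; the function name link_scores, the damping value of 0.85, the fixed iteration count, and the even redistribution of score from sources that reference nothing are all assumptions of this example.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Score sources by the number and the scores of the sources referencing them.

    links maps each source to the list of sources it references.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)

    # Start from a uniform score for every source.
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every source gets a small baseline score regardless of inbound links.
        updated = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A source passes its score, in equal shares, to what it references.
                share = damping * scores[node] / len(targets)
                for target in targets:
                    updated[target] += share
            else:
                # Assumption: a source with no outbound references spreads
                # its score evenly over all sources.
                share = damping * scores[node] / n
                for target in nodes:
                    updated[target] += share
        scores = updated
    return scores
```

In a small graph where "a", "b", and "d" all reference "c", and "c" references "a", the source "c" scores highest because of its number of referrers, while "a" outscores "b" and "d" because its single referrer is itself highly scored.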
For some information sources, all of the content is under the control of a single agent. In such cases, the reputation of the agent can be directly correlated with the content of the information source. In other cases, however, control may be delegated among several agents, each controlling a partition of the information source. To the extent that these partitions can be identified, agent reputation can be calculated at the partition level.
In general, however, it is difficult to correlate the content of an information source with the agents responsible for creating or publishing that content. For example, an individual author may contribute content to multiple information sources, content within a single information source may originate from multiple agents, or ownership and control of information sources may change over time. As another example, a single web page can contain content controlled by multiple agents, such as advertisements that appear alongside news articles.