Prior to the Internet, the vast bulk of informational and entertainment media consumed by ordinary people was produced by professionals: journalists, musicians, authors, reporters, actors, editors and the like. The high cost and limited capacity of these media encouraged a minimum level of quality by dissuading those who could not recover their costs through royalties or advertising.
The World Wide Web has made publication of nearly any form of text, audio, or visual media cheap and easy for anyone who can afford a computer and basic Internet connectivity. The result is an enormous volume of content of variable quality, within which it is often difficult to identify the high-quality material. In addition, the relative anonymity of the Internet allows some of its users to degrade the experience of others by posting irrelevant, inflammatory, or misleading statements.
Outright censorship, though it may in some situations be effective and justified (such as deleting spam from an online forum), is unsatisfactory as a general solution for improving the overall quality of the Internet, both because there is no consensus as to what ought to be censored and because no single entity has authority over the content of the entire Internet (or even its most popular application, the World Wide Web).
An alternative to censorship is to assign some sort of score to each document, so that users can distinguish (to a reasonable approximation) the quality of content without close inspection. Rating everything on the Internet would seem to be a daunting task; however, Page et al. [1] observed that the job is considerably easier if we take advantage of the fact that documents often link to each other (especially on the World Wide Web). Though each document may contain information about only a handful of other documents, when viewed together the links form a directed graph that has a very large well-connected component, and we can make inferences about the quality of documents within that component from the structure of their links to the whole.
PageRank [1] has been employed in this fashion by Google to make high-quality content easier to find on the World Wide Web by ranking the relative importance of web pages and displaying higher-ranked pages more prominently within its search engine. We find it important to note, though, that PageRank may be applied to entities other than collections of web pages, or even documents. For instance, it is particularly well suited to social networks, in which each node in the graph represents a particular user, and links between nodes represent trust relationships. One of the great strengths of the Internet is that large groups of relatively anonymous people can work together to achieve a common purpose, whether writing open-source software or playing multiplayer online games. However, in large communities it can be difficult to identify contributing members. We can apply computational tools to extract this information from the collection of opinions that individual users hold of each other.
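To make the idea of ranking users by trust links concrete, the following is a minimal sketch of PageRank computed by repeated propagation over a small trust graph. It follows the commonly published formulation of PageRank and is not taken from the present text: the damping factor of 0.85, the fixed iteration count, the even redistribution of rank from users who trust no one, and the user names are all illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank scores for a directed graph.

    links maps each node to the list of nodes it links to (here: trusts).
    damping and iterations are assumed, commonly used illustrative values.
    """
    # Collect every node that appears as a source or as a target.
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every node receives a small baseline share of rank.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A node divides its rank equally among the nodes it trusts.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a node with no outgoing links spreads its rank evenly.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical trust graph: each user lists the users he or she trusts.
trust = {
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "carol": ["alice"],
    "dave": ["carol"],
}

scores = pagerank(trust)
for user in sorted(scores, key=scores.get, reverse=True):
    print(f"{user}: {scores[user]:.3f}")
```

In this hypothetical graph, carol is trusted by three users and receives the highest score, while dave, whom nobody trusts, receives only the baseline share.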
At this point we would also like to distinguish PageRank, the present invention, and similar reputation systems from a related category of methods, recommender systems. GroupLens is an example of the latter [2]. The input to a recommender system is a collection of ratings of the form “Entity X gives entity Y a rating of N”, in which X is typically a user of the system, Y is typically anything but a user (such as a song, a book, or a movie), and N is some numerical or symbolic value (most commonly 1, 2, 3, 4, 5 or “thumbs up/thumbs down”). Recommender systems find users whose opinions are similar to those of a given user, and on that basis provide recommendations for that user (“people who like this thing also like these other things...”). However, they do not provide a mechanism for users to directly rate other users, and this limitation makes them relatively vulnerable to ballot-stuffing attacks [3].
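The input format and the "similar users" step described above can be sketched as follows. This is only an illustration under stated assumptions, not the method of GroupLens or of any particular system: the ratings, the user and item names, the cosine similarity over shared items, and the threshold of 4 for a "liked" item are all hypothetical choices.

```python
from math import sqrt

# Each rating has the form (X, Y, N): user X gives item Y a rating of N.
# Note that Y is always an item, never another user.
ratings = [
    ("ann", "song_a", 5), ("ann", "song_b", 4), ("ann", "song_c", 1),
    ("ben", "song_a", 5), ("ben", "song_b", 5), ("ben", "song_d", 4),
    ("cat", "song_a", 1), ("cat", "song_c", 5), ("cat", "song_e", 5),
]


def by_user(rating_list):
    """Group ratings into {user: {item: rating}}."""
    table = {}
    for user, item, value in rating_list:
        table.setdefault(user, {})[item] = value
    return table


def similarity(a, b):
    """Cosine similarity over the items both users rated (a crude, assumed measure)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def recommend(table, user, liked=4):
    """Suggest items that the most similar other user liked and `user` has not rated."""
    others = [u for u in table if u != user]
    nearest = max(others, key=lambda u: similarity(table[user], table[u]))
    return sorted(
        item for item, value in table[nearest].items()
        if value >= liked and item not in table[user]
    )


table = by_user(ratings)
print(recommend(table, "ann"))
```

Because every rating points from a user to an item, nothing in this input lets one user express trust or distrust in another, which is the limitation noted above.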