Through the Internet and other networks, users have gained access to large amounts of information distributed over a large number of computers. To access this information, users typically use a browser to submit a query to a search engine. The search engine responds to the user query by returning one or more sources of information available over the Internet or other network.
A search engine typically performs two functions: (1) finding matching results and (2) scoring the matching results to determine a display order. Search engines typically order or rank the results based on the similarity between the terms found in the accessed information sources and the terms input by the user. Results that contain the same words in the same order as the user's request are given a high rank and are placed near the top of the list presented to the user.
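The two functions described above can be illustrated with a minimal sketch. It is not drawn from any particular search engine: the `Document` type, the whitespace tokenizer, and the scoring rule (fraction of query terms present, plus a bonus when the query appears as an exact phrase) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical indexed page; field names are illustrative only."""
    url: str
    title: str
    body: str


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase whitespace splitting stands in for real tokenization.
    return text.lower().split()


def find_matches(query: str, documents: list[Document]) -> list[Document]:
    """Function (1): keep documents that share at least one term with the query."""
    query_terms = set(tokenize(query))
    return [
        doc for doc in documents
        if query_terms & set(tokenize(doc.title + " " + doc.body))
    ]


def score(query: str, doc: Document) -> float:
    """Function (2): term overlap, plus a bonus for identical words and word order."""
    query_terms = tokenize(query)
    doc_terms = tokenize(doc.title + " " + doc.body)
    if not query_terms:
        return 0.0
    overlap = sum(1 for term in query_terms if term in doc_terms) / len(query_terms)
    n = len(query_terms)
    exact_phrase = any(
        doc_terms[i:i + n] == query_terms
        for i in range(len(doc_terms) - n + 1)
    )
    return overlap + (1.0 if exact_phrase else 0.0)


def search(query: str, documents: list[Document]) -> list[Document]:
    """Find matches, then order them by descending score."""
    matches = find_matches(query, documents)
    return sorted(matches, key=lambda doc: score(query, doc), reverse=True)


docs = [
    Document("http://example.com/b", "Hotel listings", "cheap deals"),
    Document("http://example.com/a", "Cheap hotel rooms", "book today"),
    Document("http://example.com/c", "Garden tools", "spades"),
]
ranked = search("cheap hotel", docs)
print([doc.url for doc in ranked])
```

In this sketch the page whose title contains the query words in the same order is listed first, and the page with no matching terms is not returned at all.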
Scoring performed by different search engines takes into account various factors, including whether a match was found in the title, the importance of the match, the importance of a phrase match, and other factors determined by the search engine. Parameters that work well for one kind of search may not work well for all searches, and parameters that work for some users may not work well for others.
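As a rough illustration of why a single parameter set may not suit every search, the following sketch scores two pages under two weight profiles. The `Weights` fields, the weight values, and the linear scoring rule are hypothetical and are not taken from any actual search engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Hypothetical scoring parameters; names and values are illustrative."""
    title_match: float
    body_match: float
    phrase_match: float


def weighted_score(query: str, title: str, body: str, w: Weights) -> float:
    q = query.lower().split()
    t = title.lower().split()
    b = body.lower().split()

    def has_phrase(tokens: list[str]) -> bool:
        # True when the query terms appear contiguously and in order.
        n = len(q)
        return n > 0 and any(tokens[i:i + n] == q for i in range(len(tokens) - n + 1))

    title_hits = sum(1 for term in q if term in t)
    body_hits = sum(1 for term in q if term in b)
    phrase_hit = 1 if has_phrase(t) or has_phrase(b) else 0
    return (w.title_match * title_hits
            + w.body_match * body_hits
            + w.phrase_match * phrase_hit)


title_heavy = Weights(title_match=3.0, body_match=1.0, phrase_match=1.0)
phrase_heavy = Weights(title_match=1.0, body_match=1.0, phrase_match=5.0)

page_a = ("player and dvd guide", "compare models")          # terms in title, no phrase
page_b = ("electronics store", "buy a dvd player today")      # exact phrase in body

for name, w in (("title_heavy", title_heavy), ("phrase_heavy", phrase_heavy)):
    a = weighted_score("dvd player", *page_a, w)
    b = weighted_score("dvd player", *page_b, w)
    print(name, "page_a:", a, "page_b:", b)
```

Under the title-heavy profile the first page scores higher, while under the phrase-heavy profile the second page does, so the display order depends on which parameters the engine happens to use.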
Web site owners are constantly trying to manipulate search engines in order to artificially inflate their web site rankings for specific search terms. Highly monetizable terms such as “travel”, “hotel”, “Viagra”, “dvd”, etc., are spammed in order to drive traffic to the web site. The search engines may give these web sites a high ranking and never learn that the web sites are spam sites. This spamming technique can lead to an inferior user experience on average and distort the true value of a web site to the user.
The user base of searchers will generally be the best source for information pertaining to whether results are spam results. However, requests to end users to provide more feedback data have been met with limited success. The limited success stems from the fact that providing feedback is often cumbersome and time consuming for users. Furthermore, pre-configured feedback formats are often inadequate.
Additionally, in considering user feedback, a system must be able to identify feedback from spammers in order to prevent such feedback from artificially lowering rankings of competitors' websites.
User satisfaction is a critical success factor for a search engine. Spam results significantly decrease the quality of the user experience. Accordingly, a solution is needed that facilitates identification and filtering of spam results.