Currently, one of the highest-valued Internet or Web-based utilities is the basic search engine. Many of the largest Internet-only companies rely on search activity to generate the largest share of their traffic and, thus, their revenues.
There are many methods for producing a set of search results from a query input to a search engine, typically via an interface of a web browser. For the sake of convenience, the individual entries in the set of results are referred to herein as links. Some of these methods are proprietary, some are secret, some are simple, and others are quite complex. However, there are two basic ways in which the links are collected: through the use of specialized applications that seek out webpages (known as crawlers or spiders), and through manual input or identification of the links, which is often done by a website operator seeking to have the website included in a search engine's results. In the latter case, the website operator may pay a fee in order to appear in a section of sponsored links or to receive a preferred ranking within the search result list.
In response to a query, the search results are displayed on the browser of the user who input the query. While the algorithms of various search engines may differ, a common manner of selecting the search results from the databases of the engine is to identify the relevance of the search query to information contained in the webpage, such information having been previously collected. For instance, if a search query consists of a single word, the links produced by the engine would likely direct to pages having a high usage of that word.
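As a rough illustration of the single-word case described above, the following sketch ranks pages by how often the query word appears in their previously collected text. The function names (`term_frequency`, `rank_links`), the sample index, and the URLs are hypothetical, and actual engines may use far more elaborate, often proprietary, scoring.

```python
import re
from collections import Counter


def term_frequency(text: str, word: str) -> float:
    """Return the fraction of tokens in `text` equal to `word` (case-insensitive)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if not tokens:
        return 0.0
    return Counter(tokens)[word.lower()] / len(tokens)


def rank_links(index: dict[str, str], query: str) -> list[str]:
    """Order links by descending usage of a single-word query.

    Pages that never use the word are dropped from the results.
    """
    scores = {url: term_frequency(text, query) for url, text in index.items()}
    matching = [url for url, score in scores.items() if score > 0]
    return sorted(matching, key=lambda url: scores[url], reverse=True)


# Hypothetical, previously collected page text keyed by link.
index = {
    "http://example.com/gardening": "Gardening tips: gardening tools and gardening soil",
    "http://example.com/news": "Local news, weather, and one gardening column",
    "http://example.com/cars": "Used cars and trucks for sale",
}

ranked = rank_links(index, "gardening")
```

Here the page that uses the query word most densely is listed first, and the page that never uses it is omitted, which is the behavior the paragraph above attributes to a relevance-based engine.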
Such engines are susceptible to manipulation by website operators. A website operator may generate revenue from advertising on the webpages of the website simply by virtue of hits or views of the webpage. Accordingly, an unscrupulous website operator may employ a variety of tactics to generate views of the webpage, regardless of whether the webpage is truly targeted by the user's search query.
As stated before, the engine collects information from the identified webpages. A portion of this information is legitimately related to the useful contents of the webpage, such that identification of a website based on the search query is also useful to the searching user; however, some is not. The actual forms of worthless information, such as metatags or text that does not appear to a user (which can be achieved by reducing font size, by matching text color with background color, or by hiding the text behind a graphic), are not important. What is important is that such practices can result in a number of the links returned in response to a search query being based on this worthless information and, thus, directing to webpages that are of little to no use to the searching user.
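To see why such practices matter, consider a naive scorer that counts every word collected from a page, whether or not a user would ever see it. In the sketch below, the `Page` structure, its `visible_text` and `hidden_text` fields, the scoring functions, and the sample pages are all hypothetical. The split between visible and hidden text is simply assumed to be known here; determining it is a separate problem that this sketch does not address.

```python
import re
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    visible_text: str
    hidden_text: str = ""  # assumption: text a user never sees, e.g. matched to the background color


def count_word(text: str, word: str) -> int:
    """Count case-insensitive occurrences of `word` among the tokens of `text`."""
    return re.findall(r"[a-z0-9]+", text.lower()).count(word.lower())


def naive_score(page: Page, word: str) -> int:
    """Score on everything collected from the page, hidden text included."""
    return count_word(page.visible_text + " " + page.hidden_text, word)


def visible_score(page: Page, word: str) -> int:
    """Score only on text that actually appears to the user."""
    return count_word(page.visible_text, word)


useful = Page("http://example.com/recipes", "Bread recipes: a sourdough bread guide")
stuffed = Page("http://example.com/ads", "Click our sponsors", hidden_text="bread " * 20)

pages = [useful, stuffed]
naive_winner = max(pages, key=lambda p: naive_score(p, "bread"))
visible_winner = max(pages, key=lambda p: visible_score(p, "bread"))
```

Under the naive score, the stuffed page outranks the page that is actually about the query word; when only visible text is counted, the order reverses.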
Accordingly, there has been a need for improved methods and systems with which to identify the validity and utility to actual users of search results provided by an Internet-based search engine as a result of a search query.