The present invention relates to the field of search engines. More particularly, one embodiment of the present invention provides a system and method for improving a user search experience by identifying and reordering search results based on measures of business identifiability and integrity.
Internet search engine technology is now more than a decade old. Search engines began as a form of straightforward indexing. Today, they reflect an ongoing battle between “search engine optimization” companies trying to promote products and services, and search companies trying to produce search results resistant to attempts to “game” the search system.
Existing search engines for web pages primarily use information from the web pages themselves, including the links between them, to determine which search results should be presented first. This is referred to herein as “intrinsic” information. In contrast, as referred to herein, “extrinsic” information is obtained from sources other than the web pages being indexed. Some extrinsic sources are substantially more reliable than intrinsic sources, and can potentially be used to validate information from the web pages themselves.
There have been tentative steps towards the use of extrinsic information by others, typically involving “blacklists” of “bad” web sites and “whitelists” of “good” ones. The criteria for such lists have typically been ideological, rather than being based on an evaluation of the business behind each web site. Such lists have typically covered only a small fraction of existing web sites, and typically require considerable manual maintenance and attention. Such approaches have generally been deployed as site-blocking tools, rather than being used as a component of a search system. Lin, U.S. Pat. No. 7,082,429, discloses a system using lists of web sites.
Schemes for “spam filtering” in search engines have also been proposed, but these typically borrow their techniques for recognizing obvious spam from conventional spam filtering. Such approaches are therefore susceptible to the same techniques which are in widespread use to evade spam filters. Some such approaches require search users to manually identify “spam,” rather than performing the task automatically. Brewer, U.S. Patent Application No. 20060248072, discloses such an approach.
Schemes for allowing Internet users to vote on site ratings have been tried repeatedly, but such methods require considerable active effort by users and are susceptible to “ballot stuffing.” Manual rating efforts and systems requiring user feedback or user surveys place demands on the end user's time which make them commercially uncompetitive outside of narrow areas, such as hotel and restaurant ratings. The disclosures in Sundaresan, U.S. Pat. Nos. 7,080,064 and 7,099,859, which rely on a “ranking system for receiving any of users' (off-line or on-line) surveys or feedback about businesses,” are examples of that approach. In contrast, the various embodiments of the present invention do not require the search system to conduct surveys or solicit user feedback.
Some specialized systems exist for the automated evaluation of intellectual property portfolios, typically by searching multiple databases for potential infringement. This is a specialized application which relies on the formal structure of patent documents and claims, and the approach is unsuitable for general web search. Adler, U.S. Patent Application No. 20060173920, and Poltorak, U.S. Patent Application No. 20040158559, disclose such systems.
The various embodiments of the present invention overcome many of these limitations by automatically performing “due diligence” on Internet domains or web sites, using extrinsic data sources which are difficult for non-legitimate businesses to manipulate.
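By way of illustration only, the reordering of search results using extrinsic information might be sketched as follows. This is a minimal sketch and not the disclosed implementation: the names `SearchResult`, `domain_of`, and `reorder`, the per-domain score table, the default score for unknown domains, and the rule of sorting by extrinsic score before intrinsic relevance are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class SearchResult:
    """Hypothetical search result carrying an intrinsic relevance score."""
    url: str
    relevance: float  # assumed to come from the underlying search engine


def domain_of(url: str) -> str:
    """Return the lowercase host name of a URL, or '' if there is none."""
    return (urlparse(url).hostname or "").lower()


def reorder(results, extrinsic_scores, default=0.0):
    """Order results by extrinsic score, then by intrinsic relevance.

    `extrinsic_scores` is an assumed mapping from domain name to a score
    derived from sources other than the indexed pages. Domains missing
    from the mapping receive `default` (an assumption of this sketch).
    Both keys sort in descending order.
    """
    return sorted(
        results,
        key=lambda r: (extrinsic_scores.get(domain_of(r.url), default), r.relevance),
        reverse=True,
    )


if __name__ == "__main__":
    results = [
        SearchResult("http://unverified.example/offer", 0.9),
        SearchResult("http://shop.example/widgets", 0.7),
    ]
    # Hypothetical extrinsic scores; only one domain has been evaluated.
    scores = {"shop.example": 0.8}
    for r in reorder(results, scores):
        print(r.url)
```

In this sketch the result from the evaluated domain is listed first even though its intrinsic relevance is lower, since the extrinsic score takes precedence in the sort key. Other ways of combining the two scores, such as a weighted sum, would serve equally well for illustration.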