1. Field of the Invention
The present invention relates to systems used to find and present information from multiple sources, and more particularly, to systems that find information on the Internet from suppliers or purchasers of goods, services, or commodities and present that information to potential purchasers or suppliers performing comparisons.
2. Description of Related Art
Since the conception of the Internet and extending through the development of Hypertext Transfer Protocol (HTTP) and the World Wide Web (web) to the present, one of the biggest barriers to people taking full advantage of the capabilities offered by the Internet has been the difficulty of sifting through the available information to find the desired information. Currently, there are many different search systems available on the Internet. The broad categories of search systems include systems that address very narrow collections of data, systems that operate by first building a local database that describes the contents of the searched web sites, and systems that target a specific type of data. These systems differ in a number of ways, such as the range of information they attempt to search, the technical mechanisms that they use to search, the user interface they provide for specifying the desired data, the user communities to whom they are available, the way they are marketed, and the business models that they are designed to support.
An example of a search system that addresses a very narrow collection of data is the “captive” search system that is built into or for an individual web site and allows users of the web site to find desired information within that specific site. In general, there are useful implementations of these systems available, often having user interfaces that can be customized to reflect the contents of the site. However, these search systems are usually not helpful in performing comparisons because individual web sites are typically maintained by individual companies, so the same search operation performed on different sites does not return comparable data.
A typical search engine that purports to search the entire web (that is to say, HTTP servers, which are a subset of the entire Internet) operates by first building a local database that describes the contents of the searched web sites, and then searching that database in response to user queries. Search systems of this type differ primarily in the way they determine which pages of data from which sites are to be added to the database, and in how the database is managed and condensed, as it is impractical in most cases to keep an entire copy of the search range on the search system. Systems of this type typically repeat the process of gathering data from the Internet periodically in order to update the local database so that it accurately reflects the contents of the various web sites searched.
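The build-then-search arrangement described above can be illustrated with a minimal sketch. It assumes a simple inverted index (a mapping from words to the pages that contain them) as the local database; the description above does not specify how any particular search engine organizes its database. The names `tokenize`, `build_index`, `search`, and the contents of `PAGES` are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build the local database: map each word to the URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Answer a query from the local database only, without touching the web.

    Returns the sorted URLs of pages that contain every word of the query.
    """
    words = tokenize(query)
    if not words:
        return []
    # index.get avoids inserting empty entries into the defaultdict.
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


# Hypothetical page texts standing in for data gathered from web sites.
PAGES = {
    "http://example.com/a": "Red widgets for sale, $5 each",
    "http://example.com/b": "Blue widgets and red gadgets",
    "http://example.org/c": "Gadget repair services",
}
INDEX = build_index(PAGES)
```

In this sketch the index stores only words and page addresses rather than full page copies, which is one simple way a database might be condensed. Rebuilding `INDEX` from freshly gathered pages corresponds to the periodic update step.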
Search systems that target a specific type of data operate like the systems that address very narrow collections of data and the systems that operate by building a local database in that they must gather data from the Internet before users can make requests of the search system. However, the data gathered is generally filtered to determine whether it is of the desired type. This filtering can be done either implicitly, by the search system operators manually creating a list of the web sites that should be searched, or explicitly, by an automated portion of the search system. Most existing comparison shopping search systems work in this way.
Another aspect of existing Internet search practice is the technique of processing individual web pages using automated systems to extract desired data, where the web pages typically include HTML source text and are intended to be presented to a human user. To an extent, this technique is used by the systems that operate by building a local database and the systems that target a specific type of data because they have to differentiate HTML formatting directives from text content that is to be searched and from the URLs of other referenced Internet objects that may be the target of subsequent database building.
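As an illustration of the differentiation described above, the following sketch separates a page's searchable text from its formatting directives and from the URLs it references. It uses the `HTMLParser` class and `urljoin` function from the Python standard library; the class name `PageSplitter`, the function `split_page`, and the sample page are assumptions made for the example and do not describe any particular existing system.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageSplitter(HTMLParser):
    """Collect visible text and referenced URLs, discarding HTML markup."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.text_parts = []   # text content to be searched
        self.links = []        # URLs that could be gathered next
        self._skip_depth = 0   # greater than zero inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative references against the page's own URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a script or style element.
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def split_page(html, base_url):
    """Return (text, links) for one page of HTML source text."""
    parser = PageSplitter(base_url)
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links


# Hypothetical page used only to demonstrate the separation.
SAMPLE_HTML = (
    "<html><body><h1>Widgets</h1><p>Price: <b>$5</b></p>"
    '<a href="/gadgets">Gadgets</a><script>var x = 1;</script></body></html>'
)
TEXT, LINKS = split_page(SAMPLE_HTML, "http://example.com/shop/index.html")
```

The returned text would feed the searchable database, while the returned links are candidates for subsequent database building.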
However, the more detailed and specific process of analyzing a web page for a particular piece or type of data, often referred to as scraping, is not employed by most search systems. Although many systems, both for searching and for other purposes, do employ scraping, many scraping implementations have less-than-desirable performance and/or search characteristics and are unsatisfactory for applications in which scraping would otherwise be a viable technique to employ.
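By way of example only, scraping for one particular type of data might look like the following sketch, which extracts dollar prices from page text. The pattern, the function name `scrape_prices`, and the assumption that prices appear as a dollar sign followed by digits are all hypothetical; a practical scraper would need rules suited to the structure of each site.

```python
import re
from decimal import Decimal

# Assumed price format: "$" followed by digits, optional thousands commas,
# and optional two-digit cents (for example "$5" or "$1,299.00").
PRICE_PATTERN = re.compile(r"\$\s*(\d[\d,]*(?:\.\d{2})?)")


def scrape_prices(page_text):
    """Return every price found in the page text, in order of appearance."""
    prices = []
    for match in PRICE_PATTERN.finditer(page_text):
        digits = match.group(1).replace(",", "")
        prices.append(Decimal(digits))
    return prices
```

For instance, `scrape_prices("Widget: $1,299.00 (was $1,499.50)")` yields the two prices in the order they appear. Text without a dollar amount yields an empty list.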
Additionally, most existing systems that perform scraping are very limited in the web site structures that they support. For example, some web servers require that the accessor, typically a user, reach a page by passing through a series of other pages. In this type of web site, the content of a page depends not only on its URL but also on prior history, the page location within a framed page, page content that is generated dynamically (such as by a client-interpreted embedded language like JavaScript), and cookies set by the server. Most of these sites cannot be accessed by traditional scraping systems because those systems cannot process a sequence of pages or fully emulate all of the browser functionality required by some pages.
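The dependence of page content on prior history can be illustrated with a small simulation. The sketch below does not contact any real server: `ToySite` is a hypothetical in-memory stand-in for a web server that returns its price page only to an accessor holding a cookie set by an earlier page, and `SequenceScraper` is a hypothetical accessor that carries cookies from one request to the next.

```python
class ToySite:
    """Hypothetical server whose /prices page depends on a session cookie."""

    def fetch(self, path, cookies):
        """Return (body, cookies_to_set) for a request to the given path."""
        if path == "/login":
            return "Welcome", {"session": "abc123"}
        if path == "/prices":
            if cookies.get("session") == "abc123":
                return "Widget: $5", {}
            return "Please visit /login first", {}
        return "Not found", {}


class SequenceScraper:
    """Hypothetical accessor that retains cookies across a page sequence."""

    def __init__(self, site):
        self.site = site
        self.cookies = {}

    def visit(self, path):
        """Request one page, sending stored cookies and storing new ones."""
        body, new_cookies = self.site.fetch(path, dict(self.cookies))
        self.cookies.update(new_cookies)
        return body

    def run(self, paths):
        """Visit the pages in order and return the body of the last one."""
        body = None
        for path in paths:
            body = self.visit(path)
        return body


# Requesting the target page directly, as a single-page scraper would.
DIRECT = SequenceScraper(ToySite()).run(["/prices"])
# Passing through the required earlier page first.
SEQUENCED = SequenceScraper(ToySite()).run(["/login", "/prices"])
```

Here the same URL produces different content depending on whether the earlier page was visited, which is the behavior a single-page scraper cannot accommodate. The sketch covers only cookies; frames and dynamically generated content would require further browser emulation.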
Consequently, there is a need for a system that efficiently gathers and evaluates information from multiple electronic sources and presents relevant information to potential buyers, sellers, or traders. This information includes, but is not limited to, information regarding goods, services, and commodities.