Any discussion of documents, devices, acts or knowledge in this specification is included to explain the context of the invention. It should not be taken as an admission that any of the material forms a part of the prior art base or the common general knowledge in the relevant art in Australia or elsewhere on or before the priority date of the disclosure herein.
Existing web search services include those of Google, Yahoo, and Live Search, to name but a few. Such search services (known as ‘search engines’) generally entail centralised databases (located server-side) which index billions of web-pages, so that users can search those databases for the information they desire through the use of keywords and/or operators.
Results to search queries submitted via such search engines are ordered using numerous criteria, including, for example, the concept of link popularity, whereby the number of other websites and/or web-pages that link to a given page is taken into consideration, on the premise that good or desirable pages are linked to by many others, to produce the “page rank”. The page rank of linking pages and the number of links on those pages can also contribute to the page rank of the linked page, as described further in, for example, U.S. Pat. Nos. 6,285,999, 6,799,176, and 7,058,628. Because criteria such as link popularity are used by existing search engines in order to determine the relevancy and/or order of display of the search results obtained, it can be said that a form of subjective bias is introduced into the searching process. Hence, the search results obtained may not reflect the true results available, but instead those deemed to be relevant by a search engine's popularity algorithms, etc.
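By way of illustration only, the link-popularity idea described above can be sketched as a simplified iterative calculation. The sketch below is not the method of any of the cited patents or of any particular search engine. The function name `page_rank`, the damping value of 0.85, the fixed iteration count, the even redistribution of rank from pages with no outgoing links, and the four-page `toy_web` graph are all assumptions introduced here for the example.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Simplified link-popularity score.

    `links` maps each page to the list of pages it links to.
    `damping` and `iterations` are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page's rank is split evenly across its outgoing links,
                # so both the rank of the linking page and the number of
                # links on it affect what the linked page receives.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its
                # rank evenly over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "a", "b" and "d" all link to "c"; "c" links to "a".
toy_web = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = page_rank(toy_web)
```

In this toy graph, page "c" receives the highest score because three pages link to it, and page "a" scores above "b" and "d" solely because it is linked to by the highly ranked "c". The ordering therefore reflects link structure rather than page content, which is the form of bias referred to above.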
Index search engines are run on one or more server farms, usually consisting of thousands of low-cost commodity computers. It has been estimated that over 450,000 servers, racked up in clusters located in data centres around the world, are required to provide such a database indexed web search engine service. Despite such immense computing resources, the ranging “bot” or “spider” applications used to continuously crawl the Internet and update the database index cannot track updates to the web in ‘real-time’.
Live Search (formerly Windows Live Search) is the name of Microsoft's web search engine. Despite its name, this service still relies on web-crawlers to build a centralised index which in effect provides a snapshot of what each portion of the Internet looked like the most recent time the web-crawler visited, and is thus not a true ‘live’ search. The Live Search search engine index is believed to include more than 5 billion documents, 400 million images and 3 million instant answers.
With the significant growth and changes occurring to content available on the Internet, the problem of trying to maintain a current index of the entire web is growing in magnitude. Current solutions involve substantial centralised infrastructure that attempts to store the entire contents of the Internet as an index. Even so, information is gathered from websites and indexed only intermittently. Hence, indexed results can become weeks or months old before being updated by a subsequent visit by a web-crawler.
A further problem with centralised database indexes is the proliferation of dynamically generated websites comprising active pages, such as eBay, among many others. An extremely high proportion of Internet content is now generated dynamically. The content of such sites can be highly volatile and can vary dramatically from one visit to the next. This only serves to exacerbate problems of currency of database indexes in respect of dynamically generated websites.
Meta-search engines such as Dogpile.com or OpenSearch.org are sometimes misconceived as being ‘real-time’ search engines. This is not the case at all. In fact, meta-search engines only pass through data representing a search query to other search engines utilising index databases in order to return search results obtained from those databases. Hence, meta-search engines do not actually perform a search themselves at all. Instead, they merely act as a display portal for a select few hosts and provide the same results as those hosts. In this way, meta-search engines create what is known as a virtual database, transparently integrating multiple database systems into a single view. Meta-search engines cannot search outside of the predetermined host(s) or host list (i.e. they only pass through results from a predetermined small list of hosts and provide only a single first stage layer of results from those predetermined sites). Meta-search engines are not able to adapt and learn how to search new hosts on the fly. Furthermore, as meta-search engines simply display results obtained from other index-type search engines, the results they display can also be biased by ranking and relevancy algorithms.
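The pass-through behaviour described above can be sketched, under stated assumptions, as a loop over a fixed host list. The two host functions `engine_a` and `engine_b`, their example URLs, and the `meta_search` function are hypothetical stand-ins introduced for illustration. They do not represent the interface of Dogpile.com, OpenSearch.org, or any real search service, and no network request is made.

```python
from typing import Callable, Dict, List


# Hypothetical stand-ins for index-based search engines. Each returns
# whatever its own (possibly stale, already ranked) index holds.
def engine_a(query: str) -> List[str]:
    return [f"a.example/{query}/1", f"shared.example/{query}"]


def engine_b(query: str) -> List[str]:
    return [f"shared.example/{query}", f"b.example/{query}/1"]


# The predetermined host list: the meta-search layer cannot reach
# anything that is not registered here.
HOSTS: Dict[str, Callable[[str], List[str]]] = {
    "engine_a": engine_a,
    "engine_b": engine_b,
}


def meta_search(query: str,
                hosts: Dict[str, Callable[[str], List[str]]] = HOSTS) -> List[str]:
    """Forward the query to each fixed host and merge the returned results."""
    merged: List[str] = []
    seen = set()
    for search in hosts.values():
        # The query is passed through unchanged; the host's own ordering
        # is kept, so any ranking bias of the host carries over.
        for url in search(query):
            if url not in seen:  # drop results already returned by an earlier host
                seen.add(url)
                merged.append(url)
    return merged


results = meta_search("widgets")
```

Nothing in the sketch inspects the content of any web-page: the merged list can only contain what the registered hosts already hold in their indexes, in the order those hosts chose to return it.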
It is therefore an object of the present invention to provide an improved method and/or system for searching network content, preferably the Internet.