1. Field of Invention
This invention relates to search systems for distributed networks.
2. Description of Related Art
Numerous "search engines" are available on the Internet for locating information about a particular topic. Specifically, a user, after typing in the Uniform Resource Locator (URL) of a "search engine," for example, Yahoo®, Infoseek®, Lycos® or AltaVista®, will typically arrive at a screen at which the user can enter one or more keywords. These keywords generally correspond to a distillation of the important concepts pertaining to the particular piece of information the user is seeking. Upon entering these keywords and pressing the "search" button, for example, with the click of a mouse, the user is returned a result list of information sources, or "hits," which the search engine found in its index and determined to be relevant to the user's query.
The user then typically scans the result list to determine which of the particular results is most relevant. The user can then click on a result, or "hit," and be taken, via hyperlink, to the actual information source, e.g., a web page, that corresponds to the hit.
Once at the web page, the user can then browse the page looking for the specific information item that corresponds to the submitted query. Upon completing the review of this particular web page, the user generally presses the "back" button on the browser interface to return to the result page generated by the search engine. The user then again selects a result and follows that result's hyperlink in the same manner as described above. This process continues until the user locates the desired information.
Existing search engines are fast and produce ranked results. However, the accuracy of their ranked results depends on the internal indices generated at the specific search engine. If the indices are not routinely maintained, they become incomplete and produce inaccurate results: the indices may contain broken links to web pages that have moved, and they may be missing links that have been updated since the last regeneration of the index.
Furthermore, existing search engines do not take into account the user's current context, e.g., the current virtual location that the user is browsing. Accordingly, if a user wants to find information about a particular topic within the currently viewed web site, the user must choose from five options. First, the user can use a global search engine and supplement the query with words that are likely to be associated with the current web site, e.g., the name of the company to which the web site belongs. This requires expertise on the part of the user and is not guaranteed to produce only results from the site in question. For example, in an exemplary index based search engine, such as Yahoo®, AltaVista® or Excite®, the search engine receives the user's input keyword or keywords. The input is then compared to the search engine's index. A correlation is then made between each keyword and its frequency of occurrence within the index. This correlation produces a result list that can then be organized, or ranked, based on the correlation.
Second, the user can perform an "advanced search" at some global search engine and specify that results must be from the current web site. In this case, the results will indeed be guaranteed to come from the site in question, but the user may not receive a satisfactory set of results due to the incompleteness and staleness of most search engine indices. In addition, this type of search requires expertise on the part of the user.
Third, the user can look for a locally provided search interface on the web site itself. The locally provided search interface may be hard to find, i.e., not available at the current location the user is browsing, it may have an idiosyncratic syntax, and it may not be up to date.
Fourth, the user can manually browse the site searching for specific information. At a complex site, this could be time consuming and error prone.
Finally, the user can contact the administrator of the web site. This is a slow process, is not always possible and may not produce any results.
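By way of illustration only, the index type search described above, in which input keywords are compared against a previously built index and the results are ranked by frequency of occurrence, can be sketched in Python as follows. The names build_index and index_search, the whitespace tokenization and the scoring rule are assumptions made for this sketch and do not describe the internals of any particular search engine.

```python
from collections import Counter


def build_index(pages):
    """Map each URL to a Counter of lowercase word frequencies.

    `pages` is a hypothetical dict of {url: page_text} captured when the
    index was last regenerated; it may be stale relative to the live site.
    """
    return {url: Counter(text.lower().split()) for url, text in pages.items()}


def index_search(index, keywords):
    """Rank URLs by the total frequency of the keywords in the index."""
    terms = [k.lower() for k in keywords]
    scores = {url: sum(counts[t] for t in terms) for url, counts in index.items()}
    # Keep only URLs in which at least one keyword occurs.
    hits = [(url, score) for url, score in scores.items() if score > 0]
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))
```

Because the ranking is computed entirely from the stored index, any page that has moved or changed since the index was built is ranked according to its old contents, which is the staleness problem noted above.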
The systems and methods of this invention enable a user to perform a search more easily by combining index searching and crawl-based searching. Furthermore, the systems and methods of this invention enable context information to be included with either or both of the index search and the crawl search to further refine the scope of the search. Specifically, by recognizing the user's current context, e.g., virtual location or Uniform Resource Locator (URL), by performing a contextualized index search on behalf of the user, and by performing a contextualized crawl looking for results that match the user's query, this invention provides a non-expert user with localized search results in a timely and comprehensive fashion.
Specifically, in a crawl type search, a combination of keywords, context and boundary information is used to conduct a search within a specified area of a distributed network. Since this approach operates in real-time or near real-time, a number of the drawbacks encountered with an index type search are overcome.
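A minimal Python sketch of such a crawl type search is given below, for illustration only. It assumes that the context is the URL the user is currently viewing, that the boundary is expressed as a URL prefix together with a maximum link depth, and that a caller-supplied function fetch(url) returns the page text and its outgoing links. All of these names and choices are hypothetical and are not prescribed by this description.

```python
from collections import deque


def crawl_search(start_url, keywords, fetch, boundary_prefix, max_depth=2):
    """Breadth-first crawl starting from the user's current location.

    `fetch` is a hypothetical callable returning (text, links) for a URL,
    or None when the page cannot be retrieved. Links are followed only if
    they start with `boundary_prefix` and lie within `max_depth` hops.
    """
    terms = [k.lower() for k in keywords]
    seen = {start_url}
    queue = deque([(start_url, 0)])
    hits = []
    while queue:
        url, depth = queue.popleft()
        page = fetch(url)
        if page is None:
            continue  # unreachable page: skip it rather than report it
        text, links = page
        words = text.lower().split()
        score = sum(words.count(t) for t in terms)
        if score > 0:
            hits.append((url, score))
        if depth < max_depth:
            for link in links:
                if link.startswith(boundary_prefix) and link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))
```

Because each page is read at query time, the hits reflect the current state of the site, at the cost of visiting only the bounded region around the user's location.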
The systems and methods of this invention combine index type searching and crawl type searching.
This invention separately provides systems and methods for assisting users in conducting a search of one or more distributed networks.
This invention separately provides systems and methods that allow a user to interface with a search tool via a user interface.
This invention separately provides systems and methods that allow users to customize search strategies to be applied to one or more distributed networks.
The search systems and methods of this invention use a combination of index based search strategies, crawl based search strategies and context information to provide a comprehensive list of results to a user. In particular, a user enters one or more keywords corresponding to information on a desired topic. The systems and methods of this invention receive the query and perform, either serially or in parallel, an index search of a preexisting index and a crawl search within a particular context. The results of these queries are then assembled and displayed to the user. Thus, the results displayed to the user are comprehensive, and the two queries complement each other in overcoming their individual shortcomings.
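One possible way of performing the two searches serially or in parallel and assembling their results is sketched below in Python, for illustration only. The function combined_search, its parameters and the merge rule (a crawl hit replaces an index hit for the same URL, on the assumption that the crawl saw the live page) are hypothetical choices made for this sketch and are not the only way the results could be assembled.

```python
from concurrent.futures import ThreadPoolExecutor


def combined_search(keywords, index_search, crawl_search, parallel=True):
    """Run an index search and a crawl search and merge their hits.

    `index_search` and `crawl_search` are hypothetical callables that take
    the keyword list and return a list of (url, score) pairs.
    """
    if parallel:
        # Submit both searches so that they run concurrently.
        with ThreadPoolExecutor(max_workers=2) as pool:
            index_future = pool.submit(index_search, keywords)
            crawl_future = pool.submit(crawl_search, keywords)
            index_hits = index_future.result()
            crawl_hits = crawl_future.result()
    else:
        index_hits = index_search(keywords)
        crawl_hits = crawl_search(keywords)

    merged = dict(index_hits)
    for url, score in crawl_hits:
        # Assumption: the crawl read the live page, so its score wins.
        merged[url] = score
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(merged.items(), key=lambda hit: (-hit[1], hit[0]))
```

In this sketch the serial and parallel paths produce the same merged list; the parallel path only shortens the wait when both searches are slow.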
These and other features and advantages of this invention are described in or are apparent from the following detailed description of the preferred embodiments.