The present invention applies generally to computer-assisted searching, and more particularly to a method and computer program product for color coding search results as a convenience for interpreting search output that includes a large number of sources.
Internet users conduct searches to find pages on the World Wide Web that provide information about topics of interest. The topic of the search is specified by a set of keywords that the searcher enters on the input line of a web browser in response to prompts by the browser.
Some browsers permit searchers to enter not only keywords, but also requirements stated in Boolean logic that must be satisfied before a web page containing one or more of the keywords is judged to be relevant to the search. Nevertheless, having the option of using Boolean logic to specify conditions of the search is not always helpful, as many searchers find that working with Boolean logic is beyond their capabilities. Furthermore, details of the syntax for specifying Boolean logic often vary from browser to browser, and even a searcher who understands how to specify Boolean search criteria in principle may be unable to do so in practice for a particular web browser at hand.
As a result, many searchers simply enter a few keywords on the browser's input line and proceed with the search. Browsers often interpret such entries as requests to find web pages that contain any one or more of the keywords (i.e., the default relationship among the keywords is logical union or "inclusive-or") and therefore cast a wide net. Often, the search returns an unusably large number of sources, as a consequence of the very large number of web pages now accessible through the Internet.
To help manage the flood of information generated by an "inclusive-or" search, the search engine normally ranks the sources it finds according to its own rules, and presents the sources to the searcher as a list of web page links arranged from first to last according to the ranking. Unfortunately, the searcher is often unaware of the search engine's rules. So, lacking any better way to proceed, the searcher must often access and view numerous web pages before finding any truly relevant information.
For example, a searcher might enter three keywords, the words "alpha," "beta," and "gamma." The search engine would then find sources that include the keyword "alpha" in isolation, sources that include the keyword "beta" in isolation, sources that include the keyword "gamma" in isolation, sources that include both the keywords "alpha" and "beta," sources that include all three keywords, and so forth. It may be the searcher's intention, however, that at least two of the three keywords, or even all three keywords, should appear before a source is judged to be relevant to the search.
Nevertheless, the search engine may first present links to sources that contain only the keyword "beta," and only far down the list present links to sources that contain all three of the keywords. In other cases, the search engine might find only a single source that contains all three keywords, and put a link to this source at the top of the list. In the list, however, the first link might be followed by a large number of links to sources that are irrelevant according to the searcher's intentions, thereby requiring the searcher, who is unaware that these links are to web pages that contain only one or two of the keywords, to spend considerable time accessing and viewing irrelevant web pages.
More generally, with today's technology the searcher does not always have a clear picture of which keywords occur in which of the sources found in a basic "inclusive-or" search. So, from the searcher's point of view, the purpose of the search, namely to narrow the list of sources that must be examined in order to find relevant information, is effectively thwarted. As a result, the time spent on the search and the complexity of the search grow unproductively, because the searcher must often go back to the search engine with a new set of keywords or with an attempt to formulate stricter search criteria using Boolean logic.
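The gap between what an "inclusive-or" search returns and what the searcher may intend can be illustrated with a minimal Python sketch. The tiny in-memory `sources` dictionary, the whitespace tokenization, and the function names are all assumptions made for illustration; they do not describe how any actual search engine works.

```python
def matched_keywords(text, keywords):
    """Return the set of keywords that occur as whole words in the text."""
    words = set(text.lower().split())
    return {k for k in keywords if k.lower() in words}

def inclusive_or_search(sources, keywords):
    """Return every source that contains at least one keyword."""
    return [name for name, text in sources.items()
            if matched_keywords(text, keywords)]

def at_least_n_search(sources, keywords, minimum):
    """Return only the sources that contain at least `minimum` keywords."""
    return [name for name, text in sources.items()
            if len(matched_keywords(text, keywords)) >= minimum]

# Hypothetical sources standing in for web pages.
sources = {
    "a": "alpha only",
    "b": "alpha and beta",
    "c": "alpha beta gamma",
    "d": "nothing relevant",
}
keywords = ["alpha", "beta", "gamma"]

wide_net = inclusive_or_search(sources, keywords)      # what the default search returns
two_or_more = at_least_n_search(sources, keywords, 2)  # what the searcher may have meant
all_three = at_least_n_search(sources, keywords, 3)
```

In this sketch the default search returns three of the four sources, while only two of them contain at least two keywords and only one contains all three.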
Thus there is a need for a way of presenting the results of a search so that the searcher may form an effective picture of the relevance of the sources found by the search engine, in order that the searcher need not examine sources that lack relevant information yet appear nevertheless in the list of sources found by the search engine.
The present invention enables a searcher to see at a glance how closely the sources found by a search engine match the keywords that convey the searcher's intended search criteria. In the case of an Internet search, the searcher's browser prompts the searcher to enter the keywords, for example on an input line presented on a display screen of a computer. The browser reads the keywords, associates colors with the keywords to provide a color code, and displays a color code map that explains the color code to the searcher in an intuitive way. The browser then sends the keywords over the Internet to a search engine.
The search engine executes a search, and sends to the browser a set of uniform resource locators (URLs) that identify web pages purportedly relevant to the search, along with occurrence data that report, for example, whether each keyword is present or absent in each web page, or how often each of the keywords occurs in each of the web pages, either in absolute terms or in terms relative to the occurrences of other keywords.
For each URL, the browser formulates a correlation indicator. The correlation indicator includes a visual area that is colored according to the color code and the occurrence data. The browser displays links to the URLs and the associated correlation indicators to the searcher.
For example, the searcher might enter the keywords "cricket," "bat" and "Canada" on the input line. The browser might then associate the color blue with the word "cricket," the color green with the word "bat," and the color red with the word "Canada." The browser might then display the color code map in the form of a horizontal bar that appears just below the input line, wherein the bar is colored so that the segment of the bar that appears beneath the word "cricket" is blue, the segment of the bar beneath the word "bat" is green, and the segment of the bar beneath the word "Canada" is red.
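As an illustration only, the association of colors with keywords might be sketched in Python as follows. The palette, its ordering, the reuse of colors when keywords outnumber palette entries, and the name `build_color_code` are assumptions introduced here; the description does not prescribe how colors are chosen.

```python
from itertools import cycle

# Assumed palette; the first three entries match the example in the text.
PALETTE = ["blue", "green", "red", "orange", "purple"]

def build_color_code(keywords, palette=PALETTE):
    """Associate each distinct keyword with a color, in order of entry.

    Colors repeat if there are more keywords than palette entries
    (an assumption of this sketch).
    """
    color_code = {}
    colors = cycle(palette)
    for keyword in keywords:
        if keyword not in color_code:
            color_code[keyword] = next(colors)
    return color_code

color_code = build_color_code(["cricket", "bat", "Canada"])

# A color code map could then be rendered as one colored segment per keyword,
# in the same left-to-right order as the keywords on the input line.
color_code_map = [(keyword, color) for keyword, color in color_code.items()]
```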
In this example, the correlation indicators might also be horizontal bars. For a URL that identified a web page that included the keyword "cricket" but neither "bat" nor "Canada," perhaps a web page on insects, the entire correlation indicator bar could be colored blue. The all-blue bar would alert the searcher that the keyword "cricket" was found but not the other keywords. For a web page that included the words "cricket" and "bat" but not the keyword "Canada," perhaps a web page on sports in England, the correlation indicator bar could be colored in part blue and in part green, but without the appearance of the color red. The part-blue-part-green-absent-red bar would alert the searcher that the keywords "cricket" and "bat" were found, but not the keyword "Canada." For a web page that included all three keywords, perhaps a web page on sports in Canada, the correlation indicator bar could be colored in part blue, in part green, and in part red. For a web page that included the keyword "Canada" but neither "cricket" nor "bat," the bar might be colored entirely red, and so forth.
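A presence-or-absence correlation indicator of this kind might be sketched in Python as below. The bar is modeled as a list of (color, fraction-of-width) segments. Splitting the bar equally among the keywords that are present is an assumption of this sketch, since the description says only that the bar is colored "in part" with each color; the function name `presence_bar` is likewise hypothetical.

```python
def presence_bar(color_code, present_keywords):
    """Build bar segments for the keywords found in a source.

    Returns a list of (color, fraction) pairs in color-code order.
    Each present keyword gets an equal share of the bar (assumed).
    """
    present = [k for k in color_code if k in present_keywords]
    if not present:
        return []  # no keyword found: nothing to color
    share = 1.0 / len(present)
    return [(color_code[k], share) for k in present]

# Color code from the running example.
example_code = {"cricket": "blue", "bat": "green", "Canada": "red"}

insect_page = presence_bar(example_code, {"cricket"})          # all blue
england_page = presence_bar(example_code, {"cricket", "bat"})  # blue and green, no red
canada_only = presence_bar(example_code, {"Canada"})           # all red
```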
In other embodiments of the invention, the correlation indicator may show the frequency of occurrence or the relative frequency of occurrence of each keyword in the web page identified by the URL, rather than show just the presence or absence of the keyword as described above. For example, if a web page contained the keyword "cricket" eight times and the keyword "bat" two times, eighty percent of the visual area of the correlation indicator could be blue and twenty percent green. In another embodiment, a visual area may be reserved in the correlation indicator for each of the keywords, where each of these areas is colored according to the color code in proportion to the relative frequency of occurrence of the associated keyword (or left uncolored for keywords absent from the source). In the running example here, eighty percent of the visual area in the correlation indicator associated with the keyword "cricket" might be colored blue (and the remaining twenty percent left uncolored or colored according to a background color not associated with any of the keywords), and twenty percent of the visual area associated with the keyword "bat" might be colored green (and the remaining eighty percent left uncolored or colored according to the background). Thus the searcher may readily grasp the relevance of each source to the search by glancing at the colors of the correlation indicators, and no longer needs to rely on the positions of the sources in the list returned by the search engine.
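The two frequency-based embodiments can be sketched in Python under stated assumptions: occurrence data arrive as a dictionary of per-keyword counts, and relative frequency is a keyword's count divided by the total count of all keywords in that source. The function names `frequency_bar` and `reserved_areas` are hypothetical.

```python
def relative_frequencies(color_code, counts):
    """Return each keyword's share of all keyword occurrences in a source."""
    total = sum(counts.get(k, 0) for k in color_code)
    if total == 0:
        return {k: 0.0 for k in color_code}
    return {k: counts.get(k, 0) / total for k in color_code}

def frequency_bar(color_code, counts):
    """First embodiment: one bar divided among the keywords that occur."""
    shares = relative_frequencies(color_code, counts)
    return [(color_code[k], share) for k, share in shares.items() if share > 0]

def reserved_areas(color_code, counts):
    """Second embodiment: one reserved area per keyword.

    Each area is filled with the keyword's color up to its relative
    frequency; the rest of the area is left uncolored.
    """
    shares = relative_frequencies(color_code, counts)
    return {k: (color_code[k], share) for k, share in shares.items()}

example_code = {"cricket": "blue", "bat": "green", "Canada": "red"}
counts = {"cricket": 8, "bat": 2}  # "Canada" is absent from this source

bar = frequency_bar(example_code, counts)
areas = reserved_areas(example_code, counts)
```

With the counts from the running example, `bar` is eighty percent blue and twenty percent green, and `areas` leaves the area reserved for "Canada" entirely uncolored.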
Although the present invention is described here in the general context of an Internet search as a matter of convenience, the Internet is not a necessary condition of the invention. Rather, the invention applies to all kinds of search environments, including, for example, searches conducted locally by a workstation or terminal that has a built-in CD ROM database, and searches wherein the search engine is local to the searcher but the database is remote. These and other aspects of the present invention will be more fully appreciated when considered in the light of the following detailed description and drawings.