It is now a commonplace observation that ordinary searchers will often run searches that yield intractably large result sets. Search engines have tried to resolve that problem by ranking the results, and in some cases by trying to limit the size of the data set upon which the search is run.
The exact ranking algorithms are almost always kept as trade secrets, but it suffices to say that no given ranking algorithm will be right for all searchers. One person searching the Internet for “poodle” might be looking to buy a dog, and another might be looking for obscure articles on the tendency of miniature poodles to have bad teeth. Thus, all ranking systems are necessarily inadequate, regardless of whether they rank by popularity of web pages, the length of time that prior searchers viewed a web page, the number of times a search term occurs in the text, the amount of text on the page, or by any other system.
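To make the point concrete, here is a toy ranker that orders documents purely by raw term frequency, one of the signals mentioned above. This is an illustrative sketch only; real engines combine many signals, and their exact algorithms are, as noted, trade secrets.

```python
def rank_by_term_frequency(documents, query):
    """Order documents by how often the query terms appear in each.

    Illustrative only: real ranking also weighs popularity, dwell
    time, link structure, and other proprietary signals.
    """
    terms = query.lower().split()

    def score(doc):
        words = doc.lower().split()
        return sum(words.count(t) for t in terms)

    return sorted(documents, key=score, reverse=True)

docs = [
    "poodle grooming tips for poodle owners",
    "buy a miniature poodle",
    "dental problems in dogs",
]
# The dog-buyer and the dental researcher get the same ordering,
# which is precisely why no single ranking suits all searchers.
print(rank_by_term_frequency(docs, "poodle"))
```

Note that the ranking is identical regardless of the searcher's intent, which is the inadequacy the paragraph above describes.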
For similar reasons, efforts to limit the size of the data set upon which the search is run are helpful in some circumstances, but are by no means a panacea. Google™, for example, allows searchers to run queries against record sets in various groups, such as Arts and Entertainment, Business and Finance, Computers, Health, Home, News, and so forth. But such groups are often underinclusive, overinclusive, or both, and in any event are useful mostly to searchers having only the most rudimentary searching skills, or simplistic searching needs.
A significant problem that has not adequately been addressed is the laser beam nature of a search request. A search for records having two or three keywords will identify precisely those records having those keywords, and nothing else. Yes, some systems are sophisticated enough to expand the search to include non-standard plurals (e.g., search for “women” when the searcher entered “woman”) and even related terms (female, girl, etc.). But then the searches are still performed on those expanded terms. The whole process is a bit like someone looking around in a darkened room with a laser pointer. What the searcher really needs is a search beam that provides perspective on what surrounds the center of the beam.
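The query expansion described above can be sketched as a lookup from a term to its plurals and related terms, run before the search itself. The expansion table here is hypothetical and purely for illustration; real systems derive such mappings from dictionaries or usage statistics.

```python
# Hypothetical expansion table, for illustration only.
EXPANSIONS = {
    "woman": ["women", "female", "girl"],
    "child": ["children", "kid", "minor"],
}

def expand_query(terms):
    """Return the original terms plus any known plurals/related terms."""
    expanded = []
    for term in terms:
        expanded.append(term)
        expanded.extend(EXPANSIONS.get(term, []))
    return expanded

# The search is then simply re-run on the expanded terms -- a wider
# laser beam, but still a laser beam with no surrounding context.
print(expand_query(["woman", "poodle"]))
```

Even with expansion, the match remains exact against the enlarged term list, which is why expansion alone does not supply the perspective the passage above calls for.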
All of the major search engines do show snippets of text surrounding the query terms, which provides users some guidance on how they might narrow their searches. But in order to adequately use that contextual information, users are forced to search through page after page of text extractions to identify additional terms that might be of interest. That is just a colossal waste of time.
Ask Jeeves™ has long sought to resolve that problem by suggesting additional terms with which to narrow a search. For example, in response to the term “insurance”, Ask Jeeves™ identifies over a hundred million hits, but then also suggests forty-seven subsets that result in narrower searches. Suggested subsets to the “insurance” search include Car Insurance, Health Insurance, Insurance Companies, Homeowners Insurance, Travel Insurance, and so forth. If one then selects “Car Insurance”, Ask Jeeves™ suggests forty further subsets, including for example, Car Insurance Quotes, Car Insurance for Woman, AA Car Insurance, Motor Insurance, and Budget Car Insurance.
In some instances suggesting additional subsets may well prove helpful. But as the target record set against which the search is run grows ever larger, even the subsets become intractable. Selecting “Car Insurance Quotes” gives almost six million hits. Drilling down further, one could select “Car Insurance for Woman”, but that selection still gives more than two million hits. In addition, it is impossible for search engines to store subset suggestions for all possible searches. For example, in response to the search “poodle telephone”, Ask Jeeves™ identifies over 250,000 web pages, but doesn't make a single suggestion as to narrowing the search.
The underlying problem is that users have no way of gaining a broad understanding of the context in which the search terms are used throughout the entirety (or even significant portions) of the target record set. The most any searcher will likely do is review 100 or so text extractions, and that just isn't enough of a search beam to identify all or even most of the nearby terms that might be of interest, or to gain an understanding of how often or in what proximity other terms might be to the original search terms. And without that information the searcher is forced to view the database with tunnel vision, trying out perhaps several dozen different combinations in the hope of hitting upon a combination of search terms that is neither terribly over-inclusive nor terribly under-inclusive.
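The kind of summary context information contemplated here can be sketched as follows: scan the result snippets, tally the terms that co-occur within a few words of the query term, and present the most frequent neighbors, so the searcher need not read page after page of extractions. The snippets and window size below are invented for illustration.

```python
from collections import Counter

def nearby_terms(snippets, query_term, window=3, top=5):
    """Count terms within `window` words of the query term across snippets."""
    counts = Counter()
    for snippet in snippets:
        words = snippet.lower().split()
        for i, w in enumerate(words):
            if w == query_term:
                lo, hi = max(0, i - window), i + window + 1
                # Tally the neighbors on either side of the match.
                for neighbor in words[lo:i] + words[i + 1:hi]:
                    counts[neighbor] += 1
    return counts.most_common(top)

# Invented example snippets, standing in for text extractions.
snippets = [
    "miniature poodle puppies for sale near you",
    "grooming your miniature poodle at home",
    "miniature poodle dental problems and bad teeth",
]
print(nearby_terms(snippets, "poodle"))
```

A summary like this (e.g., "miniature" dominating the neighborhood of "poodle") gives the searcher the wide-beam perspective in one glance, instead of forcing a scan of every extraction by hand.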
Thus, what is still needed are systems and methods that provide summary context information for searches.