Conventional search engines are capable of searching extremely large collections of information, such as the world wide web or very large databases. As the size of data collections to be searched grows, it is no longer enough to correctly return query results that match query terms entered by a user. Instead, it is desirable to provide a mechanism to help the user sort through the large amount of data returned from a search.
Several conventional search engines currently use various methods to organize the data returned in a query result. The goal of such an organization method is to decide which query results will most interest the user. Conventional search engines generally use a variety of techniques to prioritize the results of a search, but these techniques are not ideal because they must make assumptions about the type of information for which the user is searching. For example, if the user enters “jobs,” he might be searching for job postings, information about Steve Jobs, job statistics for a particular country, or any number of other items. Thus, when using a conventional search engine, a user would not enter just “jobs” as a query term. He would probably also enter additional query terms that narrow the search. Unfortunately, in doing so he may miss relevant listings that do not contain the narrowing terms.
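The trade-off described above can be illustrated with a minimal sketch. It assumes a hypothetical three-document collection and a simple matcher that requires every query term to appear in a document; the document names, their text, and the `matches` and `search` helpers are invented for illustration and do not represent how any particular search engine works.

```python
# Hypothetical collection: document ids and text are invented for illustration.
DOCS = {
    "posting_a": "engineering jobs open in Austin hiring now",
    "posting_b": "jobs fair downtown this weekend",
    "biography": "Steve Jobs co-founded Apple",
}


def matches(query, text):
    """Return True if every query term appears as a whole word in the text."""
    words = set(text.lower().split())
    return all(term.lower() in words for term in query.split())


def search(query):
    """Return the sorted ids of all documents that match every query term."""
    return sorted(doc_id for doc_id, text in DOCS.items() if matches(query, text))


# The bare term is ambiguous: it matches job listings and the biography alike.
broad = search("jobs")

# Adding a narrowing term removes the biography, but it also drops
# posting_b, a relevant listing that happens not to contain "hiring".
narrow = search("jobs hiring")

print(broad)
print(narrow)
```

In this sketch the single term returns all three documents, while the narrowed query returns only one of the two job listings.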
Currently, it is difficult to search over different types of data that may or may not be stored on the world wide web. Conventional search engines usually operate on data from only a few sources. For example, web-based search engines traditionally allow a user to search pages on the world wide web. Web-based search engines often have a “back-end” that indexes the collection of information in order to make it searchable: they periodically crawl the world wide web and create indices of the pages and sites crawled. Other search engines allow a user to search existing databases. Such search engines rely on a predetermined organization of the database. For example, if a database has known fields and attributes, the user can search within those attributes. Similarly, XML databases only accept well-formed XML inputs. If the data to be searched is not so organized, XML databases are generally not able to accept the data or organize the data for search.
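As a rough illustration of the two constraints just described, the following sketch builds a small inverted index of the kind a search back-end might create after crawling, and checks whether an input is well-formed XML before it could be accepted. The page ids, page text, and the function names `build_index`, `lookup`, and `is_well_formed` are assumptions made for this example; only `xml.etree.ElementTree` and `collections.defaultdict` come from the Python standard library.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical crawled pages: ids and text are invented for illustration.
PAGES = {
    "p1": "job statistics by country",
    "p2": "job postings near you",
}


def build_index(pages):
    """Map each lowercase word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index


def lookup(index, query):
    """Return the ids of pages that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML, else False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


index = build_index(PAGES)
print(lookup(index, "job"))
print(lookup(index, "job postings"))
print(is_well_formed("<doc><title>Jobs</title></doc>"))
print(is_well_formed("plain notes with no markup"))
```

The index answers queries only over the pages that were crawled, and the well-formedness check rejects free text outright, which mirrors the limitation that data lacking the expected organization cannot be accepted or searched.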
Other search engines allow a user to search databases or text documents having a flat organization. Such search engines must know about the organization of the database and the organization of the documents within it. Because data are stored in a variety of locations and formats, users must often search multiple locations and multiple databases to find the information that they need.
It would be desirable for a collection of documents to be searchable via a web-based search engine and thus easily accessible to most people while, at the same time, containing a variety of types of documents and formats of data. Moreover, it would be desirable if the searchable collections of documents were organized in ways that could help users fine-tune their searches.