In some query systems, an upper limit is placed on the number of results that may be provided in response to a query. A limit may also be placed on the number of results that are retrieved at intermediate levels or stages of processing a search. For example, at a first stage, a search query might be processed until there are 500, 10,000, 50,000, 200,000, or some other maximum number of results. Once the results reach the maximum cutoff number, searching stops. This allows the search to be completed in less time and reduces the demands on the search system. Since a user will rarely wish to see even 200 results, 50,000 may be a safe maximum. At a second stage of a search, the results may be reduced again to, for example, the top 10,000.
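The staged limits described above can be illustrated with a minimal sketch. It assumes that matches arrive as a stream of (document id, score) pairs; the names collect_with_cutoff, keep_top, and staged_search are hypothetical and are not taken from any particular search system.

```python
from itertools import islice
from typing import Iterable, List, Tuple

# Hypothetical result record: (document id, relevance score).
Doc = Tuple[int, float]


def collect_with_cutoff(matches: Iterable[Doc], max_collected: int) -> List[Doc]:
    """First stage: stop scanning once max_collected matches are gathered."""
    return list(islice(matches, max_collected))


def keep_top(results: List[Doc], top_n: int) -> List[Doc]:
    """Second stage: keep only the top_n highest-scoring results."""
    return sorted(results, key=lambda d: d[1], reverse=True)[:top_n]


def staged_search(matches: Iterable[Doc],
                  max_collected: int = 50_000,
                  top_n: int = 10_000) -> List[Doc]:
    """Apply the first-stage cutoff, then the second-stage reduction."""
    return keep_top(collect_with_cutoff(matches, max_collected), top_n)
```

Because the first stage stops as soon as the cutoff is reached, any match that would have been found later in the scan is never scored, however relevant it may be.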
A search index is one type of search data structure used for servicing queries for a given organization or database. For large organizations or large databases, a search index may be broken up into partitions or shards. As more documents are added to a search index, it becomes more likely that queries against that index will reach a 50,000-result, 10,000-result, or other reasonable collection limit. The collection limits are intended to cause the search results to be truncated. However, in some cases the truncation will lead to less relevant end-user results. Placing any upper limit on the number of results can prevent the search from retrieving the results that the user is looking for.
A problem arises if the search is limited at one stage and then further filtered or post-filtered at another stage. The search may return no results at all, because every collected result is later filtered out. In other cases, the final results may contain only a few hits, or only low-ranking or poor hits.
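The problem can be reproduced with a small sketch. The index contents, the owner field, and the function search_then_filter are all hypothetical and chosen only for illustration; the sketch assumes the collection limit is applied before the post-filter.

```python
from itertools import islice


def search_then_filter(index, matches_query, post_filter, limit):
    """Collect at most `limit` query matches, then apply the post-filter."""
    collected = list(islice((d for d in index if matches_query(d)), limit))
    return [d for d in collected if post_filter(d)]


# Hypothetical index: 1000 documents that all match the query,
# of which only the last 5 pass the post-filter.
index = [{"id": i, "owner": "alice" if i >= 995 else "bob"} for i in range(1000)]


def matches_query(doc):
    return True  # every document matches in this illustration


def only_alice(doc):
    return doc["owner"] == "alice"


# With a limit of 100, collection stops before any passing document is seen.
truncated = search_then_filter(index, matches_query, only_alice, limit=100)

# With no effective limit, the 5 passing documents are found.
unlimited = search_then_filter(index, matches_query, only_alice, limit=len(index))
```

Here truncated is empty while unlimited holds five documents: the limit is reached using documents that the post-filter later discards, so the documents the user wants are never collected.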