A variety of techniques exist for searching a data set, given a query. One technique is to locate identical copies of the query in the data set. Unfortunately, if only identical matches are located and returned, the user will not be presented with similar but non-identical results that the user might have wanted. For example, if a user provides ten particular words as input and only identical matches are returned, documents in which nine of the ten words are present will not be returned, even though the user would likely consider such results of interest. It is also possible that all ten words are present but in a different order from that of the query, or with other words intermixed. In such cases the user will likewise not be presented with these matches, even though they may be relevant.
Another technique is to search the data set for keywords in the query. One problem with keyword searches, however, is that if the query includes common words, an overwhelming number of results may be returned to the user.
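The two limitations described above can be illustrated with a minimal sketch. The function names `exact_match` and `keyword_match`, the sample documents, and the whitespace-based word splitting are all assumptions made for illustration; they are not taken from any particular search system.

```python
def exact_match(query, documents):
    """Return indices of documents containing the query as a contiguous word sequence."""
    q = query.lower().split()
    hits = []
    for i, doc in enumerate(documents):
        words = doc.lower().split()
        # Slide a window of len(q) words across the document.
        if any(words[j:j + len(q)] == q for j in range(len(words) - len(q) + 1)):
            hits.append(i)
    return hits


def keyword_match(query, documents):
    """Return indices of documents sharing at least one word with the query."""
    q = set(query.lower().split())
    return [i for i, doc in enumerate(documents) if q & set(doc.lower().split())]


# Hypothetical sample data.
documents = [
    "the quick brown fox jumps over the lazy dog",  # identical to the query
    "the quick brown fox leaps over the lazy dog",  # one word differs
    "over the lazy dog the quick brown fox jumps",  # same words, different order
    "the stock market closed higher today",         # unrelated, shares only "the"
]
query = "the quick brown fox jumps over the lazy dog"

print(exact_match(query, documents))    # [0]
print(keyword_match(query, documents))  # [0, 1, 2, 3]
```

Under these assumptions, exact matching misses the near-identical and reordered documents, while keyword matching returns the unrelated document as well because it shares the common word "the" with the query.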