When a user submits a query to a search engine, the user often receives multiple documents (or “hits”) that satisfy or partially satisfy the query. For example, a user who queries the Internet for “Abraham Lincoln” will often get a large number of documents with a great deal of content overlap. This content overlap is referred to herein as information redundancy. Results unrelated to the requested information may also be returned; for example, some pages might have nothing to do with the president. There may be a passage “. . . my cat, whose name is Abraham Lincoln . . . ,” a theme park called “. . . the Abraham Lincoln Theme Park,” a website selling “Abraham Lincoln memorabilia,” and so forth. Generally, information redundancy is annoying and time consuming to the user, since the user may need to read the same information multiple times in multiple different documents. Once the user finds a particular piece of information in one document, the user should not have to waste time reviewing the same information in many other documents.
What is needed to overcome these shortcomings in the prior art is a more efficient means of providing search results to a user.