Users of information systems frequently face information overload. They come upon a collection of content, be it a document repository, video collection, news feed, or set of search results, and are overwhelmed both by the sheer number of items to consider and by their lack of a priori knowledge of what the collection contains. Tools to search or filter the content are helpful only to the extent that the user has some idea of what to search for or which aspects to filter in or out.
In more structured domains, tools exist that permit faceted search, where users can, for example, filter a set of products along a variety of dimensions, such as manufacturer, price, power, and customer ratings. These tools become less valuable with collections of unstructured content, which lack predefined attributes to filter on.
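For illustration only, the kind of faceted filtering described above can be sketched as follows. The product records, field names, and the helper functions `faceted_filter` and `facet_counts` are hypothetical and are not taken from any particular system; the sketch assumes every item carries the same structured fields.

```python
from typing import Any, Callable, Dict, List

# Hypothetical product records; names and values are illustrative only.
PRODUCTS: List[Dict[str, Any]] = [
    {"name": "Drill A", "manufacturer": "Acme", "price": 79.0, "rating": 4.5},
    {"name": "Drill B", "manufacturer": "Bolt", "price": 129.0, "rating": 4.1},
    {"name": "Drill C", "manufacturer": "Acme", "price": 149.0, "rating": 3.8},
]


def faceted_filter(
    items: List[Dict[str, Any]],
    facets: Dict[str, Callable[[Any], bool]],
) -> List[Dict[str, Any]]:
    """Keep only the items that satisfy every facet predicate."""
    return [
        item
        for item in items
        if all(key in item and pred(item[key]) for key, pred in facets.items())
    ]


def facet_counts(items: List[Dict[str, Any]], key: str) -> Dict[Any, int]:
    """Count how many items fall under each value of one facet."""
    counts: Dict[Any, int] = {}
    for item in items:
        counts[item[key]] = counts.get(item[key], 0) + 1
    return counts


# Narrow the collection along two dimensions at once.
selected = faceted_filter(
    PRODUCTS,
    {
        "manufacturer": lambda m: m == "Acme",
        "price": lambda p: p <= 100.0,
    },
)
```

A sketch like this works only because each record exposes named fields such as `manufacturer` and `price`. A collection of free-text documents offers no such fields, which is why this style of filtering carries over poorly to unstructured content.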
The problem of exploring and navigating a document collection (or corpus) is a longstanding one. Some conventional systems provide a network graph of topics. Other conventional systems apply dimensionality reduction to documents based on their contents and plot the documents in a two-dimensional space. Yet other conventional systems provide a word cloud to represent the contents of a document or set of documents.