The amount of digitally available data is growing at an ever-increasing rate. Every time someone posts a tweet, sends an email, drafts a memo, updates a stock quote, or publishes a news article, they create a new digital document. Current approaches to organizing and analyzing this overwhelming body of documents range from the use of a conventional search engine to advanced applications that visualize the frequency of words and phrases across a set of documents.
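The word- and phrase-frequency analysis mentioned above can be sketched in a few lines. This is an illustrative example only, not a specific tool from the text; the function name and tokenization rule are assumptions.

```python
from collections import Counter
import re

def term_frequencies(documents, phrase_len=1):
    """Count how often each word (or n-word phrase) appears across a set of documents."""
    counts = Counter()
    for doc in documents:
        # Lowercase and split on non-letter characters (a simplistic tokenizer).
        words = re.findall(r"[a-z']+", doc.lower())
        # Slide a window of phrase_len words over the document.
        for i in range(len(words) - phrase_len + 1):
            counts[" ".join(words[i:i + phrase_len])] += 1
    return counts

docs = [
    "Stock prices rose sharply after the earnings report.",
    "The earnings report surprised analysts; prices rose again.",
]
print(term_frequencies(docs).most_common(3))
print(term_frequencies(docs, phrase_len=2).most_common(2))
```

Even this simple sketch shows why such analysis scales poorly for a researcher working by hand: the counting is mechanical, but interpreting which terms matter still requires reading the documents.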
In many cases, this kind of analysis requires a researcher to manually extract and collate information from the identified sources. Tools exist to help researchers identify subsets within their data, but they often demand advanced knowledge of programming or database query languages. As a result, researchers are frequently left to read through many individual documents to understand trends and sentiment groupings, which can be a very time-consuming process.
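To make concrete the kind of query-language knowledge such tools demand, the following sketch selects a document subset with SQL via Python's standard `sqlite3` module. The table schema, column names, and sample rows are hypothetical, chosen purely for illustration.

```python
import sqlite3

# Hypothetical document store; the schema and sample data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER, source TEXT, published TEXT, body TEXT)"
)
conn.executemany("INSERT INTO documents VALUES (?, ?, ?, ?)", [
    (1, "news", "2023-01-05", "Markets rallied on strong earnings."),
    (2, "tweet", "2023-01-06", "earnings looking good this quarter"),
    (3, "memo", "2022-12-01", "Quarterly planning notes."),
])

# Even this simple subset selection presumes the researcher knows SQL syntax,
# the table schema, and the date format used in the data.
rows = conn.execute(
    "SELECT id, source FROM documents "
    "WHERE body LIKE '%earnings%' AND published >= '2023-01-01'"
).fetchall()
print(rows)
```

The point is not the query itself but the prerequisite: before any subset can be retrieved, the researcher must already know both the query language and the structure of the data.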
Moreover, even with a significant investment of time, a researcher can be left with an incomplete picture of how documents interrelate. Conventional search engines can quickly retrieve documents on a researcher-defined topic, and data collation programs can aggregate representations once well-defined queries have been written, but both techniques generally require researchers to know exactly what they are looking for and to specify the relevant search and analysis parameters in advance.