The volume of textual data has grown with the prevalence of internet use. This data takes the form of discussion forums, customer reviews, social media feeds, contact center records, support tickets, conversations in collaboration solutions, event logs, and so on. In some cases, a single subject can accumulate several thousand data points. For example, it is common to see dozens, hundreds, or even thousands of online reviews of a product. Similarly, there may be dozens of discussions for a single support ticket.
This increasing volume makes it difficult to make sense of textual data along different dimensions simply by reading or observing it. It is difficult to extract from a textual data stream the information that is particularly relevant to the features and dimensions of interest to an observer. For example, from just a stream of textual reviews and ratings of a camera, it is difficult to identify how the reviews relate to travelers, experienced photographers, or camera size. Similarly, within an enterprise collaboration tool, it is difficult to identify the key items discussed in a discussion thread.