Demand for text analytics on large document sets has grown with the increasing availability of big data. Documents in such sets typically include facets, such as category labels, characteristic words, and tag information, that characterize each document's contents. Text analytics usually involves counting these facets in real time for many different combinations of documents in the set.
To perform facet counting, a computer system identifies the documents that match a query and then determines the number of facets, typically on a per-facet-category basis, that are included in the set of matching documents. The computer system then provides the results in an order based on the number of facets found in each facet category. Depending on the size of the document set, the computer system may use a significant amount of resources to comb through each document in an effort to achieve a high degree of facet counting accuracy.
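The exhaustive approach described above can be illustrated with a minimal Python sketch. It is not taken from any particular system: the toy corpus `DOCS`, the keyword-based `matches` predicate, and the `count_facets` helper are all hypothetical names introduced for illustration, and the sketch assumes that each document stores its facets as a mapping from facet category to a list of values. It also assumes that results are ordered by count within each facet category, which is one reading of the ordering described above.

```python
from collections import Counter

# Hypothetical toy corpus: each document has free text plus facets
# grouped by facet category (an assumed layout, for illustration only).
DOCS = [
    {"text": "engine overheats on highway",
     "facets": {"category": ["engine"], "tag": ["heat", "highway"]}},
    {"text": "brake noise at low speed",
     "facets": {"category": ["brake"], "tag": ["noise"]}},
    {"text": "engine noise when cold",
     "facets": {"category": ["engine"], "tag": ["noise", "cold"]}},
]


def matches(doc, query):
    """Assumed query model: the query is a single keyword in the text."""
    return query.lower() in doc["text"].lower().split()


def count_facets(docs, query):
    """Visit every document, keep the matches, and count facets per category."""
    counts = {}
    for doc in docs:
        if not matches(doc, query):
            continue
        for category, values in doc["facets"].items():
            # set() ensures a facet is counted at most once per document.
            counts.setdefault(category, Counter()).update(set(values))
    # Within each category, order facet values from most to least frequent.
    return {category: counter.most_common()
            for category, counter in counts.items()}


engine_result = count_facets(DOCS, "engine")
noise_result = count_facets(DOCS, "noise")
print(engine_result["category"])
print(noise_result["tag"])
```

Because every document is examined for every query, the cost of this exact count grows with the size of the document set, which is the resource concern noted above.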