A topic model, which is a probabilistic model for unlabeled data, may be used for the automatic and unsupervised discovery of topics in unlabeled data, such as a set of textual documents. Such a topic model is designed with the underlying assumption that words belong to sets of topics, where a topic is a set of words. For example, given a set of scientific papers, a topic model can be used to discover words that occur together (and therefore can be clustered under the same topic). One topic could include words such as “neuroscience” and “synapse”, while another topic could include words such as “graviton” and “boson”.
Topic modeling has many applications in natural language processing. For example, topic modeling can be a key part of text analytics tasks such as Named Entity Recognition, Part-of-Speech Tagging, and information retrieval for search engines. The automatic and unsupervised discovery of topics in unlabeled data may be used to improve the performance of various kinds of classifiers (such as sentiment classifiers) and other natural language processing applications.
The unsupervised nature of topic modeling is both a blessing and a curse. It is a blessing because good labeled data is a scarce resource, so improving tools that depend on labeled data by extracting knowledge from the vast amounts of unlabeled data is very useful. It is a curse because the methods used to discover topics are generally computationally intensive, and topic modeling often needs to be applied to significant amounts of data, sometimes under time constraints.
Given the considerable computational power of the latest generations of highly parallel architectures, such as graphics processing units (GPUs), and their potential for even more computational power, it is tempting to use such architectures to perform topic modeling. Further, topic modeling can be performed even more quickly by a distributed system of computing devices with GPUs. Dividing the topic modeling into tasks for the nodes of a distributed system combines the computing power of the multiple nodes, which can speed up the topic modeling. However, splitting the topic modeling tasks among computing devices introduces the need for inter-device communication, which is very slow compared to a GPU's processing speed and which poses a significant hurdle to efficiently implementing a distributed topic modeling algorithm.
As such, it would be beneficial to implement a topic modeling algorithm that is highly data-parallel, and that effectively manages memory and communication bandwidth in order to efficiently perform parallelized topic modeling on a distributed system.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.