Text clustering is a technique for organizing and categorizing text, often used in text mining and natural language processing. It groups similar text documents together so that the documents can be analyzed further. Traditional text clustering methods, such as Latent Dirichlet Allocation (LDA), hierarchical text clustering, or K-means text clustering, usually suffer from certain drawbacks. First, they require users to specify in advance the number of groupings of similar text documents. However, users normally do not know how many categories or groupings exist in the data. Second, they typically rely on an inefficient and often inaccurate measure of document similarity: documents in a data set are compared against each other, and the similarity of each pair is determined by how many words the two documents have in common.

These traditional techniques are typically ineffective for short texts and can produce inaccurate results. For example, online comments, reviews, or survey responses often contain only a few sentences. It is quite common for different words to be used to express the same concepts or topics, which makes the results of traditional text clustering methods unreliable. For example, “the employees were friendly” expresses a similar concept or topic to “the staff was accommodating”; however, because the keywords in these sentences (“employees” and “friendly” in the first, “staff” and “accommodating” in the second) are different, traditional text clustering techniques would not treat the two sentences as similar and would be unlikely to group them into the same cluster.

Further, given the large volumes of short texts such as online comments and reviews, there is a growing need for automated analysis of such texts to provide reports, feedback, and the like to the end user. A more accurate text clustering technique is therefore needed.
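
The word-overlap limitation described above can be illustrated with a minimal sketch. It assumes Jaccard similarity over keyword sets as one possible overlap-based measure; the text does not name a specific measure. The stop-word list, the function names `keywords` and `jaccard`, and the third sentence `s3` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "was", "were", "is", "are"}


def keywords(text):
    """Lowercase the text, split it into alphabetic tokens, drop stop words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS}


def jaccard(a, b):
    """Overlap-based similarity: shared keywords divided by all keywords."""
    ka, kb = keywords(a), keywords(b)
    if not ka and not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


s1 = "the employees were friendly"
s2 = "the staff was accommodating"  # similar meaning, no shared keywords
s3 = "the employees were rude"      # different opinion, one shared keyword

print(jaccard(s1, s2))  # 0.0
print(jaccard(s1, s3))  # one shared keyword out of three, about 0.33
```

Under this assumed measure, the two sentences with similar meaning score zero, while a sentence expressing a different opinion scores higher simply because it reuses the word "employees".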