Aspects described herein relate to document classification, and more specifically, to determining an optimal number of dimensions for constructing a singular value decomposition (SVD) topic model.
Text mining extends the general notion of data mining to free or semi-structured text. Whereas data mining typically operates on structured data, text data analysis (also referred to as “text mining,” “topic modeling,” “text analytics,” or simply “text analysis”) refers to the analysis of text, and may involve such functions as text summarization, information visualization, document classification, document clustering, and document cross-referencing. Thus, text data analysis may help a knowledge worker find relationships between individual unstructured or semi-structured text documents, as well as semantic patterns across large collections of such documents.