Automatic processing and analysis of electronic text documents have a wide variety of practical applications, including document classification, clustering, indexing, spam filtering, and the like. Most textual analysis includes a text feature extraction process, which determines the words or terms that occur in an electronic document. For example, a full-text indexing application may perform text feature extraction on large volumes of files or web pages. As another example, an Information Lifecycle Management (ILM) application may periodically apply classifiers to very large document repositories for content management, such as the automatic application of file retention, archiving, and security policies. The text feature extraction process often consumes substantial processing resources, particularly in large-scale systems.