Computerized text categorization systems attempt to provide useful topical information about texts with limited human input. The systems currently available use different methods to achieve this objective and consequently suffer from different drawbacks. A number of systems permit the assignment of multiple topics to a single text, but they typically require a large number of training texts that have been pre-labeled by a human with topic codes. Other text categorization systems use document clustering and latent semantic indexing to categorize text topically. These systems do not require pre-labeled training texts, but their usefulness is limited because only one topic can be assigned to any document. Still another text categorization system permits multiple topics to be assigned to each document and does not require pre-labeled training texts. This system, however, categorizes documents using a probabilistic model generated from a list of terms representative of pre-specified categories, so the categories and their terms must be provided in advance. Thus, there exists a need for a computerized text categorization system that requires neither advance hand-labeling of training texts nor pre-specified categories, and that is capable of assigning multiple topics to a single document.