In document classification, a document is assigned to one or more categories based on its contents. A well-known application is spam filtering, which tries to distinguish spam e-mail from legitimate messages. Document classification can be supervised, where some external mechanism, such as human feedback, provides the correct classification for a set of documents, or unsupervised, where the classification is done without reference to external information. Document classification techniques include the naive Bayes classifier, latent semantic indexing, support vector machines, and approaches based on natural language processing.
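
As an illustration of the supervised case, the following is a minimal sketch of a multinomial naive Bayes spam filter using only the Python standard library. It is not a reference implementation: the function names (`tokenize`, `train`, `classify`), the labels `"spam"` and `"ham"`, and the four training messages are all hypothetical and chosen only to keep the example small. The sketch assumes whitespace tokenization and add-one (Laplace) smoothing, and it ignores words never seen during training.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is good enough for a toy example.
    return text.lower().split()


def train(labeled_docs):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def classify(text, model):
    """Return the class with the highest log posterior for the given text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior: fraction of training documents with this label.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # words unseen in training carry no evidence here
            # Add-one smoothing avoids zero probabilities for words
            # that appear in the vocabulary but not in this class.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: the human-supplied labels are the
# "external mechanism" that makes this a supervised task.
TRAINING = [
    ("win cash prize now", "spam"),
    ("cheap pills win money", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

model = train(TRAINING)
print(classify("win a cash prize", model))
print(classify("agenda for the monday meeting", model))
```

With this toy data the first message is labeled as spam and the second as legitimate, because each shares most of its words with one class. A practical filter would need far more training data and more careful tokenization.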