Classification of documents with respect to a document class is a common task in data processing, with broad application in business intelligence, web searching, and the like. For instance, documents may be classified in order to find documents on the Internet in response to a search query placed through a search engine. One conventional way of classifying documents has been through use of simple Boolean logic, where each document is determined either to be in a class of documents or not according to a Boolean expression. However, use of Boolean logic often requires a relatively complicated command to retrieve the desired documents, and often does not provide the desired specificity in the search logic.
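By way of illustration only, Boolean classification of the kind described above may be sketched as follows. The rule, the function names `tokenize` and `in_class`, and the sample terms are hypothetical and are not taken from any particular search engine; the sketch assumes a document is reduced to its set of lowercase words.

```python
import re


def tokenize(text):
    """Lowercase the text and return the set of alphanumeric words in it."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def in_class(text):
    """Hypothetical Boolean rule: jaguar AND (car OR vehicle) AND NOT animal.

    A document is either in the class (True) or not (False); there is no
    notion of degree of membership.
    """
    words = tokenize(text)
    return (
        "jaguar" in words
        and ("car" in words or "vehicle" in words)
        and "animal" not in words
    )
```

Even this small rule needs three operators to separate one sense of a word from another, which suggests how quickly such commands may grow as more specificity is required.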
Other examples of conventional ways of classifying documents include use of: a naive Bayes classifier, which is a simple probabilistic classifier based on applying Bayes' theorem with strong independence assumptions; a decision tree, which is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility; and support vector machines (SVMs), which are a set of related supervised learning methods used for classification and regression.
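By way of illustration only, a naive Bayes classifier of the kind mentioned above may be sketched as follows. The sketch assumes a multinomial word model with add-one (Laplace) smoothing and whitespace tokenization; the function names `train` and `classify` and the training data are hypothetical. The independence assumption appears as the sum of per-word log probabilities, each word being treated as independent of the others given the class.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def classify(text, model):
    """Return the class with the highest posterior log score for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior: fraction of training documents in this class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                # Add-one smoothing avoids taking the log of zero.
                likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
                score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data, for illustration only.
model = train([
    ("cheap loans apply now", "spam"),
    ("win cash prize now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project status report attached", "ham"),
])
```

With this hypothetical data, `classify("win cheap cash", model)` selects the class whose training documents contain those words. Unlike a Boolean rule, such a classifier requires labeled training documents before it can be used.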
However, while the above-described conventional ways of classifying documents have sometimes proven to be successful, there is still a need for a classification method that is simpler than conventional classification methods while achieving comparable results.