Classification of documents is a classical topic in Statistics and Computer Science, for which numerous methods exist. These methods range from the simple, such as the use of Boolean formulas, to the more sophisticated, such as k-nearest neighbor, support vector machines and neural networks. It is often desirable to classify documents in an environment in which the crucial features form a small number of clusters and the accuracy is inherently limited by noise in the data. Under such circumstances, the existing classification methods require complicated modeling and learning phases. What is needed are more intuitive, flexible, and efficient methods, computer readable media and computer systems for performing such document classification tasks.
Another scenario in which document classification can be of great use is searching for relevant documents. Boolean queries are commonly used by various search engines to obtain search results. Despite their great success, the expressive power of Boolean queries is limited in that the user can only specify keywords for which to search. Thus, an important limitation of Boolean querying is that it does not allow the user to specify a preference for and/or a context of the keywords in the search query. Therefore, the search results are not prioritized in any way that reflects the importance of the various keywords. It would be further desirable to have methods, computer readable media and computer systems for performing document relevance classification in the context of searching, such that the preference and/or context of the keywords could be taken into account.
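The limitation of Boolean querying described above can be illustrated with a minimal sketch. The function name `boolean_and_match`, the sample documents and the query are hypothetical and chosen only for illustration; they do not represent any particular search engine.

```python
def boolean_and_match(query_terms, document):
    """Return True if every query term appears as a word in the document."""
    words = set(document.lower().split())
    return all(term.lower() in words for term in query_terms)


# Hypothetical sample documents: one about the animal, one about the car.
documents = [
    "jaguar speed on the open savanna",
    "jaguar car top speed review",
]

# A Boolean AND query: keywords only, with no weights and no context.
query = ["jaguar", "speed"]

# Each document either matches or does not; no score is produced.
matches = [doc for doc in documents if boolean_and_match(query, doc)]

for doc in matches:
    print(doc)
```

Both sample documents satisfy the query, and the result is simply the set of matching documents. Nothing in the query lets the user indicate which keyword matters more or which sense of "jaguar" is intended, so the matches cannot be ordered by relevance.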