Data classification methods using machine learning techniques are described, for example, in published United States Patent Application 20080086433.
The following terms may be construed either in accordance with any definition thereof appearing in the prior art literature or in accordance with the specification, or as follows:
Richness: the proportion of relevant documents in the population of data elements which is to be classified. Here and elsewhere, the word “document” is used merely by way of example and the invention is equally applicable to any other type of item undergoing classification.
Precision: the number of relevant documents retrieved divided by the total number of documents retrieved. Precision is computed as follows:
\[ \text{Precision} = \frac{\left|\{\text{relevant documents}\} \cap \{\text{documents retrieved}\}\right|}{\left|\{\text{documents retrieved}\}\right|} \]
Recall: the number of relevant documents retrieved divided by the total number of existing relevant documents (which should ideally have been retrieved). Recall is computed as follows:
\[ \text{Recall} = \frac{\left|\{\text{relevant documents}\} \cap \{\text{documents retrieved}\}\right|}{\left|\{\text{relevant documents}\}\right|} \]
F-measure: the harmonic mean of precision and recall. The F-measure is an aggregated performance score for the individual precision and recall scores. The F-measure is computed as follows: F = 2·(precision·recall)/(precision+recall).
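By way of non-limiting illustration, the terms defined above may be computed as in the following minimal Python sketch. The function name `classification_metrics`, its parameters, and the sample document identifiers are hypothetical and do not appear in the specification. Returning 0.0 when a denominator is zero is likewise an assumed convention rather than part of the definitions.

```python
def classification_metrics(relevant, retrieved, population_size):
    """Compute richness, precision, recall and F-measure.

    relevant        -- identifiers of the documents that are actually relevant
    retrieved       -- identifiers of the documents the classifier retrieved
    population_size -- total number of documents being classified
    """
    relevant = set(relevant)
    retrieved = set(retrieved)

    # Size of {relevant documents} intersected with {documents retrieved}.
    hits = len(relevant & retrieved)

    # Assumed convention: any ratio with a zero denominator is reported as 0.0.
    richness = len(relevant) / population_size if population_size else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0

    # Harmonic mean of precision and recall.
    total = precision + recall
    f_measure = 2 * (precision * recall) / total if total else 0.0

    return {
        "richness": richness,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical population of 10 documents, 4 of them relevant, 3 retrieved.
metrics = classification_metrics(
    relevant={1, 2, 3, 4},
    retrieved={3, 4, 5},
    population_size=10,
)
print(metrics)
```

In this hypothetical example two of the three retrieved documents are relevant, so precision is 2/3, recall is 2/4, richness is 4/10, and the F-measure works out to 4/7.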
The disclosures of all publications and patent documents mentioned in the specification, and of the publications and patent documents cited therein directly or indirectly, are hereby incorporated by reference.