The present invention relates generally to a method for comparing a scheme for ranking the similarity of a set of objects to a standard object against a standard ranking scheme. One specific application is taught whereby a document image ranking scheme can be compared with a given standard ranking scheme.
There are many applications where document images are processed to determine their similarity with other document images, or with models of document image types. These include classifying groups of similar document images. Document type classification is used in a variety of applications including database management and document routing through a computer network. Furthermore, by identifying the class to which a document image belongs, one can expect certain information contained in the document to appear in selected regions of the document. Thus, once a document type is identified, data may be extracted relatively easily from the document by focusing on those regions.
In response to this demand, various schemes have been developed for ranking document images by their degree of closeness to a set standard document image. Furthermore, through experience with a particular scheme, those skilled in the art become familiar with what they can expect from that scheme in terms of reliability and performance. However, there is no objective method for measuring the performance of one ranking scheme relative to another for a particular application.
Ranking schemes are likewise used in a variety of applications beyond document images. Indeed, a ranking scheme can be used whenever it is desired to order multiple objects by their relative similarity to a standard object. An object herein refers to any item that can be compared with a given standard, including, but not limited to, a document, text file, or image file. However, there is no known objective method for comparing the various schemes for ranking the relative similarity of the various objects to the standard.
In accordance with the present invention, an object ranking scheme is compared with a known standard, or ideal, ranking scheme to assess the performance and reliability of the ranking scheme being tested relative to the ideal ranking scheme. The same set of objects is processed and ranked by both ranking schemes. A higher-ranked object is relatively more similar to the standard object than a lower-ranked object.
The ranking scheme being tested is examined for all objects of interest, each of which is referred to herein as a subject object. For each subject object, the test ranking scheme is charged for every object that it ranked higher than the subject object but that the ideal ranking scheme ranked lower than the subject object. Each such object is referred to herein as a swapped object. In one specific embodiment of the present invention the objects are document images.
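By way of illustration only, the charging of swapped objects described above may be sketched in Python as follows. The function name `swap_charges`, the representation of each ranking as a list ordered from most similar to least similar, and the charge of one unit per swapped object are all assumptions made for this sketch; the description above does not fix how charges are weighted or aggregated.

```python
def swap_charges(test_ranking, ideal_ranking, subjects=None):
    """Count swapped objects for each subject object.

    Both rankings are lists of the same objects, ordered from highest
    ranked (most similar to the standard object) to lowest ranked.
    Assumption: each swapped object costs one unit of charge.
    """
    test_pos = {obj: i for i, obj in enumerate(test_ranking)}
    ideal_pos = {obj: i for i, obj in enumerate(ideal_ranking)}
    if set(test_pos) != set(ideal_pos):
        raise ValueError("both schemes must rank the same set of objects")

    # By default, every ranked object is treated as a subject object.
    if subjects is None:
        subjects = list(ideal_ranking)

    charges = {}
    for subject in subjects:
        # A swapped object is ranked higher than the subject by the test
        # scheme (smaller index) but lower by the ideal scheme (larger index).
        charges[subject] = sum(
            1
            for other in test_ranking
            if other != subject
            and test_pos[other] < test_pos[subject]
            and ideal_pos[other] > ideal_pos[subject]
        )
    return charges


# Hypothetical example with four objects labeled "a" through "d".
ideal = ["a", "b", "c", "d"]
test = ["b", "a", "d", "c"]
charges = swap_charges(test, ideal)
total_charge = sum(charges.values())
```

In this hypothetical example the test scheme is charged once for subject object "a" (swapped object "b") and once for subject object "c" (swapped object "d"). A test scheme that reproduces the ideal ranking exactly incurs no charge.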