The present invention relates to quality of document images, and is particularly directed to an apparatus and method of quantifying the quality of a gray scale document image which has been provided in an image-based document processing system such as an image-based check processing system.
In known image-based check processing systems in which a gray scale image of a bank check is obtained by scanning the check, a binarization program is usually applied to gray scale image data representative of the gray scale image of the check to provide binary image data which is representative of a binary image of the check. Since the computation costs associated with running a binarization program and with processing the resulting binary image data are relatively high, it would be desirable to establish the quality of the gray scale image of the check before the associated gray scale image data is binarized.
In accordance with one aspect of the present invention, a method of processing a document comprises the steps of (a) scanning a document to obtain gray scale image data associated with the document, (b) generating a two-dimensional histogram based upon the gray scale image data obtained in step (a), (c) applying a clustering algorithm to the two-dimensional histogram to determine a set of cluster center parameters associated with a first cluster of pixels and a set of cluster center parameters associated with a second cluster of pixels, (d) determining the standard Euclidean distance between the sets of cluster center parameters, and (e) normalizing the distance determined in step (d) to provide a value indicative of the quality of the document.
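Steps (b) through (e) can be illustrated with a minimal Python sketch. It is not the claimed implementation, and several details are assumptions made only for illustration: the two histogram axes are taken to be each pixel's gray level and the mean gray level of its 3x3 neighborhood, gray levels are taken to span 0 to 255, the clustering is a weighted two-cluster k-means seeded at the darkest and brightest occupied bins, and normalization divides by the largest possible distance between two centers. All function names are hypothetical.

```python
import math

def local_mean(img, r, c):
    """Integer mean gray level of the 3x3 neighborhood of (r, c), clipped at borders."""
    rows, cols = len(img), len(img[0])
    vals = [img[i][j]
            for i in range(max(0, r - 1), min(rows, r + 2))
            for j in range(max(0, c - 1), min(cols, c + 2))]
    return sum(vals) // len(vals)

def histogram_2d(img):
    """Step (b): count pixels by (gray level, local mean gray level). Axes are an assumption."""
    hist = {}
    for r in range(len(img)):
        for c in range(len(img[0])):
            key = (img[r][c], local_mean(img, r, c))
            hist[key] = hist.get(key, 0) + 1
    return hist

def kmeans_two_clusters(hist, iterations=20):
    """Step (c): two-cluster k-means over histogram bins, weighted by bin count."""
    # Assumed seeding: darkest and brightest occupied bins.
    lo = min(hist, key=lambda p: p[0] + p[1])
    hi = max(hist, key=lambda p: p[0] + p[1])
    centers = [(float(lo[0]), float(lo[1])), (float(hi[0]), float(hi[1]))]
    for _ in range(iterations):
        sums = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # weighted x, weighted y, weight
        for (x, y), w in hist.items():
            d = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centers]
            k = 0 if d[0] <= d[1] else 1
            sums[k][0] += w * x
            sums[k][1] += w * y
            sums[k][2] += w
        # An empty cluster keeps its previous center.
        new = [(s[0] / s[2], s[1] / s[2]) if s[2] else centers[k]
               for k, s in enumerate(sums)]
        if new == centers:
            break
        centers = new
    return centers

def quality_score(img, max_level=255):
    """Steps (d) and (e): Euclidean distance between centers, scaled into [0, 1]."""
    c0, c1 = kmeans_two_clusters(histogram_2d(img))
    dist = math.hypot(c0[0] - c1[0], c0[1] - c1[1])
    return dist / (max_level * math.sqrt(2))  # assumed normalization constant

if __name__ == "__main__":
    sharp = [[0, 0, 255, 255] for _ in range(4)]      # strong contrast
    faint = [[100, 100, 150, 150] for _ in range(4)]  # weak contrast
    print(quality_score(sharp), quality_score(faint))
```

Under these assumptions, a well-separated foreground and background yield a score near one, while a washed-out image yields a score near zero, so the score could be checked before the more expensive binarization is attempted.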
Preferably, the normalized result in step (e) is a value between zero and one, and the clustering algorithm includes a k-means clustering algorithm. One cluster of pixels is located above the other cluster of pixels. The upper cluster of pixels is representative of the background of the document and the other cluster of pixels is representative of the foreground of the document.
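One way a distance between two cluster centers could be confined to the range zero to one is shown below as a worked formula. The symbols are assumptions introduced for illustration: $(x_1, y_1)$ and $(x_2, y_2)$ denote the two sets of cluster center parameters, and $G$ denotes the maximum value on each histogram axis (for example 255 for 8-bit gray levels, with both axes starting at zero). The text does not state which normalization constant is used.

```latex
d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2},
\qquad
q = \frac{d}{\sqrt{2}\,G},
\qquad
0 \le q \le 1 .
```

Because both centers lie within the square $[0, G] \times [0, G]$, the distance $d$ cannot exceed the diagonal $\sqrt{2}\,G$, which is what bounds $q$ by one.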
In accordance with another aspect of the present invention, an apparatus is provided for quantifying the quality of a gray scale document image. The apparatus comprises means for scanning the document to obtain gray scale image data associated with the document, and means for generating a two-dimensional histogram based upon the gray scale image data. Means is provided for applying a clustering algorithm to the two-dimensional histogram to determine a set of cluster center parameters associated with a first cluster of pixels and a set of cluster center parameters associated with a second cluster of pixels. Means is provided for determining the standard Euclidean distance between the sets of cluster center parameters. Means is provided for normalizing the standard Euclidean distance to provide a value indicative of the quality of the document.
Preferably, the normalized distance is a value between zero and one, and the clustering algorithm includes a k-means clustering algorithm. One cluster of pixels is located above the other cluster of pixels. The upper cluster of pixels is representative of the background of the document and the other cluster of pixels is representative of the foreground of the document.