1. Field of the Invention
The present invention pertains to the field of image analysis, and more particularly, to bi-level image segmentation, analysis, and compression.
2. Description of Related Art
In the field of image analysis, image recognition requires segmentation and interpretation of connected component objects found within an image. For bi-level images, a connected component object is a group of pixels of a given binary level or value (e.g., 1 or black) completely surrounded by a group of pixels of the alternative binary level or value (e.g., 0 or white). Methods and apparatus for identifying connected component objects within an image, e.g., by performing line-by-line connected component analysis, are known in the art. One such example is disclosed in U.S. patent application Ser. No. 09/149,732, entitled “Segmenting and Recognizing Bi-level Images,” filed on Sep. 8, 1998 and assigned to the same assignee as the present invention. U.S. patent application Ser. No. 09/149,732 is hereby incorporated by reference in its entirety in the present application.
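By way of illustration only, line-by-line connected component analysis of the kind referred to above can be sketched as follows. This is a minimal sketch and not the method of the incorporated application: the function names `black_runs` and `connected_components`, the run tuple layout, and the use of 8-connectivity (diagonal contact counts as connected) are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (row, start_col, end_col), with end_col inclusive.
Run = Tuple[int, int, int]


def black_runs(row: List[int], y: int) -> List[Run]:
    """Return the maximal horizontal runs of 1-pixels in one scan line."""
    runs, x, width = [], 0, len(row)
    while x < width:
        if row[x] == 1:
            start = x
            while x < width and row[x] == 1:
                x += 1
            runs.append((y, start, x - 1))
        else:
            x += 1
    return runs


def connected_components(image: List[List[int]]) -> List[List[Run]]:
    """Group runs into components, one scan line at a time (8-connectivity assumed)."""
    parent: Dict[Run, Run] = {}

    def find(r: Run) -> Run:
        # Union-find root lookup with path halving.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    previous: List[Run] = []
    for y, row in enumerate(image):
        current = black_runs(row, y)
        for run in current:
            parent[run] = run
            for above in previous:
                # Runs on adjacent lines touch if their column ranges
                # overlap or meet diagonally.
                if run[1] <= above[2] + 1 and above[1] <= run[2] + 1:
                    parent[find(above)] = find(run)
        previous = current

    groups: Dict[Run, List[Run]] = {}
    for run in sorted(parent):
        groups.setdefault(find(run), []).append(run)
    return sorted(groups.values())
```

Each returned component is a list of pixel runs rather than individual pixels, which is the form that later run-based coding consumes.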
Following connected component segmentation (or identification), image recognition typically proceeds by extracting, for each connected component object, the set of features that a given classification method needs to uniquely recognize a targeted object. Once this object data has been extracted, well-known character recognition methods such as Bayesian, nearest neighbor, and/or neural network analysis may be used to classify each object by comparing object features with features obtained from a list of reference objects. When the features are similar enough, the unknown object is recognized and the known reference object can be substituted for the previously unknown object during further document manipulation.
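As a non-limiting illustration of the comparison step, a nearest neighbor classifier with a similarity threshold can be sketched as follows. The function name `classify`, the feature vectors, the Euclidean distance measure, and the `max_distance` threshold are all assumptions made for this example; the text above does not prescribe any particular feature set or distance.

```python
import math
from typing import Dict, Optional, Sequence


def classify(features: Sequence[float],
             references: Dict[str, Sequence[float]],
             max_distance: float) -> Optional[str]:
    """Return the label of the closest reference object, or None if none is close enough."""
    best_label, best_dist = None, math.inf
    for label, ref in references.items():
        dist = math.dist(features, ref)  # Euclidean distance (assumed metric)
        if dist < best_dist:
            best_label, best_dist = label, dist
    # "Similar enough" is modeled here as a simple distance threshold.
    return best_label if best_dist <= max_distance else None
```

An object whose features fall outside the threshold of every reference object remains unrecognized and is returned as `None`.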
Various character recognition methods using feature extraction have been developed. For example, an intuitive, easy-to-implement, and comprehensive method for feature abstraction is disclosed in U.S. patent application Ser. No. 09/661,865, entitled “Recognition in Clustering of Connected Components in Bi-level Images,” filed on Sep. 14, 2000 and assigned to the same assignee as the present invention. U.S. patent application Ser. No. 09/661,865 is also hereby incorporated by reference in its entirety in the present application.
Other work in the field of image analysis has been directed to the compression of image data, either before or after a certain level of recognition occurs. One such compression algorithm is disclosed in the Blue Book of the International Telegraph and Telephone Consultative Committee (CCITT), Volume VII, Fascicle VII.3, “Terminal Equipment and Protocols for Telematic Services,” Recommendations T.0-T.63 (p. 27, “Two-Dimensional Coding”) (adopted at the IXth ITU Plenary Assembly, Melbourne, Australia, Nov. 14-25, 1988). The relevant portions of this article are hereby incorporated by reference into the present application. The compression algorithm disclosed therein is known as “MODIFIED MODIFIED RELATIVE ELEMENT ADDRESS DESIGNATED CODE” (MMREAD or MMR). While MMR has been used to compress image data with a fair degree of effectiveness, its usage has not been optimized in all circumstances. Accordingly, MMR compression could be further modified to improve performance in a variety of circumstances; such improvements would include greater compression capability, reduced processing times, and more efficient memory utilization.
In one form, the invention comprises an apparatus (100) for coding one level of a bi-level image representing a connected component object (300). The apparatus can include an image segmenter (202) for identifying the connected component (300), a graph builder (203) for creating a graphic representation (400) of the connected component (300), a referencing module (205) for identifying reference nodes (310, 315, 338) of the graphic representation (400), a coding module (206) for successively coding pixel runs of the graphic representation (400), and a closing module (207) for marking the end of the compressed data.
In slightly greater detail, the graph builder (203) will be understood as creating a graphic representation (400) which includes a plurality of nodes and strokes (302, 303, 304, 305) such that the referencing module (205) can identify reference nodes (310, 315, 338) and at least one pixel run (311, 316, 317, 339) which can be coded relative to the reference nodes. The successive coding performed by the coding module (206) entails coding a first pixel run (311, 316, 317, 339) of each stroke relative to a respective reference node and coding the remaining pixel runs (312-314, 318-336, 317-337, 339-342) of each stroke relative to adjacent and previously coded pixel runs. This results in a coded list of pixel runs for each stroke. After the closing module (207) marks the end of each coded list for each stroke, the process (508) is repeated until the appropriate number of reference nodes and strokes have been fully coded. At that point, the entire connected component (300) will have been compressed.
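The relative coding of a stroke described above can be illustrated with the following minimal sketch. It shows only the referencing scheme (first run relative to a reference, each later run relative to the previously coded run) and produces plain integer offsets rather than actual code-words. The names `encode_stroke` and `decode_stroke`, the representation of a reference node as a (start, end) column pair, and the assumption that successive runs lie on successive scan lines are all hypothetical choices made for this example.

```python
from typing import List, Tuple

# Hypothetical layout: (start_col, end_col) of one pixel run; the row is implied
# by the run's position in the list.
Run = Tuple[int, int]


def encode_stroke(reference: Run, runs: List[Run]) -> List[Tuple[int, int]]:
    """Code each run as (start offset, end offset) from the previously coded run."""
    coded, prev = [], reference
    for start, end in runs:
        # The first run is measured against the reference position,
        # every later run against its already-coded neighbor.
        coded.append((start - prev[0], end - prev[1]))
        prev = (start, end)
    return coded


def decode_stroke(reference: Run, coded: List[Tuple[int, int]]) -> List[Run]:
    """Invert encode_stroke by accumulating the offsets from the reference."""
    runs, prev = [], reference
    for d_start, d_end in coded:
        prev = (prev[0] + d_start, prev[1] + d_end)
        runs.append(prev)
    return runs
```

For instance, with a reference of (10, 14), the runs (10, 14), (11, 15), (11, 16), (9, 16) are coded as (0, 0), (1, 1), (0, 1), (-2, 0). Because adjacent runs of a stroke tend to be nearly aligned, the offsets stay small, which is what makes short code-words possible.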
In a particularly preferred form of the invention, the graphic representation generated by the graph builder (203) is an L-graph representation (400) including at least one beginning node (310) and one hinge node (315, 338), wherein each stroke (302-305) is associated with one of the beginning and hinge nodes, and wherein the referencing module (205) identifies a node associated with each stroke as the reference node for that stroke. Moreover, it is particularly preferred that the coding module operates in either a horizontal mode or a vertical mode only, and that it never operates in the horizontal mode during two consecutive coding operations.
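The mode constraint stated above can be expressed as a simple check over a sequence of coding operations. The sketch below only validates a given sequence of modes; it does not show how the coding module selects a mode, which is not described at this point. The function name `modes_are_valid` and the labels "H" and "V" are assumptions made for this example.

```python
from typing import Sequence

HORIZONTAL, VERTICAL = "H", "V"  # hypothetical labels for the two modes


def modes_are_valid(modes: Sequence[str]) -> bool:
    """True if only H/V modes appear and no two consecutive operations are horizontal."""
    if any(m not in (HORIZONTAL, VERTICAL) for m in modes):
        return False
    return all(not (a == HORIZONTAL and b == HORIZONTAL)
               for a, b in zip(modes, modes[1:]))
```

Under this check, a sequence such as V, H, V, V, H is acceptable, whereas V, H, H is not.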
Other aspects of the invention are more narrowly tailored to methods (508) and apparatus for compressing individual strokes comprising plural substantially parallel pixel runs of the same level of a bi-level image. In such embodiments, the compressing algorithm (508) for a given stroke is at least generally similar to that described above.
Other desirable features of the present invention include the utilization of modified Huffman coding techniques for coding various pixel runs and coding each pixel run using two code-words, the first code-word encoding the beginning of the pixel run and the second code-word encoding the end of the pixel run. While the pixel runs forming each connected component object can be all black, with the area surrounding the connected component being any color other than black, it is particularly preferred that the pixel runs of the connected component be black pixels and the surrounding pixels be white pixels.
Method embodiments of the present invention are also disclosed. In pertinent part, such method embodiments include (a) determining a reference position of the strokes such that at least one pixel run can be coded relative to a reference position; (b) successively coding the pixel runs of the stroke such that a first pixel run is coded relative to the reference position and the remaining pixel runs are coded relative to adjacent and previously coded pixel runs; (c) wherein the coding includes coding of the pixel runs in either a horizontal or a vertical mode; and (d) wherein coding in the horizontal mode never occurs during two consecutive coding operations. Among other preferred method embodiment features are the use of modified Huffman coding techniques to code the pixel runs of a connected component and coding the connected component in either a horizontal mode or a vertical mode only. Additionally, the step of coding could include coding each pixel run using two code-words, the first code-word encoding the beginning of the pixel run and the second code-word encoding the end of the pixel run.