(1) Field of the Invention
The present invention relates to an image data compression apparatus, and more particularly to a method and an apparatus for pattern matching encoding, in which binary data are compressed by using pattern matching.
(2) Description of the Related Art
U.S. Pat. No. 5,303,313 discloses a process of inputting a binary stationary image using a scanner and compressing bit map data of the input image. In this disclosed process, input image data obtained using a scanner or the like is precompressed by run length encoding or a like process, and then cut into patterns. Then, template data and pattern position data are generated by matching each cut pattern against a registered template. In this way, the input bit map data is compressed into data comprised of template codes and pattern position data.
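By way of illustration only, the run length precompression mentioned above can be sketched for a single row of binary pixels as follows. The function names `run_lengths` and `expand` are hypothetical and are not taken from the cited patent; the sketch shows the general technique of run length encoding, not the specific encoder disclosed there.

```python
from itertools import groupby


def run_lengths(row):
    """Return (pixel_value, run_length) pairs for one row of binary pixels."""
    return [(value, len(list(group))) for value, group in groupby(row)]


def expand(runs):
    """Rebuild the original row of pixels from (pixel_value, run_length) pairs."""
    return [value for value, length in runs for _ in range(length)]


row = [0, 0, 1, 1, 1, 0]
runs = run_lengths(row)
# Run length encoding is lossless: expanding the runs restores the row.
assert expand(runs) == row
```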
Another image data compression apparatus has been proposed in order to improve data compressibility in applications to hand-written characters or indefinite type characters. This system has a temporary library, which is capable of newly registering and updating bit map data to be compared with input bit map data according to the input data.
FIG. 1 is a block diagram showing a prior art binary stationary image pattern matching encoder having a temporary updating library. Referring to the figure, text image data input from a scanner is binarized and smoothed to improve the compressibility in encoding and the visual image quality. A pattern extracting unit 101 executes pattern extraction from the smoothed data, and outputs bit map data of a mark pattern to a matching unit 102. The matching unit 102 compares the mark pattern with bit map patterns stored in a temporary updating library 103. When the ratio of unmatched pixels to all pixels is not above a predetermined ratio, the matching unit 102 judges the result of the comparison as being matched, and when the ratio is above the predetermined ratio, it judges the result as being unmatched.
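The matching decision described above can be sketched as follows, purely as an illustration. The sketch assumes that the mark pattern and the library bit map have the same dimensions and are already aligned, and that pixels are represented as 0 or 1. The names `mismatch_ratio` and `is_matched` and the default threshold of 0.1 are hypothetical; the prior art system does not state a particular value for the predetermined ratio.

```python
def mismatch_ratio(mark, template):
    """Fraction of pixel positions at which two equal-sized binary bitmaps differ."""
    same_shape = len(mark) == len(template) and all(
        len(a) == len(b) for a, b in zip(mark, template)
    )
    if not same_shape:
        raise ValueError("bitmaps must have the same dimensions")
    total = sum(len(row) for row in mark)
    if total == 0:
        raise ValueError("bitmaps must contain at least one pixel")
    unmatched = sum(p != q for a, b in zip(mark, template) for p, q in zip(a, b))
    return unmatched / total


def is_matched(mark, template, max_ratio=0.1):
    """Matched when the unmatched-pixel ratio is not above the predetermined ratio."""
    # max_ratio is an assumed placeholder for the predetermined ratio.
    return mismatch_ratio(mark, template) <= max_ratio
```

For example, two 2x2 bit maps differing in one pixel have an unmatched-pixel ratio of 0.25, and so are judged as being matched only when the predetermined ratio is at least 0.25.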
When the matching unit 102 judges the result of the comparison as being matched, it outputs to a multi-symbol arithmetic encoding unit 105 the ID of the matched bit map in the temporary updating library and the positioning data used in the matching, whereby the data are compressed. A match error correcting unit 104 inverts match error pixels of bit map data of mark patterns which are judged as being matched. The corrected bit map data are ID registered in the temporary updating library 103. Bit map data of mark patterns which are judged as being unmatched are ID registered in the temporary updating library 103 with no correction. In the temporary updating library 103, library bit maps of low matching frequencies are deleted.
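The match error correction and library updating described above can be sketched as follows, purely as an illustration. The sketch assumes equal-sized, aligned bit maps with pixels represented as 0 or 1. The class `TemporaryLibrary`, its methods, and the deletion policy (removing the entry with the lowest match count when the library is full) are hypothetical; the prior art system states only that library bit maps of low matching frequencies are deleted.

```python
def correct_match_errors(mark, template):
    """Invert each pixel of `mark` that differs from the matched `template`."""
    return [
        [1 - p if p != q else p for p, q in zip(a, b)]
        for a, b in zip(mark, template)
    ]


class TemporaryLibrary:
    """Finite store of bitmaps keyed by ID, with per-entry match counts."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries = {}        # ID -> bitmap
        self.match_counts = {}   # ID -> number of recorded matches
        self._next_id = 0

    def register(self, bitmap):
        """Store a bitmap under a new ID, deleting a low-frequency entry if full."""
        if len(self.entries) >= self.capacity:
            # Assumed policy: delete the entry with the fewest recorded matches.
            victim = min(self.match_counts, key=self.match_counts.get)
            del self.entries[victim]
            del self.match_counts[victim]
        entry_id = self._next_id
        self._next_id += 1
        self.entries[entry_id] = [list(row) for row in bitmap]
        self.match_counts[entry_id] = 0
        return entry_id

    def record_match(self, entry_id):
        """Count one successful match against the given library entry."""
        self.match_counts[entry_id] += 1
```

Under the alignment assumption stated above, inverting every match error pixel makes the corrected mark pattern identical to the matched library bit map, so the differences originally present in the scanned mark are not preserved in the registered data.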
In carrying out the compression of the mark pattern bit map data, the estimation template pixels which are generated in an unmatched mark pattern template generating unit 106 are used for unmatched mark patterns, while the estimation template pixels which are generated in a matched mark pattern template generating unit 107 are used for matched mark patterns.
The above prior art system, however, has problems which are posed by the precompressing and smoothing processes executed prior to the matching process for improving the data compressibility and also by the updating of the temporary updating library with the data obtained after the match error correction.
A first problem concerns the image quality. In the match error correcting process after the matching process with the temporary updating library, the match error pixels are inverted. This process is executed in order to improve the data compressibility. The match error pixel inversion, however, sacrifices image quality. Moreover, the bit map data of the mark patterns obtained in the match error correcting process are successively registered in the library, so that image distortions arising from the match error correcting process accumulate and the image quality deteriorates. The image quality deterioration is particularly pronounced with document texts printed using fonts obtainable with personal computers and word processors, which have recently been finding increasing applications.
A second problem is that the system is not optimized from the standpoint of improving the data compressibility.
For example, bit map data of a mark pattern of the same character is subject to slight variations depending on the accuracy of the scanner, thus causing the mark pattern bit map data of the same character to be repeatedly updated in the temporary updating library. Such updating results in a plurality of different bit map data of the same character being stored in the temporary updating library. With a finite library memory, therefore, the number of distinct character marks whose bit map data are available for matching becomes insufficient.