The present invention relates to identifying, processing, storing, transmitting and receiving document images, particularly images of documents such as vouchers, completed forms and the like that have standardized formatting, including shading, with modifications.
In document filing systems in practical use at the present time, the document to be stored is first scanned, and the scanned image is then digitized by binarization processing or dither processing. The digital image thus produced is compressed using modified Huffman (MH) encoding, modified modified READ (MMR) encoding or the like, and the compressed digital image is then stored, for example on an optical disk or other recording medium.
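The digitizing step described above can be sketched as follows. This is a minimal illustration and not the method of any particular filing system. The function names, the threshold of 128 and the 2x2 dither matrix are all assumptions chosen for the example. The last function computes only the alternating white and black run lengths that run-length schemes such as MH encoding start from. It does not apply the Huffman code tables themselves.

```python
# Hypothetical helper names; pure Python, no third-party libraries.
# Gray levels are 0 (black) to 255 (white); output bits are 1 = black, 0 = white.

BAYER_2X2 = [[0, 2],
             [3, 1]]  # 2x2 ordered-dither index matrix (assumed for this sketch)


def binarize(gray, threshold=128):
    """Fixed-threshold binarization of a grayscale image (list of rows)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def ordered_dither(gray):
    """Ordered dithering: the threshold varies with pixel position."""
    out = []
    for y, row in enumerate(gray):
        out_row = []
        for x, px in enumerate(row):
            # Position-dependent threshold spread over the 0..255 range.
            t = (BAYER_2X2[y % 2][x % 2] + 0.5) * 256 / 4
            out_row.append(1 if px < t else 0)
        out.append(out_row)
    return out


def run_lengths(row):
    """Alternating white/black run lengths, starting with a white run
    (which has length 0 if the row begins with a black pixel)."""
    runs = []
    current, count = 0, 0
    for bit in row:
        if bit == current:
            count += 1
        else:
            runs.append(count)
            current, count = bit, 1
    runs.append(count)
    return runs


if __name__ == "__main__":
    scan_line = [[255, 255, 0, 0, 0, 255]]
    bits = binarize(scan_line)
    print(bits[0], run_lengths(bits[0]))
    print(ordered_dither([[128, 128], [128, 128]]))
```

In this sketch a uniform mid-gray area becomes a checkerboard of black and white dots under dithering, which is why dithered shading produces many short runs and is difficult both to compress and to separate from overprinted characters.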
Optical character readers now in practical use binarize the scanned image and then subject the resulting binary image to character recognition. Apparatus of this type is not capable of recognizing character patterns printed over shading. To enable currently available optical character readers to read characters printed over shading, the color of the dots used in the shading is limited to a specific drop-out color, and only the drop-out color portions are eliminated prior to optical character recognition. That is, the shading is limited to a specific color that can be dropped out by filtering or the like and that is not used to produce the characters. Such technology is disclosed, for example, in Japanese unexamined patent publication 56(1981)-2073. More specifically, the shading may be green or red so that it can be removed by an optical filter such as a green or red glass, and the characters, which are not green or red, may then be subjected to optical character recognition.
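The drop-out color approach can be illustrated with a small digital simulation. The source describes an optical filter placed in front of the reader. Here the filter is approximated, as an assumption for illustration only, by keeping a single color channel of an RGB image before thresholding. The function name, the sample colors and the threshold are hypothetical.

```python
# Hypothetical sketch: simulate a colored glass filter by keeping one RGB channel.

def drop_out_and_binarize(rgb_image, channel=0, threshold=128):
    """Keep one color channel (0 = R, 1 = G, 2 = B), then threshold.
    Returns 1 for character pixels and 0 for background."""
    return [[1 if px[channel] < threshold else 0 for px in row]
            for row in rgb_image]


WHITE = (255, 255, 255)  # blank paper
RED = (230, 40, 40)      # shading dot printed in the drop-out color
BLACK = (20, 20, 20)     # character stroke

if __name__ == "__main__":
    row = [WHITE, RED, BLACK, RED, BLACK, WHITE]
    # Seen through the red channel, red shading is as bright as the paper.
    print(drop_out_and_binarize([row], channel=0))
    # Seen through the green channel, the shading stays dark and is not removed.
    print(drop_out_and_binarize([row], channel=1))
```

The second call shows the limitation noted above: the shading is removed only when its color matches the filter, so the form designer is restricted to that one color for shading and cannot use it for characters.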