The present invention concerns a method and a device for the binarization of pixel data.
The present invention relates to the field of image preparation for automatic character recognition systems. Character recognition systems can be divided roughly into two subsystems, of which the first serves the purpose of image preparation and the second performs the actual recognition. During image preparation, the document to be recognized, the so-called original, is captured by measurement. From the image of the original, text passages, lines and finally individual characters are prepared, and the resulting character images are freed of obviously recognizable defects. The characters to be recognized initially exist in the optical domain and must be converted into a form suitable for further processing. This is done with a scanner, nowadays preferably an integrated semiconductor scanner. For further processing, the continuously measured blackening of the original is generally converted into a black-and-white decision immediately after scanning. Preferably, the analog signal supplied by the scanner is first converted to a discrete signal by an analog-to-digital converter, and a binary image is subsequently produced from this grey image, which reproduces the image content sufficiently well for character recognition.

Background brightness and blackening in the character area can be subject to strong fluctuations. While only slight fluctuations in background brightness are to be expected within the subregions of the original that are of interest, the blackening in the character area frequently changes from character to character and even within individual characters. Differences in background brightness are therefore handled by a control that is uniform over larger image sections, while differences in character blackening are compensated for by a more locally acting control.
The local control of the black-and-white contrast according to a binarization characteristic curve is a differentiating operation which, in order to decide on the blackening of a pixel, uses not only that pixel's grey scale value but also the grey scale values of the surrounding region. The dimensions of the surrounding region must be selected in accordance with the dimensions of the characters to be recognized. The simplest approach is first to determine the average blackening in the surrounding region and to call a pixel black if it is blacker than this average, and white otherwise. To suppress noise both in the character background and in the character blackening, it is also advantageous to use a binarization characteristic curve that raises the threshold value Q in the range of low average blackening and lowers it in the range of higher average blackening. If the character contrast fluctuates more strongly, it can be advisable to control the binarization characteristic curve as a whole on the basis of the contrast observed in a larger surrounding region, and to use a different binarization characteristic curve for strong print than for weak print.
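The local decision rule described above can be illustrated with a minimal sketch. It is not the claimed method. It assumes that blackening is stored as integers from 0 (white) to 255 (black), that the surrounding region is a square window of radius `r`, and that the characteristic curve is linear in the average blackening. The names `local_mean`, `threshold_q` and `binarize`, the parameter `delta` and the sample image are hypothetical and chosen only for illustration.

```python
def local_mean(img, r):
    """Average blackening in a (2r+1) x (2r+1) window, clipped at the image border."""
    h, w = len(img), len(img[0])
    means = [[0.0] * w for _ in range(h)]
    for y in range(h):
        ys = range(max(0, y - r), min(h, y + r + 1))
        for x in range(w):
            xs = range(max(0, x - r), min(w, x + r + 1))
            total = sum(img[j][i] for j in ys for i in xs)
            means[y][x] = total / (len(ys) * len(xs))
    return means


def threshold_q(mean, delta=20.0, max_black=255.0):
    """Assumed linear characteristic curve for the threshold Q.

    Q lies above the local mean where the average blackening is low
    (background) and below it where the average blackening is high
    (inside a stroke). With delta = 0 it reduces to the plain local mean.
    """
    return mean + delta * (1.0 - 2.0 * mean / max_black)


def binarize(img, r=1, delta=20.0):
    """Return 1 (black) where a pixel is blacker than Q of its local mean, else 0."""
    means = local_mean(img, r)
    return [
        [1 if img[y][x] > threshold_q(means[y][x], delta) else 0
         for x in range(len(img[0]))]
        for y in range(len(img))
    ]


# Hypothetical 5 x 5 grey image: background 10, one noise pixel (25),
# and a vertical stroke of blackening 200.
sample = [
    [10, 10, 10, 10, 10],
    [10, 25, 10, 10, 10],
    [10, 10, 10, 200, 10],
    [10, 10, 10, 200, 10],
    [10, 10, 10, 200, 10],
]

if __name__ == "__main__":
    for row in binarize(sample):
        print(row)
```

With `delta=0.0` the rule is the plain comparison against the average blackening, and the isolated noise pixel is called black. With the raised threshold in the low-blackening range, the same pixel is called white while the stroke remains black.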
The useful information for all subsequent processing steps is generated in the binarization step of the image preparation described above. Any information lost at this point therefore affects all further processing steps and limits the performance of the overall system.
One problem with binarization is that a sensitive binarized image, that is, one produced with a binarization characteristic curve designed for weakly printed characters, renders low-contrast characters clearly recognizable, but interfering structures and patterns also appear clearly. A non-sensitive binarized image, on the other hand, shows high-contrast characters clearly while suppressing interfering information and background noise. Difficulties arise in particular with address fields that have a structured background. In such cases it is hardly possible to conclude, from local observation of the surrounding region in the grey image, whether a blackening is caused by written text or by an interfering background pattern.
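The dilemma can be made concrete with a small sketch on a single scan line. It assumes blackening values from 0 (white) to 255 (black) and, as a stand-in for the two characteristic curves, a threshold equal to the local mean plus a fixed margin. The function `binarize_line`, the margins and the sample values are hypothetical and serve only as an illustration.

```python
def binarize_line(line, margin, r=2):
    """Call a pixel black (1) if it exceeds the local mean by more than `margin`.

    A small margin stands in for a sensitive characteristic curve,
    a large margin for a non-sensitive one (illustrative assumption).
    """
    n = len(line)
    result = []
    for i, value in enumerate(line):
        window = line[max(0, i - r):min(n, i + r + 1)]
        mean = sum(window) / len(window)
        result.append(1 if value > mean + margin else 0)
    return result


# Hypothetical scan line: background 10, a background pattern pixel (40)
# at index 3, a weakly printed character pixel (40) at index 7,
# and a strongly printed character pixel (220) at index 11.
scan_line = [10, 10, 10, 40, 10, 10, 10, 40, 10, 10, 10, 220, 10, 10, 10]

sensitive = binarize_line(scan_line, margin=15)
non_sensitive = binarize_line(scan_line, margin=50)

if __name__ == "__main__":
    print("sensitive:    ", sensitive)
    print("non-sensitive:", non_sensitive)
```

In this constructed case the pattern pixel and the weak character pixel have the same blackening and the same neighbourhood, so any purely local rule treats them alike: the sensitive setting marks both as black, and the non-sensitive setting marks both as white.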