In general, an optical scanner generates an electronic signal representing grey tone values for points along lines across the scanned document. The grey tone values are typically represented as a digital multi-bit signal.
For documents to be scanned by an optical scanning system, e.g. technical drawings, three terms can be defined: foreground, background, and information content. The background represents the color or tone of the document, e.g. white or colored paper. The information content is typically in the form of lines or characters, whereas the foreground represents the appearance of the information content in black or dark grey tones.
Both the background and the foreground may fluctuate across a document, for example because the original is dirty or multi-colored. Further, characters and lines are often smeared or smudged, and are sometimes written with either very light strokes that are difficult to detect, or very heavy strokes that tend to broaden and run together when imaged. When originals with these real-life characteristics are scanned and converted into a one-bit image, a simple global threshold value is not appropriate.
Many types of histogram methods have been proposed for thresholding, but they have the drawback that either they must set a global threshold value for the document, or they must be based on sections of the document in order to be computationally feasible. These methods are characterized by trying to estimate the position of the histogram minimum that best discriminates the information content from the background. If the threshold value determination is based on histograms for sections of the document, it is difficult to select a global contrast or "dark/light" setting.
In order to obtain better suppression of unwanted background patterns, e.g. dirt and multiple colors, and to overcome the problem of a varying foreground, adaptive or dynamic threshold value determination is used. U.S. Pat. No. 5,377,020 discloses a method of providing local threshold values for zones on an original by statistically determining the local threshold values in response to the frequency distribution of grey level values belonging to points from line segments in a zone.
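The general idea of zone-based local thresholds can be sketched as follows. This is a simplified illustration only and does not reproduce the method of U.S. Pat. No. 5,377,020. The square zones, the function names, and the choice of statistic (the midpoint between the darkest and lightest grey value in each zone) are assumptions made for the example.

```python
def zone_thresholds(image, zone_size):
    """Map each (zone_row, zone_col) to a locally determined threshold.

    `image` is a list of rows of grey values. The statistic used here,
    the midpoint between the zone minimum and maximum, is an assumed
    stand-in for a real frequency-distribution analysis.
    """
    height, width = len(image), len(image[0])
    thresholds = {}
    for top in range(0, height, zone_size):
        for left in range(0, width, zone_size):
            values = [
                image[y][x]
                for y in range(top, min(top + zone_size, height))
                for x in range(left, min(left + zone_size, width))
            ]
            key = (top // zone_size, left // zone_size)
            thresholds[key] = (min(values) + max(values)) / 2
    return thresholds


def binarize_by_zone(image, zone_size):
    """One-bit image using the threshold of the zone each pixel lies in."""
    thresholds = zone_thresholds(image, zone_size)
    return [
        [1 if value < thresholds[(y // zone_size, x // zone_size)] else 0
         for x, value in enumerate(row)]
        for y, row in enumerate(image)
    ]


# Hypothetical original: white paper (240) on the left, grey paper (150)
# on the right, each with one dark line pixel.
original = [
    [240, 60, 150, 40],
    [240, 240, 150, 150],
]
one_bit = binarize_by_zone(original, zone_size=2)
```

In this example each zone receives its own threshold, so both line pixels are detected, whereas a single global threshold of, say, 200 would mark the entire grey-paper zone as foreground.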
These prior art adaptive threshold value determination methods, however, do not overcome a particular problem: for a typical drawing, the background constitutes the major part of the entire document area. In such a case the pixel values from the scanned document are poor statistical material for adaptive threshold value determination, because they are dominated by the background, whereas the goal is to discriminate the information content from the unwanted background patterns.
U.S. Pat. No. 4,345,314 discloses a method which, to some extent, overcomes this problem by incorporating an equalizing function in front of a 2D spatial moving average filter which provides a threshold value for a one-bit comparison. Because the threshold comparison operates on an equalized (and filtered) set of pixel values which do not preserve the mean value in the scanned original, the equalizing function has to be selected quite close to a linear function. For a typical real-life document this method is limited because it does not provide sufficient threshold value accuracy.
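The arrangement of an equalizing function followed by a 2D moving average filter can be illustrated with the minimal sketch below. It is not the implementation of U.S. Pat. No. 4,345,314. The linear form of `equalize`, the square window truncated at the image borders, the `margin` parameter, and all function names are assumptions made for the example.

```python
def equalize(value, gain=1.0, offset=0.0, levels=256):
    """Hypothetical near-linear equalizing function, clamped to the grey range."""
    return max(0.0, min(levels - 1.0, gain * value + offset))


def moving_average(image, radius):
    """2D box filter; the window is truncated at the image borders."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(sum(window) / len(window))
        result.append(row)
    return result


def binarize_moving_average(image, radius=1, margin=10.0):
    """Compare each equalized pixel with its filtered neighbourhood mean.

    A pixel becomes foreground (1) when it is darker than the local
    mean by more than `margin` (an assumed noise allowance).
    """
    equalized = [[equalize(v) for v in row] for row in image]
    thresholds = moving_average(equalized, radius)
    return [
        [1 if e < t - margin else 0 for e, t in zip(e_row, t_row)]
        for e_row, t_row in zip(equalized, thresholds)
    ]


# Hypothetical patch: one dark pixel on a uniform light background.
patch = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
one_bit = binarize_moving_average(patch)
```

In this sketch the threshold at each pixel is the mean of the equalized values around it, so any strongly non-linear `equalize` shifts that mean away from the mean of the scanned original. This is the constraint noted above.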