Earlier workers have sought to provide devices or procedures for recognizing and enhancing text in an image that is being analyzed preparatory to printing. Earlier (but not prior-art) innovations, for example, have analyzed color images to locate regions having, or made up of, a "correct" mix of colors--namely, mostly white and black.
One such earlier innovation operates primarily by forming histograms of the numbers of pixels of different colors, in each region of the input image. Where histograms reflect high concentrations of black and white peaks in general coincidence, the presence of black text is automatically inferred and text-enhancement techniques accordingly applied.
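The histogram test described above can be illustrated with a minimal sketch. It is not the earlier innovation's actual implementation: the function name `looks_like_text`, the thresholds `dark_max` and `light_min`, and the fractions `min_fraction` and `min_dark` are all assumptions chosen for illustration, and pixels are assumed to be 8-bit (r, g, b) intensity triples.

```python
from collections import Counter

def looks_like_text(region, dark_max=64, light_min=192,
                    min_fraction=0.9, min_dark=0.05):
    """Guess whether a region is black text on white from a coarse histogram.

    region: iterable of (r, g, b) tuples with 0-255 intensities.
    All thresholds are illustrative assumptions, not values from the source.
    """
    counts = Counter()
    total = 0
    for r, g, b in region:
        total += 1
        if max(r, g, b) <= dark_max:
            counts["black"] += 1      # all three channels are dark
        elif min(r, g, b) >= light_min:
            counts["white"] += 1      # all three channels are light
        else:
            counts["other"] += 1      # colored or mid-tone pixel
    if total == 0:
        return False
    black_fraction = counts["black"] / total
    bw_fraction = (counts["black"] + counts["white"]) / total
    # "Mostly white and black", with at least some black actually present.
    return bw_fraction >= min_fraction and black_fraction >= min_dark
```

A region that is 20 percent near-black and 80 percent near-white passes this test, while a uniformly brown region or a blank white region does not. A test this simple also shows why extra computation is needed to reject regions that are black and white for reasons other than text.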
Such a paradigm works reasonably well, but only at the cost of an extremely heavy computational burden. Great amounts of computation are necessary to reduce the likelihood of incorrectly identifying as "text" a region that happens to have mostly white and black pixels for some reason other than the actual presence of text.
As an example, if an optical reading scanner is used to acquire the input image--and if the image being acquired was previously printed using a printer with resolution very close to that of the scanner--the resulting acquired image may spuriously appear to have the "correct" (mostly black and white) pixel mix. This can occur if the very similar pixel grids in a particular region are misaligned by about half the pixel periodicity.
When this happens, the scanner components sensitive to a particular color (for example red) may respond to the primarily white or light spaces between pixels--but on the assumption that they are pixels. The result is a peak in the histogram for white pixels.
Meanwhile the scanner components sensitive to another particular color (for example blue) may happen to be better aligned to the previously printed pixel grid--an offset of only 1/50 mm (1/1200 inch) between the two sensor arrays can produce this condition--and will produce a peak in the histogram for dark pixels. Even though the latter are only dark and not black, the system may respond to the two peaks with a decision that the region contains text.
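The misalignment effect just described can be illustrated with a one-dimensional toy model. This is only a sketch under stated assumptions: the function `sample_printed_grid`, the dot width `dot_fraction`, and the reflectance values `dark` and `light` are hypothetical, and a real scanner integrates light over an aperture rather than sampling at a point.

```python
def sample_printed_grid(offset, period=1.0, dot_fraction=0.5,
                        n=8, dark=40, light=230):
    """Sample a row of printed dots at n points spaced one period apart.

    Dots are centered at integer multiples of `period` and cover
    `dot_fraction` of each period; `offset` shifts the sensor array
    relative to the printed grid. All values are illustrative assumptions.
    """
    readings = []
    for k in range(n):
        x = k * period + offset
        phase = (x / period) % 1.0
        # Distance to the nearest dot center, as a fraction of one period.
        dist = min(phase, 1.0 - phase)
        readings.append(dark if dist < dot_fraction / 2 else light)
    return readings

blue_channel = sample_printed_grid(offset=0.0)   # aligned with the dots
red_channel = sample_printed_grid(offset=0.5)    # half a period off
```

In this model the aligned array reads only dark values and the array offset by half a period reads only the light gaps, so a histogram pooled over both channels shows one dark peak and one light peak even though the region contains no text.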
In any event, such earlier innovations, after "identifying" text regions whether correctly or incorrectly, then proceed to enhance those regions by "snapping" dark image elements (pixels) to pure black--in other words, for a three-color intensity specification, by adjusting or setting all three of the input color intensities to zeroes: "0, 0, 0".
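The dark-pixel snapping step can be sketched as follows. Only that step is shown; pixels not classified as dark pass through unchanged here. The function name `snap_dark_to_black` and the darkness criterion (all three intensities at or below an assumed threshold `dark_max`) are illustrative assumptions, since the source does not state how "dark" is decided.

```python
def snap_dark_to_black(pixels, dark_max=64):
    """Set every dark (r, g, b) pixel to pure black, i.e. (0, 0, 0).

    A pixel counts as dark when all three intensities are <= dark_max;
    this criterion and the threshold value are assumptions.
    """
    return [(0, 0, 0) if max(p) <= dark_max else p for p in pixels]
```

For example, under these assumptions a pixel of (30, 20, 40) becomes (0, 0, 0), while (200, 210, 220) is returned as it was.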
In such earlier innovations, light-colored image elements in the text regions are not adjusted at all; and image elements intermediate between dark and light, in text regions of the image, are snapped to pure white. The actual enhancement thus produced is very satisfactory--but for the undesirably large amount of computation required preliminarily to identify the text regions, and the occasional errors described above.
Another undesirable characteristic of known earlier text-enhancement procedures and systems is that they use relatively large amounts of time to check for text, even in image regions which a human viewer can recognize instantly as entirely pictorial. This wastes computing time and thus reduces throughput.
It is not intended to criticize those earlier innovations, for as noted above they do perform excellently in nearly all respects, and in general produce superb results quickly and economically. Room for refinement, however, can be found.