Various components of images may be processed in order to optimize or otherwise modify the visual aspects of the image. Digital photographs may be processed in several ways to enhance the visual qualities of the image and add special effects or other modifications. Images containing text may also be enhanced by methods which increase legibility, character contrast, sharpness or other visual characteristics. While both textual and graphical images may be processed and enhanced, the methods for processing text and graphical images are not the same.
Graphical images such as digital photographs and scanned graphics may be processed using techniques that remove noise, adjust color and contrast, reduce aliasing and create special effects. These techniques adjust characteristics of the graphic while maintaining the integrity of the image. Because these images typically involve many colors, shades and contrast levels, the techniques used generally vary significantly from those used for textual processing.
Text may be processed to enhance legibility, to modify its visual characteristics or to convert between formats. Visual modification may involve adjusting contrast, character sharpness and other visual characteristics. Text images may also be converted from an image file format to a text file format using character recognition methods such as raster-to-text methods. Furthermore, the compression algorithms used for text may differ from those used for photographs and other graphics. Higher compression ratios are available for text than for graphical elements, and overall image compression may be improved when text elements are separated out and compressed at higher ratios.
Because text and graphical elements are processed very differently, an image which contains both text and graphical elements must be partitioned into segments for optimal processing of both elements. In order to make this partition, text-containing areas must be identified and distinguished from graphical areas which require different processing techniques.
Various methods have been used to identify text elements. Some of these methods employ scan-line techniques in which rows or columns of pixels are evaluated to determine intensity or luminance levels. Consecutive intensity levels are compared to determine whether the intensity has changed significantly from one pixel to the next. When a significant intensity change occurs, the location is marked as an edge. Changes from light to dark and dark to light may be distinguished as rising or falling intensity levels and may be identified accordingly, for example, by opposite signs. As text characters typically involve high-contrast edges of opposite sign within close proximity, this condition may be used to identify the presence of text in a document. Processing of single scan-line data can, however, produce false-positive text detections in high-contrast graphical image areas. These methods may also produce false-negative results in areas with bold or large text. More particularly, false-negative results may arise when a scan-line crosses the top of a character such as a “T” which has a broad area between successive opposing edges.
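By way of illustration only, the scan-line approach described above may be sketched as follows. The function names, the step threshold `min_step` and the proximity limit `max_gap` are hypothetical values chosen for this example and are not taken from any particular known method.

```python
from typing import List, Tuple


def signed_edges(row: List[int], min_step: int = 64) -> List[Tuple[int, int]]:
    """Mark significant intensity changes along one scan-line.

    Returns (pixel index, sign) pairs, where sign is +1 for a rising
    (dark-to-light) change and -1 for a falling (light-to-dark) change.
    min_step is an assumed threshold for a "significant" change.
    """
    edges = []
    for i in range(1, len(row)):
        diff = row[i] - row[i - 1]
        if abs(diff) >= min_step:
            edges.append((i, 1 if diff > 0 else -1))
    return edges


def count_opposing_pairs(edges: List[Tuple[int, int]], max_gap: int = 4) -> int:
    """Count consecutive edges of opposite sign lying within max_gap pixels."""
    pairs = 0
    for (i0, s0), (i1, s1) in zip(edges, edges[1:]):
        if s0 != s1 and (i1 - i0) <= max_gap:
            pairs += 1
    return pairs


# A thin dark stroke yields one close pair of opposing edges.
thin_stroke = [255, 255, 0, 0, 255, 255, 255]
# A broad dark bar (such as the top of a "T") yields edges too far apart.
broad_bar = [255] + [0] * 10 + [255]

thin_pairs = count_opposing_pairs(signed_edges(thin_stroke))
broad_pairs = count_opposing_pairs(signed_edges(broad_bar))
```

In this sketch the thin stroke produces an opposing pair while the broad bar does not, which corresponds to the false-negative case noted above for bold or large text.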
Other methods involve segmentation of the image into successive windows, for each of which a series of histograms is computed. In some methods, the image may be thresholded to black and white and run-length histograms may be generated for runs of black and white pixels. The frequency of runs of a specific length may be used to determine whether text or graphical content is present.
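A minimal sketch of the run-length histogram computation for a single row of a window is given below. The function name and the binarization threshold of 128 are assumptions made for this example; the rule by which run frequencies are mapped to a text or graphical decision is not specified here and is therefore omitted.

```python
from collections import Counter
from itertools import groupby
from typing import List, Tuple


def run_length_histograms(row: List[int], threshold: int = 128) -> Tuple[Counter, Counter]:
    """Threshold one row to black and white and histogram the run lengths.

    Returns (black_runs, white_runs), each mapping run length to the
    number of runs of that length. threshold is an assumed value.
    """
    bits = [1 if value < threshold else 0 for value in row]  # 1 = black
    black_runs, white_runs = Counter(), Counter()
    for bit, group in groupby(bits):
        length = sum(1 for _ in group)
        if bit:
            black_runs[length] += 1
        else:
            white_runs[length] += 1
    return black_runs, white_runs


sample_row = [255, 255, 0, 0, 0, 255, 0, 0, 0, 255, 255]
black_hist, white_hist = run_length_histograms(sample_row)
```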
Another known method of distinguishing between textual and graphical areas involves image smoothing followed by comparison of each pixel with a threshold density. Each pixel is classified as textual or graphical. The length or area of each region is then compared to a reference length or area. Regions with values below the reference are designated as text.
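One possible reading of this method, reduced to a single scan-line for brevity, is sketched below. The moving-average smoothing, the interpretation of a region as a run of pixels darker than the threshold density, and the values of `density_threshold`, `reference_length` and `radius` are all assumptions made for this example.

```python
from typing import List, Tuple


def classify_regions(row: List[int],
                     density_threshold: int = 128,
                     reference_length: int = 6,
                     radius: int = 1) -> List[Tuple[int, int, str]]:
    """Smooth a scan-line, threshold it, and label dark regions by length.

    Returns (start, end, label) tuples with end exclusive. Regions shorter
    than reference_length are labelled "text", others "graphic".
    """
    n = len(row)
    # Moving-average smoothing (assumed; the smoothing filter is not specified).
    smoothed = []
    for i in range(n):
        window = row[max(0, i - radius):min(n, i + radius + 1)]
        smoothed.append(sum(window) / len(window))

    # Compare each smoothed pixel with the threshold density.
    dark = [value < density_threshold for value in smoothed]

    # Group consecutive dark pixels into regions and compare their length
    # with the reference length. A trailing False closes any open region.
    regions = []
    start = None
    for i, flag in enumerate(dark + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            label = "text" if (i - start) < reference_length else "graphic"
            regions.append((start, i, label))
            start = None
    return regions


line = [255] * 4 + [0] * 3 + [255] * 4 + [0] * 10 + [255] * 4
labelled = classify_regions(line)
```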
Other known methods are used to find the edges of characters for text enhancement techniques and other modifications. One scan-line-based method locates oppositely signed pairs of curvature extrema along the scan-line. Curvature is estimated by computing local angular differences in the slope of the image function along a scan-line followed by computing the local changes in angle along the scan-line. Pairs of significant curvature-extrema are taken as edge boundaries. Edge points are computed as the intervening pixel closest in value to the average intensity. Edge points are then linked across neighboring scan-lines to form straight line segments.
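The curvature-extrema approach may be illustrated, under stated assumptions, by the following sketch for a single scan-line. The scale factor `intensity_per_pixel`, the significance threshold `min_curvature` and the simple local-extremum test are hypothetical choices for this example, and the linking of edge points across neighboring scan-lines is not shown.

```python
import math
from typing import List


def curvature_edge_points(row: List[int],
                          intensity_per_pixel: float = 50.0,
                          min_curvature: float = 0.3) -> List[int]:
    """Locate edge points between oppositely signed curvature extrema."""
    n = len(row)
    # Slope angle of the image function between neighboring pixels.
    angles = [math.atan((row[i + 1] - row[i]) / intensity_per_pixel)
              for i in range(n - 1)]
    # Curvature estimate: local change in slope angle at each interior pixel.
    curvature = [0.0] * n
    for j in range(1, n - 1):
        curvature[j] = angles[j] - angles[j - 1]

    # Significant local extrema of curvature, tagged with their sign.
    extrema = []
    for j in range(1, n - 1):
        c = curvature[j]
        if c >= min_curvature and c >= curvature[j - 1] and c > curvature[j + 1]:
            extrema.append((j, 1))
        elif c <= -min_curvature and c <= curvature[j - 1] and c < curvature[j + 1]:
            extrema.append((j, -1))

    # For each oppositely signed consecutive pair, take the intervening
    # pixel closest in value to the average intensity of the pair.
    edge_points = []
    for (a, sign_a), (b, sign_b) in zip(extrema, extrema[1:]):
        if sign_a == sign_b or b - a < 2:
            continue
        target = (row[a] + row[b]) / 2.0
        edge_points.append(min(range(a + 1, b), key=lambda k: abs(row[k] - target)))
    return edge_points


ramp = [10, 10, 10, 40, 120, 200, 230, 230, 230]
points = curvature_edge_points(ramp)
```

For the sample ramp, the extrema fall at the foot and the shoulder of the transition, and the edge point is the pixel between them whose value lies nearest the average of the two.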
Another method of text edge-detection performs edge detection at two scales on binarized image data. Gray-scale or intensity data may be thresholded prior to smoothing and edge filtering. Halftone dot detection using pattern matching is performed on the binary image data. Detection of solid areas near dotted areas is also performed via pattern matching. The detected dotted and solid areas are considered regions of halftone and are subtracted from the original edge data leaving edges classified as text only.
Known methods and apparatus suffer from false detection determinations, burdensome processing requirements and the necessity of evaluating complete images or large portions thereof.