1. Field of the Invention
The present inventions relate to methods and apparatus for analyzing images, and have particular application to analyzing scanned images, such as for identifying text.
2. Related Art
Large amounts of information are published or distributed to people in a printed format, such as in document form. The people originally receiving the documents may also have received digital or electronic versions of the information, but receiving information in both forms is not yet common. The person originally receiving the information may want to convert it to an electronic form. Additionally, others who may later receive the same documents may want to have the information in electronic form, such as for redistribution, editing or archiving. One common way of converting information on documents to a digital or electronic form is to scan the documents and store the resulting images.
Scanned images can be stored in any number of different formats, such as bitmaps, JPEG files, GIFs, and the like. The storage format may often be determined by the ultimate destination for the information. For example, information incorporated into a Web page may be stored in a different format than information incorporated into a word processing document, which may be different from the storage method for use in an audiovisual presentation. Additionally, information that is received only in all text form, or in text form combined with graphical or pictorial images, may be sent to a word processing application for editing.
In many instances, the destination for a scanned image determines how the image is initially scanned, such as the scan settings. For example, if an image is text only, the scan can be set to a low bit depth and high resolution so that the image is best suited for Optical Character Recognition (OCR), reproduction and printing. For a graphical or pictorial image, the scan settings are more often set for a high bit depth and lower resolution. Therefore, if a person wants to put a text-only document into electronic or digital form for subsequent editing, the scan settings should be a low bit depth and high resolution. Accordingly, before a preview scan of the image, and at least before any final scan, the scanner should be set at 300 dpi and black and white. The resulting image can then be processed, such as by de-skewing, auto-cropping and OCR.
Many image scanners include a user interface by which the user can select the desired settings. If the person knows what settings are necessary and knows how to apply them, the desired image data should be successfully received for later processing. However, if the person does not apply the proper settings, the resulting digital data most likely will not be in the appropriate format for the desired end use of the data. For example, an image that is ultimately intended to be retrieved as an editable text document, but that is scanned with a low resolution and a high bit depth, will not produce a data file that can be suitably processed through OCR.
Scanned images are often processed after scanning to make the images appear more like the original document. For example, a scanned text document which is intended to be displayed only as a picture or graphic depiction of the original may depict the text on a light gray or slightly yellow background because the digital data representing the background is not always given or assigned a zero value or other numerical value representing 100 percent white. Therefore, the image will not appear like the original. To improve the appearance, the image data file is processed to bring the background closer to white. Additionally, the image data file may be processed to make the text appear sharper. However, if the correct settings are not applied to the scanner, or if the proper destination for the digital data is not selected, the desired processing may not be carried out on the image.
Different hardware and scanners and different environments produce different scan results for a given image. For example, different digital values can be assigned to all black and all white pixels. Consequently, the point at which a pixel will be treated as white or as black may cause some pixels to be identified as black or white and other pixels to be identified as a shade of gray. With color scanners, detected colors, including black and white, may vary as a function of temperature and ambient light. An all black and white image may be converted to digital data that would be displayed with a light gray or light yellow background. Additionally, parts of the black text may be depicted as dark shades of gray. Consequently, if the image is not properly classified as black and white text, the image may not be properly processed, and it would not be displayed in such a way as to look like the original.
Methods and apparatus are provided which improve the likelihood that an image containing text will be properly classified. One or more aspects of the present inventions improve the likelihood that analysis of the image produces the correct classification, both quickly and with little or no user intervention. In one aspect of the present inventions, images can be properly scanned as text only, graphic only or a mix of both without requiring any user input. A processor can make a determination of the proper scan mode, including proper resolution and bit depth, even without user input. The image can be properly classified even during or by the end of a preview scan, not only so that the proper settings can be applied to the scanner for the final scan, but also so that the data file can be placed in the proper format for its ultimate destination.
In accordance with one aspect of one of the present inventions, image analysis can be carried out by analyzing at least part of an image, such as a scanned image, to determine whether the image elements, such as a pixel, are black and white text or graphic. The image elements are grouped by type, and image elements that are identical or sufficiently similar in type and have sufficient proximity are grouped together. An indication is sent to the scanner controller as to how the image has been classified. For example, a processor may send a signal to the scanner controller that the image part analyzed is text only, graphic only or a mixture of text and graphic. In this way, a processor can be used to analyze and classify an image or a part of an image and to apply appropriate settings to the scanner for a final scan. Additionally, the classification of the image or part of the image can be communicated to the user through a user interface, for confirmation or to allow the user to make adjustments, further settings or the like.
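By way of illustration only, the final classification step and the selection of scanner settings might be sketched as follows. The function name, the region labels and the settings table are hypothetical; only the 300 dpi black and white setting for text comes from the description above, and the other values are placeholders chosen for the example.

```python
# Hypothetical settings table. Only "300 dpi, black and white" for text is
# taken from the description; the graphic and mixed values are assumed.
SCAN_SETTINGS = {
    "text": {"dpi": 300, "bit_depth": 1},
    "graphic": {"dpi": 150, "bit_depth": 24},  # assumed placeholder values
    "mixed": {"dpi": 300, "bit_depth": 24},    # assumed placeholder values
}


def classify_image(region_labels):
    """Reduce per-region labels ("text" or "graphic") to one image class."""
    kinds = set(region_labels)
    if not kinds:
        raise ValueError("no regions to classify")
    unknown = kinds - {"text", "graphic"}
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    if kinds == {"text"}:
        return "text"
    if kinds == {"graphic"}:
        return "graphic"
    return "mixed"


def settings_for(region_labels):
    """Return the image class and the settings a controller could apply."""
    image_class = classify_image(region_labels)
    return image_class, SCAN_SETTINGS[image_class]
```

In such a sketch, the returned class is what would be signaled to the scanner controller or shown to the user for confirmation.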
In another aspect of one of the present inventions, a threshold value for what image elements will be treated as all white or all black can be determined dynamically. For example, as a scan progresses, the background of the image may change slightly, the ambient light or the equipment temperature may change slightly and affect the image scan. By dynamically adjusting the threshold value for black or white, these variations can be taken into account so that the resulting file is a closer representation of the original image.
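One possible way to adjust such a threshold as a scan progresses is sketched below. It is an illustrative assumption rather than the method of the inventions: the class name, the smoothing factor `alpha`, the `margin` fraction, the 5 percent sampling of brightest and darkest pixels, and the 8-bit convention (0 is black, 255 is white) are all chosen for the example.

```python
class DynamicThreshold:
    """Track paper and ink levels strip by strip and derive thresholds.

    Assumes 8-bit gray values where 0 is black and 255 is white.
    """

    def __init__(self, alpha=0.25, margin=0.15):
        self.alpha = alpha      # weight given to the newest strip
        self.margin = margin    # fraction of the ink-to-paper span
        self.white_level = None
        self.black_level = None

    def update(self, strip):
        """strip: list of rows of gray values. Returns (white_thr, black_thr)."""
        values = sorted(v for row in strip for v in row)
        if not values:
            raise ValueError("strip contains no pixels")
        # Average the brightest and darkest 5% as paper and ink estimates.
        k = max(1, len(values) // 20)
        white = sum(values[-k:]) / k
        black = sum(values[:k]) / k
        if self.white_level is None:
            self.white_level, self.black_level = white, black
        else:
            # Exponential moving average so slow drift is followed smoothly.
            a = self.alpha
            self.white_level = (1 - a) * self.white_level + a * white
            self.black_level = (1 - a) * self.black_level + a * black
        span = self.white_level - self.black_level
        # Pixels at or above white_thr count as white, at or below black_thr as black.
        return (self.white_level - self.margin * span,
                self.black_level + self.margin * span)
```

If the background darkens slightly from one strip to the next, for example because of a change in ambient light, the white threshold returned by `update` falls with it instead of remaining fixed.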
In a further aspect of one of the present inventions, gradient analysis is used to help identify whether an image is changing between dark and light. The gradient analysis helps to define boundaries in the image, which also helps to group identical or sufficiently similar image elements.
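A minimal sketch of a gradient computation along a single scan line is given below, assuming 8-bit gray values (0 is black, 255 is white) and a central-difference estimate. The function names and the `min_step` cut-off are hypothetical and are used only to illustrate how dark-to-light transitions could be located.

```python
def row_gradient(row):
    """Central-difference gradient of one scan line of gray values.

    Negative values indicate a light-to-dark change, positive values a
    dark-to-light change.
    """
    n = len(row)
    if n < 2:
        return [0] * n
    grad = [row[1] - row[0]]                     # forward difference at start
    for i in range(1, n - 1):
        grad.append((row[i + 1] - row[i - 1]) / 2)
    grad.append(row[-1] - row[-2])               # backward difference at end
    return grad


def edge_positions(row, min_step=40):
    """Indices where the gradient magnitude suggests a boundary."""
    return [i for i, g in enumerate(row_gradient(row)) if abs(g) >= min_step]


# A dark stroke three pixels wide on a light background.
line = [250, 250, 250, 20, 20, 20, 250, 250]
boundaries = edge_positions(line)
```

The positions flagged in this way bracket the dark stroke, and runs of pixels between flagged positions are candidates for being grouped together.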
In accordance with another aspect of one of the present inventions, an image or parts of an image can be analyzed even while the scan is progressing. For example, data from a scan can be received by a processor on a line-by-line basis or in strips with groups of lines as the preview scan or original scan progresses. The real-time analysis is especially helpful in identifying images that are a mixture of text and graphic, and therefore would not be scanned or processed as text only or graphic only. In such a case, for example, the image analysis can be immediately terminated and the image data sent directly to the processor for final processing and/or output to the ultimate destination. Such an image would not be a candidate for text only or graphic only processing.
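The early termination described above might be sketched as follows. The function name and the per-strip classifier passed in as `classify_strip` are hypothetical; the sketch assumes that each strip can be labeled "text", "graphic", "mixed" or "blank" as it arrives.

```python
def analyze_stream(strips, classify_strip):
    """Classify a scan strip by strip, stopping once it is known to be mixed.

    strips: any iterable yielding strips in scan order.
    classify_strip: callable returning "text", "graphic", "mixed" or "blank".
    Returns (classification, number_of_strips_consumed).
    """
    seen = set()
    consumed = 0
    for strip in strips:
        consumed += 1
        kind = classify_strip(strip)
        if kind == "mixed":
            seen.update(("text", "graphic"))
        elif kind in ("text", "graphic"):
            seen.add(kind)
        # Both kinds observed: further analysis cannot change the outcome.
        if len(seen) == 2:
            return "mixed", consumed
    if not seen:
        return "blank", consumed
    return seen.pop(), consumed
```

Because the loop returns as soon as both kinds have been observed, the remaining strips are never analyzed and can be passed on directly for final processing.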
In another aspect of one of the present inventions, part of an image classification may include classifying an image element or pixel as white, white edge, gray, gray edge or black. These classification labels may be particularly helpful in identifying text-only images. In one form of one aspect of the inventions, the pixels may be classified using a gray scale method, and in another form they may be classified using a chroma method. The pixel classification may be made with any desirable tolerances, as selected by the person designing the software. For example, pixels of widely varying characteristics can be classified into relatively few classifications, so that pixel detail is ignored, lost or discarded in favor of fewer classifications. Alternatively, a larger number of pixel types can be accounted for by having more classifications. In either case, the grouping of image elements by type can be carried out as tightly or loosely as desired, and the method may depend on whether the tight thresholds are set early in the process at the pixel classification stage or later during the grouping. For example, pixel anomalies, such as a single lighter pixel in a group of dark pixels, can be classified more loosely as a black pixel along with the adjacent pixels by setting a wider range or different threshold than otherwise. Thereafter, all the pixels could be grouped as text, for example. Alternatively, the lighter pixel could be classified differently than the surrounding pixels with more precision, and thereafter grouped with them by setting a lower threshold or cut-off for the number of dark pixels in a region that are used to make an all black region.
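A gray scale form of such a classification might be sketched as below. The rule used for the edge labels is an assumption made for illustration: a white or gray pixel is treated as an edge pixel when the local gradient magnitude is large. The function name, the threshold values and the 8-bit convention (0 is black, 255 is white) are likewise hypothetical.

```python
def classify_pixel(value, gradient, white_thr=200, black_thr=60, edge_step=40):
    """Label one pixel as white, white edge, gray, gray edge or black.

    value: gray level of the pixel (0 is black, 255 is white).
    gradient: local gradient at the pixel, in gray levels per pixel.
    The edge rule (large gradient magnitude) is an assumed interpretation.
    """
    if value <= black_thr:
        return "black"
    base = "white" if value >= white_thr else "gray"
    if abs(gradient) >= edge_step:
        return base + " edge"
    return base


# A slightly lighter pixel inside a dark stroke: gray under the default
# tolerance, black when the black threshold is loosened.
tight = classify_pixel(90, 0)
loose = classify_pixel(90, 0, black_thr=100)
```

The last two lines illustrate the tolerance choice discussed above: widening the black range absorbs the anomalous pixel at the classification stage, whereas the tighter setting leaves it to be absorbed later during grouping.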
In a further aspect of one of the present inventions, components of the image can be connected so as to create regions, groups or aggregations of pixels that have been assigned an identical classification. In one form, pixels are grouped according to two different levels, such as background versus non-background. In another form, they may be classified as black, gray, gray edge, color, color edge, white, and so on. These latter classifications may be sub-classifications, and if a majority of the pixels have been assigned a single sub-classification, they may be classified as text-only pixels. Otherwise, they may be classified as graphic or pictorial.
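One conventional way to form such regions is a flood fill over adjacent pixels that share a label, followed by a majority test on the sub-classifications. The sketch below assumes 4-connectivity and a strict majority (more than half); both choices, and the function names, are illustrative assumptions.

```python
from collections import Counter, deque


def connected_regions(labels):
    """Group 4-connected pixels that share a label.

    labels: 2-D list of per-pixel labels (rows of equal length).
    Returns a list of (label, [(row, col), ...]) tuples.
    """
    height = len(labels)
    width = len(labels[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y in range(height):
        for x in range(width):
            if seen[y][x]:
                continue
            label = labels[y][x]
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx]
                            and labels[ny][nx] == label):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append((label, pixels))
    return regions


def classify_group(sub_labels):
    """Return "text" if one sub-classification holds a strict majority.

    sub_labels: non-empty list of sub-classification labels.
    """
    _, top_count = Counter(sub_labels).most_common(1)[0]
    return "text" if top_count > len(sub_labels) / 2 else "graphic"
```

For a two-level grouping, the labels could simply be background and non-background; for the finer form, the sub-classification labels of the pixels in each region would be passed to `classify_group`.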
The evaluation of an image may be done on a pixel-by-pixel basis to determine whether each pixel is black and white or graphic.