1. Field of Art
This invention relates to scan image segmentation and more particularly relates to identifying text, image, and line-art content within a scanned document.
2. Background Technology
Conventional digitization of hardcopy documents, such as out-of-print books, is slow and difficult. In part, the difficulty results from the need to identify various types of content within a given document. Some typical content types are text, images, and line art. Text generally includes small characters or symbols that are of a consistent size. Line art generally includes drawings of lines and patterns. Images generally include pictures that use many intermediate levels (such as shades of gray). Text and line art are substantially bi-level (e.g., black and white). In contrast, images have more gray levels (e.g., 256 levels in an 8-bit system), which are rendered in the form of halftones (e.g., spatially distributed high-resolution pixels that approximate an intermediate color or shade, such as gray).
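By way of illustration only, the following minimal sketch shows how a halftone can approximate an intermediate shade using nothing but bi-level pixels. The 2x2 threshold matrix and the names `BAYER_2X2`, `halftone`, and `mean_level` are assumptions made for this example and are not taken from any particular scanning or printing system.

```python
# Hypothetical 2x2 ordered-dither thresholds, scaled to 8-bit levels (assumption).
BAYER_2X2 = [[32, 160],
             [224, 96]]

def halftone(gray, width, height):
    """Render a constant 8-bit gray level as a bi-level (0 or 255) patch."""
    return [[255 if gray > BAYER_2X2[y % 2][x % 2] else 0
             for x in range(width)]
            for y in range(height)]

def mean_level(patch):
    """Average pixel value of the patch, i.e. the shade it approximates overall."""
    pixels = [p for row in patch for p in row]
    return sum(pixels) / len(pixels)

# A mid-gray input becomes a pattern of pure black and pure white pixels
# whose average is close to the requested gray level.
patch = halftone(128, 4, 4)
```

In this sketch every pixel of `patch` is either 0 or 255, yet the average over the patch is close to the requested level of 128, which is the sense in which spatially distributed pixels approximate a shade of gray.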
The term “scanned content” is used herein to refer to any content that is scanned and digitized. The term “image content” refers to a particular type of content, namely halftone images, within scanned content. Other types of content include text and line art, as indicated above. In order to maintain quality reproduction of scanned content, different content types may be processed in different ways. However, in order to process different content types in different ways, each content type may need to be identified first.
One conventional scanning technology employs manual identification of different content types within a document. For example, a person physically draws a rectangular bounding box or other identifier around a content segment to indicate that the circumscribed contents should be processed in a certain way. A content segment refers to a section of the scanned document or scanned content that is of a consistent content type. Content that is outside of the bounding box may be processed according to a default processing mode. So, in one example, a person may draw a rectangular bounding box around a halftone image segment, but not around a text segment. The halftone image may be descreened to remove the halftones, and the text may simply be scaled up to a higher resolution and thresholded to bi-level. Conventional descreening employs an algorithm to smooth halftones into a contone image. Unfortunately, this conventional method of drawing rectangular bounding boxes around images is slow and costly.
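The two processing paths described above can be sketched as follows. This is a simplified illustration under stated assumptions: the box (mean) filter stands in for whatever smoothing algorithm a conventional descreener may use, pixel replication stands in for the scaling step, and the function names, the filter radius, the scale factor, and the threshold level of 128 are all hypothetical.

```python
def descreen(patch, radius=1):
    """Smooth a halftone patch with a box (mean) filter to approximate a contone."""
    height, width = len(patch), len(patch[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Average the neighbours that fall inside the patch boundaries.
            window = [patch[j][i]
                      for j in range(max(0, y - radius), min(height, y + radius + 1))
                      for i in range(max(0, x - radius), min(width, x + radius + 1))]
            row.append(round(sum(window) / len(window)))
        out.append(row)
    return out

def upscale(patch, factor=2):
    """Scale up by pixel replication (nearest neighbour), an assumed method."""
    return [[p for p in row for _ in range(factor)]
            for row in patch for _ in range(factor)]

def threshold(patch, level=128):
    """Force every pixel to pure black (0) or pure white (255)."""
    return [[255 if p >= level else 0 for p in row] for row in patch]

# Image path: a checkerboard halftone is smoothed toward intermediate gray.
checker = [[255 if (x + y) % 2 == 0 else 0 for x in range(4)] for y in range(4)]
contone = descreen(checker)

# Text (default) path: scale up, then threshold to bi-level.
text = threshold(upscale([[30, 200], [220, 10]]))
```

In this sketch the checkerboard of pure black and white pixels becomes a patch of intermediate values, while the text patch ends up at twice the resolution with only the values 0 and 255. Which path a given region takes still depends on the manually drawn bounding box.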
Another conventional scanning technology employs local optimization. Local optimization uses local information in the scanned content to determine if a content segment should be treated as bi-level text and line art or a halftone image. However, local optimization does not have enough information to segment the scanned document into individual content segments that can be processed differently according to the content type of each segment. As a result, local optimization scanning technologies can result in decreased image quality.
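For illustration only, one hypothetical form of such a local decision is sketched below. The heuristic (counting pixels at intermediate levels within a small tile), the tile size, the cutoff values, and the names `classify_window` and `classify_page` are assumptions made for this example and do not describe any specific conventional product.

```python
def classify_window(window, low=64, high=192, max_mid_fraction=0.1):
    """Label one tile using only its own pixels (hypothetical heuristic)."""
    pixels = [p for row in window for p in row]
    # Count pixels that are neither near black nor near white.
    mid = sum(1 for p in pixels if low < p < high)
    return "bi-level" if mid / len(pixels) <= max_mid_fraction else "halftone"

def classify_page(page, size=2):
    """Label each size-by-size tile independently of its neighbours."""
    labels = []
    for y in range(0, len(page), size):
        labels.append([classify_window([row[x:x + size] for row in page[y:y + size]])
                       for x in range(0, len(page[0]), size)])
    return labels

# Left tile holds only extreme values; right tile holds intermediate values.
page = [[0, 255, 120, 130],
        [255, 0, 110, 140]]
labels = classify_page(page)
```

Because each tile in this sketch is labeled from its own pixels alone, the result is a grid of independent per-tile decisions rather than a set of content segments with known extents.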
From the foregoing discussion, it should be apparent that a need exists for an apparatus, system, and method that overcome the limitations of conventional scanning segmentation technologies. Beneficially, such an apparatus, system, and method would be faster and simpler than manual segmentation. Additionally, such an apparatus, system, and method would be more accurate than local optimization technologies.