1. Field of the Invention
The present invention relates generally to electronic scanning devices, and, more particularly, to a system and method for manipulating regions in a scanned image.
2. Related Art
Scanning devices are useful in many applications where it is desirable to transfer an image from printed form into electronic form. Scanners capable of reading and converting a page into electronic format have been available for quite some time. Typically, a scanner will electronically read a page, classify the different types of images on the page and electronically store the information for later presentation and use. The classifications applied to a scanned page typically include text, photographs, drawings, charts, tables, business graphics, equations, handwriting, logos, etc. These different parts of a scanned image are typically classified into regions by a user after the scanner has scanned the page. Some scanners are capable of determining the classifications of particular regions of a scanned page in accordance with predetermined instructions.
For example, in a page including text and drawings, some scanners will scan the page and will classify and store the text information as a text region and will classify and store the drawing information as a drawing region, and so on. Unfortunately, it is possible for the scanner to interpret some regions in a manner different from that which a user desires. Multiple interpretations of the same scanned information are possible regarding the attributes of a scanned image that should be presented to a user. For example, a scanner, or more properly the scanner analysis code, might classify a particular section of text (e.g., a large capitalized first letter of a paragraph) as drawing information. As another example, the scanner analysis code must determine whether text over a colored background should be presented as text or as part of a bitmap that includes all of the background pixels. These predefined classifications may be acceptable for some users of the scanner, but other users may wish to alter, or adjust, the sensitivity of the scanner for each classified region type, or to alter the manner in which the regions are grouped, or clustered.
For example, one user might wish only to scan a page for the purposes of entering the information on that page into a word processing document with no further manipulation desired. A different user may wish to scan the same page for the purposes of manipulating the information on the page in a more sophisticated manner.
Furthermore, once a region or set of regions is analyzed and interpreted by the scanner in a particular format and presented to a user, the regions are typically non-adjustable.
Generally, there are four principal classes of regions.
1. "Primitive" vs. "Composite"
A primitive region is the simplest possible representation of a region. For text, therefore, a primitive is a single word. For a table, a primitive is a single cell. For a business graphic, a primitive is a single graphic element or a single textual element.
A composite region is comprised of two or more region primitives. For example, a text paragraph is itself comprised of text line composites, which are comprised of text word primitives. Tables are comprised of their cell, horizontal rule, vertical rule, column and row primitives. Charts, graphs and equations are comprised of combinations of text, mathematical character, rule and drawing primitives. Boxes and cartoons are comprised of drawing, text and/or handwriting primitives.
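The primitive/composite relationship can be pictured as a tree in which composites hold other regions and primitives are the leaves. The following is only a minimal sketch under stated assumptions: the `Region` class, its `kind` and `children` fields, and the `primitives` helper are hypothetical names chosen for illustration and are not taken from any scanner software.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Region:
    """Hypothetical region node; a region with no children is a primitive."""
    kind: str  # e.g. "word", "line", "paragraph", "cell"
    children: List["Region"] = field(default_factory=list)

    def is_primitive(self) -> bool:
        # Assumption: a primitive is modeled as a node without sub-regions.
        return not self.children

    def primitives(self) -> List["Region"]:
        """Collect every primitive beneath this region, depth first."""
        if self.is_primitive():
            return [self]
        found: List["Region"] = []
        for child in self.children:
            found.extend(child.primitives())
        return found


# A paragraph composite built from two line composites,
# each of which is built from word primitives.
line_one = Region("line", [Region("word"), Region("word"), Region("word")])
line_two = Region("line", [Region("word"), Region("word")])
paragraph = Region("paragraph", [line_one, line_two])

word_count = len(paragraph.primitives())
```

In this sketch, regrouping a paragraph into lines or words amounts to choosing a different depth of the same tree.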
2. "Enclosed" vs. "Containing"
An "enclosed" region is a region whose entire set of pixels falls within the boundary of another "containing" region. Important examples of enclosed/containing region combinations include text, photographs, etc., that are within containing boxes; cells within tables that have containing rules; regular or "inverse" (i.e., lighter) text over a photograph or drawing; text on business graphics; and text over shaded (often uniformly shaded, or highlighted) backgrounds.
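The enclosed/containing test can be illustrated with a simple geometric check. The sketch below assumes, purely for illustration, that each region is approximated by an axis-aligned bounding box in pixel coordinates; actual region boundaries may be irregular, and the `Box` type and `encloses` function are hypothetical names.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (illustrative model)."""
    left: int
    top: int
    right: int
    bottom: int


def encloses(outer: Box, inner: Box) -> bool:
    """True when the whole of `inner` lies within the boundary of `outer`."""
    return (outer.left <= inner.left and outer.top <= inner.top
            and inner.right <= outer.right and inner.bottom <= outer.bottom)


photo = Box(100, 100, 500, 400)        # candidate containing region
caption = Box(120, 350, 480, 390)      # text laid over the photograph
margin_note = Box(450, 380, 560, 420)  # overlaps the photo but spills outside

caption_enclosed = encloses(photo, caption)
note_enclosed = encloses(photo, margin_note)
```

A region that merely overlaps another, such as `margin_note` here, does not qualify as enclosed under this test because part of it lies outside the candidate containing region.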
3. "Foreground" vs. "Background"
A "foreground" region is a region intended to convey information such as text, photographs, drawings, equations, handwriting, tables, graphics, etc. A "background" region is not intended to convey information, but is often intended to provide segmentation of a document or isolation of one segment of a document from another. Background regions also include such elements as the scanner lid (which may be white, black or gray); the lid of an automated document feeder, which may include non-uniform areas; and "fringing" patterns caused by the edge of the scanbed and by the three-dimensional aspects of the scanned document (e.g., the sides of pages of a book that is being scanned).
4. "Hidden" vs. "Visible"
"Hidden" regions are regions that have been identified by the document analysis code but are not presented to a user. Examples include obvious "junk" regions on the document such as page folds, staple marks, punch holes and blotches; background regions that are assumed to be less important to the user than the overlying regions (e.g., text or photographs); and regions corresponding to the scanning process (e.g., the fringes along the edge of the scanner, the scanner lid, or the automatic document feeder footprint). "Visible" regions are the set of regions identified by the analysis code that are presented to the user.
These regions are presented to the user by an automated document processing system (the page analysis code) contained within the scanner software, typically during a "scan preview" operation. During scan preview, the user views on a display the image that will be scanned. The information viewed includes the region types and the information contained within each region. In the past, a user of a scanner has been unable to alter the information contained within each region or the format in which the regions are presented.
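The hidden/visible distinction described above can be viewed as a filtering step applied before scan preview. The following is a hedged sketch only: it assumes the analysis code tags each detected region with a foreground flag and a "junk" flag, and the `DetectedRegion` class and `visible_regions` function are hypothetical names that do not come from any particular scanner software.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DetectedRegion:
    """Hypothetical record for one region found by the analysis code."""
    label: str          # e.g. "text", "photograph", "blotch"
    foreground: bool    # True if the region is meant to convey information
    junk: bool = False  # True for folds, staple marks, punch holes, blotches


def visible_regions(regions: List[DetectedRegion]) -> List[DetectedRegion]:
    """Keep foreground, non-junk regions; all other regions stay hidden."""
    return [r for r in regions if r.foreground and not r.junk]


detected = [
    DetectedRegion("text", foreground=True),
    DetectedRegion("photograph", foreground=True),
    DetectedRegion("scanner lid", foreground=False),
    DetectedRegion("blotch", foreground=True, junk=True),
    DetectedRegion("shaded background", foreground=False),
]

# Labels of the regions that would be shown during scan preview.
shown = [r.label for r in visible_regions(detected)]
```

Because the filter in this sketch is fixed, the user cannot promote a hidden region or demote a visible one, which mirrors the limitation noted above.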