1. Technical Field
The invention relates to the conversion of images of symbols to an electronic format. More particularly, the invention relates to the deskewing of images of symbols, where the symbols are positioned along a baseline that is not a straight line, but rather is a wavy or a curved line.
2. Description of the Prior Art
Over the past 40 years, the nature of the work force has changed dramatically. The post-industrial era of the late 1950's paved the way for the information era of today. Although the number of manufacturing jobs has steadily decreased, there has been a significant increase in the number of information-related jobs. However, despite the widespread use of computers, a majority of this information still resides on paper. Thus, data in an electronic format still represents only a small percentage of the total amount of information used by most organizations.
Text recognition technology provides a useful tool for converting information stored on paper into an electronic format. Optical character recognition (OCR) technology typically converts the electronic output of a scanner into computer usable files through a series of complex computer algorithms. The electronic image produced by the scanner is composed of white or black picture elements, referred to as pixels, and has a desired resolution, which for text is presently 90,000 pixels per square inch (equivalent to 300 pixels per inch in each direction). Each pixel is rendered as a digital value of 1 or 0, which represents either a white pixel or a black pixel. Before optically scanned text is actually recognized, it may be displayed and manipulated as an image on a computer monitor. At this point, the electronic information has not yet been recognized as text, but is merely an image or picture of the text.
OCR algorithms typically recognize scanned text images in two steps. First, they analyze the image of the page to determine which parts of the image are text and numeric data, and determine the structure of the page layout. For example, tables, columns, and paragraphs are identified and located. Next, the characters are examined and identified to produce a file of character data contained in words, including the page formatting information, such as tables, columns, paragraphs, spacing, bold characters, italics, and underlining, that is necessary to allow manipulation of the data as a text file.
Deskewing is a well understood problem in imaging, especially as related to OCR technology. In page scanners, the information that is to be imaged is placed on a flat platen or fed through a set of rollers. In hand held scanners, such as the Omniscan product manufactured by Caere of Los Gatos, Calif., the user pulls the scanner vertically down the page; with the DataPen product manufactured by Primax Electronics of Taiwan, R.O.C., the user pulls the scanner horizontally across the page, typically resulting in a wavy baseline (see, for example, U.S. Pat. Nos. 5,182,450 and 5,301,243). In both cases the horizontal baseline of symbols may not be flat across the image, but may slope upwards or downwards at a constant angle. This skew confuses the OCR algorithms, and a deskewing process therefore needs to be implemented to recreate a horizontal baseline. While some OCR algorithms are tolerant of skew, the majority of such algorithms, including the best of them, are not.
L. Barski, Rotationally Impervious Feature Extraction For Optical Character Recognition, U.S. Pat. No. 5,054,094 (1 Oct. 1991) discloses a method of creating rotationally impervious feature extraction for OCR. Such feature extraction is accomplished through radial intercept feature extraction, counting the number of intercepts for circles of varying radii. This technique is designed to recognize an individual character, but not a line of symbols. Thus, it does not solve the problem of displacement of a sequence of symbols from a straight baseline.
G. Kulik, Method of Deskewing An Image, U.S. Pat. No. 5,233,168 (3 Aug. 1993) discloses a method for derotating an image by shifting pixels. This is a computationally simple method of derotating an image. However, it does not solve the problem of recognizing a sequence of symbols that is positioned along a wavy baseline.
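The general idea of derotating by shifting pixels can be illustrated with a minimal sketch. It is not taken from the cited patent, whose exact procedure is not reproduced here. It assumes a small, constant skew angle, a binary image stored as a list of rows, and a pixel value of 1 for ink; the function name `shear_deskew` and its parameters are hypothetical.

```python
import math

def shear_deskew(image, angle_deg):
    """Approximate a small derotation by shifting each column vertically.

    image: list of rows, each a list of 0/1 pixels (assumed: 1 = ink).
    angle_deg: assumed constant skew angle; positive means the baseline
    drops (row index grows) from left to right.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    slope = math.tan(math.radians(angle_deg))
    out = [[0] * width for _ in range(height)]
    for x in range(width):
        # Number of rows by which this column has drifted off the baseline.
        shift = round(x * slope)
        for y in range(height):
            src = y + shift
            if 0 <= src < height:
                out[y][x] = image[src][x]
    return out
```

Because a single angle determines every column shift, a sketch of this kind can only remove a skew that is constant along the line of text.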
Y. Lee, Method of Detecting The Skew Angle Of A Printed Business Form, U.S. Pat. No. 5,054,098 (1 Oct. 1991) discloses a process in which samples are taken to measure skew, and from which samples a histogram is created. The most frequently occurring skew angle is considered the skew angle for the entire document. The baseline in this technique is assumed to be a straight line.
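Selecting the most frequently occurring angle from a histogram of samples can be sketched as follows. This is an illustration only: how the cited patent measures each sample is not described here, so the sketch assumes the per-sample angles are already available, and the function name `dominant_skew_angle` and the bin width are hypothetical choices.

```python
from collections import Counter

def dominant_skew_angle(samples_deg, bin_width=0.5):
    """Return the centre of the most populated histogram bin.

    samples_deg: skew angles (degrees) measured at different places in
    the document; how they are measured is outside this sketch.
    bin_width: assumed histogram resolution in degrees.
    """
    if not samples_deg:
        raise ValueError("at least one sample is required")
    # Quantize each angle to its nearest bin index and count occurrences.
    bins = Counter(round(a / bin_width) for a in samples_deg)
    best_bin, _count = bins.most_common(1)[0]
    return best_bin * bin_width

# Four samples cluster near 2 degrees; two outliers are ignored.
angle = dominant_skew_angle([1.9, 2.0, 2.1, 2.0, -0.5, 5.0])
```

The result is one angle for the whole document, which is why this kind of approach presupposes a straight baseline.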
A. L. Spitz, Determination Of Image Skew Angle From Data Including Data In Compressed Form, U.S. Pat. No. 5,245,676 (14 Sep. 1993) discloses a method of determining an image skew angle. The problem solved is one of having a baseline with a fixed angle. The method used is to select certain features on the image on which to base a skew angle and assign weights to those features.
In all of the above patents, a single skew angle is assumed to be constant for the entire image or at least for the entire line of text. These patents do not attempt to deskew an image comprising a sequence of characters positioned along a wavy baseline.
It is desirable to produce a hand-held scanner that is able to scan text one line at a time, e.g. as a scanning wand or pen. The user would hold such a scanner like a pen and draw the scanner horizontally across the page to enter information into a computer one line at a time, much as a highlighting pen is used to highlight text. Because of this manual horizontal action, the image created by the scanner is not distorted or skewed in the conventional manner, where the skew is of a constant angle, but is subject to a new problem, referred to herein as baseline waviness.
The baseline waviness is created because there is no flat baseline of symbols, or even a straight, but angled baseline, when hand scanning across a page. Thus, the baseline assumes a wavy profile.
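The difference between a constant-angle skew and a wavy baseline can be made concrete with a small numerical sketch. It is an illustration only: the sinusoidal baseline is an assumed stand-in for hand wobble, and the function name `line_fit_residual` and all numeric values are hypothetical. The sketch fits the best single straight line to a baseline and reports how much vertical error remains.

```python
import math

def line_fit_residual(baseline):
    """Largest vertical error left after removing the best-fit straight line.

    baseline: vertical baseline position sampled at x = 0, 1, 2, ...
    (at least two samples).
    """
    n = len(baseline)
    mean_x = (n - 1) / 2
    mean_y = sum(baseline) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(baseline))
    slope = sxy / sxx  # least-squares slope, i.e. the single skew estimate
    return max(abs(y - (mean_y + slope * (x - mean_x)))
               for x, y in enumerate(baseline))

# Constant-angle skew: one straight line explains the baseline completely.
straight = [0.1 * x for x in range(200)]
# Assumed wavy baseline: 5-pixel amplitude, 100-pixel period.
wavy = [5.0 * math.sin(2 * math.pi * x / 100) for x in range(200)]

straight_error = line_fit_residual(straight)  # essentially zero
wavy_error = line_fit_residual(wavy)          # several pixels remain
```

Under these assumptions, no single angle removes the displacement of the wavy baseline, which is the limitation of constant-angle deskewing discussed above.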
As discussed above, presently known deskewing techniques do not attempt to flatten the waviness of the baseline. Therefore, OCR cannot accurately read information scanned by such devices. Unfortunately, hand held scanners, especially scanning pens, cannot be considered sufficiently reliable for large scale adoption in the information industry until such problems as the wavy baseline produced by hand scanning are resolved.