1. Field of Invention
This invention relates to optical scanners, and more specifically to devices that are designed to scan single lines of text characters.
2. Description of Prior Art
Early patents in the field of optical scanners designed for scanning single lines of text either addressed the use of two-dimensional imaging detectors or, where a one-dimensional scan detector array was employed, did not discuss the problem of scan speed or speed uniformity.
Scanners employing two-dimensional array detectors are disclosed in U.S. Pat. Nos. 4,817,185 by Yamiguchi et al., 4,158,194 by McWaters et al., and 3,947,817 by Requa (deceased) et al.
One-dimensional array "free running" scanners are disclosed in U.S. Pat. Nos. 5,256,866 by Conversano et al. and 4,497,261 by Ishikawa et al. As discussed below, these patents do not address the problem of accurately reproducing the text image in a computer.
With the development of sophisticated optical character recognition (OCR) software, the necessity of having an accurate image reproduction of the scanned text became apparent; that is, the image of the text had to have the proper aspect ratio in order that the OCR software could function efficiently (aspect ratio is the ratio of the text height to the text width). If the aspect ratio is not precisely represented in the stored image of the text, then the OCR software is unable to accurately recognize a large percentage of the scanned characters, leaving an unacceptable number of characters to be manually processed.
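The dependence of the stored aspect ratio on scan speed can be illustrated with a short numerical sketch. It is not taken from any of the cited patents. It assumes a hypothetical free running scanner whose detector array fixes the resolution along the text height, while the resolution along the text width is the line rate divided by the hand speed. The function name, the parameter names, and the numerical values are all assumptions chosen for illustration.

```python
def stored_aspect_ratio(char_height_in, char_width_in, detector_dpi,
                        line_rate_hz, hand_speed_ips):
    """Height-to-width ratio, in pixels, of a character as stored.

    Hypothetical model: the detector array samples the character height
    at detector_dpi, while a free running scanner takes line_rate_hz
    line scans per second as the hand moves at hand_speed_ips inches
    per second along the line of text.
    """
    rows = char_height_in * detector_dpi                      # pixels along the height
    cols = char_width_in * line_rate_hz / hand_speed_ips      # line scans across the width
    return rows / cols


if __name__ == "__main__":
    # Illustrative character: 0.10 inch high, 0.05 inch wide (true ratio 2.0).
    for speed in (1.0, 2.0, 4.0):
        ratio = stored_aspect_ratio(0.10, 0.05, 300, 600, speed)
        print(f"hand speed {speed} in/s: stored aspect ratio {ratio:.2f}")
```

In this model the stored ratio equals the true ratio only at the one hand speed for which the line rate divided by the speed matches the detector resolution (here 2 inches per second). At any other speed the character is stretched or compressed along the scan direction.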
With this insight concerning OCR software limitations, the next generation of text scanners, which employed one-dimensional linear detector arrays that were to be scanned along a line of text, employed an independent wheel or ball to roll along the surface of the document being scanned. This wheel was employed to provide tachometer-like information so that the detector array could be commanded to scan the document at preset and rigidly uniform distances along the line of text. Thus, for example, a scan could be initiated every 1/300 of an inch and the resultant text image would have an image resolution of 300 dots per inch (provided that the detector array had such resolution itself). This action is variously described as "tachometer", "synchronized", "commanded", or "encoded" scanning. Such scanner devices are disclosed in U.S. Pat. Nos. 5,595,445 by Bobry, 5,301,243 and 5,574,804 by Olschafskie et al., 5,430,558 by Sohaei et al., 5,175,422 by Koizumi et al., 5,083,218 by Takasu et al., 5,012,349 by deFay, and 4,797,544 by Montgomery et al. One of the major problems with such encoder devices is that it is difficult to ensure that they track along the substrate, that is, the surface upon which the text characters reside, in a reliable and reproducible manner; they frequently slip or skip along the substrate and thereby fail to reliably reproduce an accurate text image. Also, continued use tends to produce a "dirty" encoder wheel/ball mechanism due to pickup of debris from the substrate. The mechanism therefore requires continued attention and cleaning in order to provide an accurate encoding system. These are significant problems that are circumvented by this disclosure.
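The commanded scanning described above can be sketched as follows. This is a minimal illustration only, not the implementation of any cited patent. It assumes an ideal encoder that reports the distance travelled without slipping, and the function name encoder_trigger_positions and the sample positions are hypothetical.

```python
def encoder_trigger_positions(reported_positions_in, dpi=300):
    """Return the positions (in inches) at which a line scan is commanded.

    Hypothetical model: the encoder wheel reports the distance travelled
    along the line of text, and one line scan is commanded for every
    1/dpi inch of travel, regardless of how fast the hand is moving.
    """
    step = 1.0 / dpi
    triggers = []
    next_index = 0
    for x in reported_positions_in:
        # Fire every scan whose trigger distance has now been reached.
        while next_index * step <= x:
            triggers.append(next_index * step)
            next_index += 1
    return triggers


if __name__ == "__main__":
    # Unevenly spaced position reports, as produced by a non-uniform hand speed.
    reports = [0.0, 0.004, 0.011, 0.021]
    for position in encoder_trigger_positions(reports):
        print(f"line scan commanded at {position:.5f} in")
```

Although the reported positions are unevenly spaced, the commanded scans fall on a uniform 1/300 inch grid. The sketch also makes the weakness of the approach visible: if the wheel slips, the reported positions are wrong and the grid is no longer uniform on the substrate.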
Of these patents, those by Olschafskie et al. (U.S. Pat. Nos. 5,301,243 and 5,574,804) describe the most compact of such devices in that the described scanner is pen-shaped and pen-sized. It retains a small ball-shaped roller in the scanner nose piece, which is employed as a scan encoder. Because the mechanical components of the encoder are so small, the problems of failure to track accurately and of debris dirtying the mechanism will be more severe for this device than for the other, larger systems.
One of the primary claims in the Olschafskie et al. patents is for a scanner which has a transparent nose piece that permits close observation of the text being scanned and thus permits more precise scanning.
An additional problem which is present in "free hand" scanning is that the scanner is generally not precisely aligned with the baseline of the text being scanned. This is known as baseline "skew". Additionally, the scan baseline is usually not a straight line, but rather is curved or wavy.
One approach to compensating for both of these problems is disclosed in U.S. Pat. No. 5,638,466 by Rokusek. The approach disclosed in this patent is very careful and is quite complex in application. In our work we discuss a considerably simpler methodology for baseline correction that proves quite adequate for the situation in which text of a known (or readily determined) height is being scanned.