1. Field of the Invention
This invention relates to image processing generally and more specifically to text recognition by Fourier transform correlation of imaged multi-line, paged text.
2. Description of the Related Art
Recognition of text in an imaged text database is required for multiple purposes. For example, it is often necessary to locate a text page in a larger textual database or "book"; it is also sometimes useful to identify duplicate pages, which can then be deleted to compress the database without loss of information.
Such a search is relatively easy in the context of encoded information, where characters and words are encoded as a sequence of digital bytes. However, imaged pages, in which the text pages are bitmaps or other graphic representations, are not so easily compared by a computer.
One method of comparing graphic images is cross-correlation, which is usually performed by first two-dimensionally Fourier transforming the images to be compared, then multiplying the transforms point by point (one transform being taken as its complex conjugate), and finally inversely transforming the product back into a spatial representation to show correlation peaks. This well-known method has been discussed, for example, in John C. Russ, The Image Processing Handbook, (CRC Press, 1992), pages 218-221. Advantages in speed are potentially obtained by performing such correlations optically, by an optical correlator. See, for example, U.S. Pat. No. 5,311,359 to Lucas et al., and U.S. Pat. No. 5,148,496 to Anderson. Both of these patents disclose compact optical correlators capable of performing cross-correlation of digitized, pixellated images.
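By way of illustration only, the transform, multiply, and inverse-transform sequence described above may be sketched digitally as follows. The sketch assumes NumPy and two equal-sized grayscale arrays; the function names `cross_correlate` and `peak_location` are hypothetical, and the mean subtraction is an added assumption intended to keep a uniform page background from dominating the result. It is not the optical implementation of the cited patents.

```python
import numpy as np


def cross_correlate(a, b):
    """Circular cross-correlation of two equal-sized 2-D images via the FFT."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    # Assumption: subtract the mean so a uniform background does not dominate.
    fa = np.fft.fft2(a - a.mean())
    fb = np.fft.fft2(b - b.mean())
    # Point-by-point product, with one transform complex conjugated,
    # followed by the inverse transform back to the spatial domain.
    return np.fft.ifft2(fa * np.conj(fb)).real


def peak_location(corr):
    """Return the (row, column) index of the strongest correlation peak."""
    return np.unravel_index(np.argmax(corr), corr.shape)
```

When one image is a circularly shifted copy of the other, the peak falls at the (row, column) offset between them; the height of the peak relative to the remainder of the correlation plane may serve as a rough measure of similarity.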
While correlation of images performs well with rotationally aligned images, pages of text are typically not well aligned rotationally. Text pages are usually digitized by feeding them through a digitizing "scanner", or by imaging them through a digital camera or similar device. Imprecision in feeding and scanning hardware produces varying rotational misalignments in the digitized images. The resulting images are rotated ("skewed") with respect to horizontal and vertical axes. Two otherwise duplicate images which differ by a slight rotation will not produce a strong correlation when compared. This degradation of correlation with skew angle is so pronounced that a misalignment in the range of only 1-2 degrees will significantly degrade correlation. Therefore, a method of rotationally correcting scanned text is a prerequisite to identification of scanned text by correlation.
One method of rotationally correcting text is disclosed in U.S. Pat. No. 5,235,651 to Nafarieh (1993). This method operates in the context of an optical character recognition ("OCR") system, and is limited in its ability to correct for rotational error. Specifically, the patented system only detects and corrects for inversion of the page, or rotation by 90 degrees (a sideways page). While these corrections may be useful in an OCR system, they are not adequate to allow rapid identification of duplicate imaged pages, which might have errors in rotation of (for example) five degrees or less.
Another method of rotationally correcting images is disclosed by Postl in his U.S. Pat. No. 4,723,297 (1988). His method involves scanning the image repeatedly at varying search angles, optimizing "directional criteria", and then rotating the image based on the optimized directional criteria. The disclosed method only rotationally corrects skew in images during acquisition; it does not identify duplicate images. It requires many iterations to optimize, is computationally complex, and requires the predetermination of the "directional criteria." Various other methods have been developed for detecting rotational or "skew" angle in text, but they have generally been mathematically very complex or computationally demanding. Efforts at improvement have focused on reducing the computational demands of the method. See for example, U.S. Pat. No. 5,583,956 to Aghajan, et al., disclosing a method using a subspace-based line detection algorithm, and the other methods cited therein.
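The general notion of searching over candidate angles for the one that optimizes a directional measure may be illustrated by the following minimal sketch. It is a generic projection-profile search, assuming NumPy and a binary image in which nonzero pixels are ink; it is not the procedure of the Postl patent or of any other cited reference. The names `estimate_skew` and `synthetic_page`, and the sum-of-squares sharpness score, are hypothetical choices made for illustration.

```python
import numpy as np


def estimate_skew(image, candidate_angles_deg):
    """Return the candidate angle (degrees) giving the sharpest projection profile."""
    ys, xs = np.nonzero(image)
    if ys.size == 0:
        raise ValueError("image contains no ink pixels")
    best_angle, best_score = None, -np.inf
    for angle in candidate_angles_deg:
        t = np.deg2rad(angle)
        # Project each ink pixel onto the axis perpendicular to the trial
        # text-line direction; aligned lines collapse into narrow spikes.
        r = ys * np.cos(t) - xs * np.sin(t)
        edges = np.arange(np.floor(r.min()), np.ceil(r.max()) + 2)
        profile, _ = np.histogram(r, bins=edges)
        # Assumed criterion: the sum of squared bin counts rewards spiky profiles.
        score = float(np.sum(profile.astype(float) ** 2))
        if score > best_score:
            best_angle, best_score = float(angle), score
    return best_angle


def synthetic_page(skew_deg, height=128, width=200):
    """Hypothetical test page: straight one-pixel 'text lines' at a known skew."""
    page = np.zeros((height, width), dtype=np.uint8)
    slope = np.tan(np.deg2rad(skew_deg))
    xs = np.arange(width)
    for y0 in range(20, 100, 12):
        ys = np.rint(y0 + xs * slope).astype(int)
        keep = (ys >= 0) & (ys < height)
        page[ys[keep], xs[keep]] = 1
    return page
```

For example, `estimate_skew(synthetic_page(3.0), np.arange(-5, 6))` evaluates eleven integer candidate angles in turn. The need to recompute a profile for every candidate illustrates why search-based approaches of this general kind become costly as the angular resolution is refined.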