1. Field of the Invention
The present invention relates to improvements in methods and apparatuses for image processing, and more particularly to an improved method and apparatus for automatically determining and correcting the orientation of an image for document processing, specifically for determining and correcting the orientation of an image of a document that is 90° or 180° out of alignment.
2. Background and References
The present invention has wide application in document image processing, both in traditional optical character recognition processing and in scanned image processing. Optical character recognition (OCR) systems typically accept as input a bitmap representation of an image, generally created by an optical scanner. In the operation of such systems, a user usually feeds the page or pages to be scanned into the document scanner by hand. A serious and common problem occurs when the user feeds the pages in the wrong orientation, for example, with the top of the image on the document rotated either 90° (clockwise or counter-clockwise) or 180° (upside down) with respect to the expected orientation of the characters on the page.
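By way of illustration only, the misorientations described above can be modeled on a small bitmap. The following is a minimal sketch, assuming the bitmap is held as a row-major list of rows; the function name rotate_bitmap is hypothetical and the sketch is not the method of the present invention. It shows only what a rotation by a multiple of 90 degrees does to the pixel layout, and that the corresponding reverse rotation restores the expected orientation.

```python
def rotate_bitmap(bitmap, degrees_clockwise):
    """Return a copy of a row-major bitmap rotated clockwise.

    `bitmap` is assumed to be a list of equal-length rows; only
    multiples of 90 degrees are supported (hypothetical helper).
    """
    if degrees_clockwise % 90 != 0:
        raise ValueError("rotation must be a multiple of 90 degrees")
    turns = (degrees_clockwise // 90) % 4
    result = [list(row) for row in bitmap]
    for _ in range(turns):
        # One clockwise quarter turn: reverse the row order, then transpose.
        result = [list(column) for column in zip(*result[::-1])]
    return result


if __name__ == "__main__":
    page = [[1, 2],
            [3, 4]]
    misfed = rotate_bitmap(page, 180)       # page fed upside down
    restored = rotate_bitmap(misfed, 180)   # correction by the same angle
    print(misfed, restored)
```

A page fed 90 degrees clockwise would be corrected by a further 270-degree clockwise turn, that is, by a 90-degree counter-clockwise turn.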
Present day commercial OCR systems do not typically check for such a misorientation condition. Instead, they proceed to process the document as normal, by segmenting the page and sending bitmaps of the individual characters to the character classifier module. Because the bitmaps are rotated from the expected orientation, the results are poor. Usually few characters are identified at all, and the characters that are identified are generally incorrectly classified. In addition, a long time is required to arrive at the bad results, since the patterns are unfamiliar to the classifier, and the classifier typically tries several subroutines that are slow and not expected to be run often. Finally, no warning is given to the user that the results are bad.
U.S. Pat. No. 2,905,927 to Reed describes a method and apparatus for recognizing words that employs three scans to determine the characteristics or pattern of the word to be identified. The upper scan obtains information indicating the number and position of full-height symbols and the lower scan derives information indicative of symbols extending below the base line. The center scan acquires information relative to the number of symbols in the word and the symbol spacing.
U.S. Pat. No. 4,953,230 to Kurose describes a system for processing and correcting the skew of a document image that scans the document and determines the location of a first appearing black pixel. Thereafter, when a subsequent contiguous scan detects no black pixels, it is determined that the right-hand end of the character line has been found. The skew of the document is then computed from the line numbers and widths of the character lines, giving the skew or slope of the character lines with reference to the horizontal (or vertical) scanning direction.
U.S. Pat. No. 4,723,297 to Postl describes a method for automatic correction of character skew in the acquisition of a text original. The text original is in the form of digital scan results, and the method contains an "identification of skewed position" step and an "electronic rotation" step. The identification of skewed position can be performed by the search scan method or the search sweep method. The search scan method of skew determination involves summing and comparing the scan results between scanned search lines beginning with Y=0 on an XY plane. The search sweep method of skew determination utilizes Fourier transforms for summation of scanned regions. Either skew determination method establishes an angle, α, and electronic rotation of the scanned image by that angle provides correct image alignment.
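The general idea of summing pixels along inclined search lines and comparing adjacent sums can be sketched as follows. This is a generic projection-profile search written for illustration, not a reconstruction of the procedure claimed in the Postl patent; the function names, the list of candidate angles, and the squared-difference score are all assumptions of the sketch. Coordinates are image coordinates with Y increasing downward, so a positive angle means the text lines descend to the right.

```python
import math


def profile_sharpness(black_pixels, angle_deg):
    """Score a candidate skew angle (hypothetical helper).

    Each black pixel (x, y) is assigned to the inclined search line that
    passes through it; the score is the sum of squared differences
    between the pixel counts of adjacent search lines.
    """
    if not black_pixels:
        return 0
    slope = math.tan(math.radians(angle_deg))
    counts = {}
    for x, y in black_pixels:
        line = round(y - slope * x)   # intercept of the search line at x = 0
        counts[line] = counts.get(line, 0) + 1
    first, last = min(counts), max(counts)
    # Pad with one empty line on each side so the outer edges are scored too.
    profile = [counts.get(i, 0) for i in range(first - 1, last + 2)]
    return sum((b - a) ** 2 for a, b in zip(profile, profile[1:]))


def estimate_skew(black_pixels, candidate_angles):
    """Return the candidate angle whose profile is sharpest (assumed criterion)."""
    return max(candidate_angles,
               key=lambda angle: profile_sharpness(black_pixels, angle))


if __name__ == "__main__":
    # Synthetic page: three one-pixel-thick text lines skewed by 3 degrees.
    slope = math.tan(math.radians(3))
    pixels = [(x, round(row + slope * x))
              for row in (20, 40, 60) for x in range(200)]
    print(estimate_skew(pixels, range(-5, 6)))
```

When the candidate angle matches the true skew, each text line collapses onto a single search line and the profile alternates sharply between full and empty lines, which is why the squared-difference score peaks there.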
U.S. Pat. No. 4,926,490 to Mano describes a method and apparatus for recognizing characters on a document, even if the document is skewed or not aligned with the axis of a typical segmentation apparatus, such as a scanner. A plurality of rectangles are formed surrounding respective character images, and position data for each rectangle are stored in a first table, arranged in order from the left-most rectangle to the right-most rectangle in the X direction of the XY coordinates of the image buffer. The skew of the document is calculated by determining the rectangles in the first table that belong to one character row and calculating the positions of the bottom left corners of those rectangles. The vertical positions of the rectangles, compensated for the skew in the Y direction, are then calculated in order to transfer the position data of the rectangles belonging to the first character row to a second table. The image data surrounded by the rectangles specified by the position data in the second table are sequentially supplied to a character recognition unit.
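The use of bottom left corners of character rectangles to estimate skew can be illustrated with a short sketch. The reference summarized above does not state here how the skew is computed from the corner positions, so the least squares fit below is an assumption, as are the function names and the (left, bottom, right, top) tuple layout. The rectangles passed in are assumed to belong to a single character row already.

```python
def skew_from_rectangles(rects):
    """Estimate row skew as a slope (dy/dx) from bottom left corners.

    `rects` is a list of (left, bottom, right, top) tuples for one
    character row. A least squares line through the bottom left corners
    is an assumed choice, not taken from the cited reference.
    """
    ordered = sorted(rects, key=lambda r: r[0])   # left-most first
    if len(ordered) < 2:
        raise ValueError("need at least two rectangles")
    xs = [r[0] for r in ordered]
    ys = [r[1] for r in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("rectangles share one X position")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def compensate_rows(rects, slope):
    """Pair each rectangle with its bottom Y after removing the skew."""
    ordered = sorted(rects, key=lambda r: r[0])
    return [(r, r[1] - slope * r[0]) for r in ordered]


if __name__ == "__main__":
    row = [(20, 102, 28, 92), (0, 100, 8, 90),
           (30, 103, 38, 93), (10, 101, 18, 91)]
    slope = skew_from_rectangles(row)
    for rect, corrected_bottom in compensate_rows(row, slope):
        print(rect, round(corrected_bottom, 3))
```

In practice, characters with descenders would pull some bottom left corners below the base line, so a more robust fit than plain least squares might be preferred; the sketch ignores this.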