Many applications of scanned document data depend on the document content being properly aligned with the horizontal and vertical directions represented by the rows and columns of the digital image. For example, OCR (Optical Character Recognition) provides a computer-usable interpretation of scanned letters and words, and it is well known that OCR performance depends on both the image quality and the skew angle of the scanned document. Also, many governments and organizations around the globe are digitizing historical documents and providing them as digital documents on the Internet. A typical requirement of the digitization process, when combined with subsequent image processing, is that the documents appear without skew. Skew must therefore be corrected prior to performing character recognition on the document image.
Skew correction requires the determination of a skew angle and the modification of a document image representation based on that skew angle. With regard to skew angle determination, a first known method is based on the Hough Transform. In the Hough Transform, the digital image data that represents the document is transformed into a polar coordinate space. The maximum peak in the polar coordinate space is then identified, and the skew angle is obtained directly from the polar angle of that peak. The Hough Transform method has the disadvantage of requiring extensive computation time. In addition, this method is often not sensitive enough to determine the skew angle accurately.
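The Hough-style voting described above can be illustrated with a minimal sketch. It is not taken from any cited reference: the function name `hough_skew_angle`, the search range of ±10 degrees, the 0.5 degree step, and the unit-width distance bins are all assumptions chosen for illustration, and the input is assumed to be a list of (x, y) foreground pixel coordinates. The loop over every candidate angle and every pixel also shows where the computation time goes.

```python
import math
from collections import Counter


def hough_skew_angle(points, max_angle_deg=10.0, step_deg=0.5, rho_bin=1.0):
    """Return the candidate line angle (degrees) whose best accumulator cell
    collects the most votes. All parameters are illustrative assumptions."""
    steps = int(round(2 * max_angle_deg / step_deg))
    best_angle, best_votes = 0.0, -1
    for i in range(steps + 1):
        angle = -max_angle_deg + i * step_deg
        theta = math.radians(angle)
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        # Points lying on one line of direction theta share the same signed
        # distance rho = -x*sin(theta) + y*cos(theta), so they vote together.
        votes = Counter(round((-x * sin_t + y * cos_t) / rho_bin)
                        for x, y in points)
        peak = max(votes.values())
        if peak > best_votes:
            best_angle, best_votes = angle, peak
    return best_angle


# Synthetic example: three parallel "text lines" skewed by 3 degrees.
slope = math.tan(math.radians(3.0))
points = [(x, x * slope + offset)
          for offset in (0.0, 30.0, 60.0)
          for x in range(0, 200, 2)]
print(hough_skew_angle(points))  # 3.0
```

The angle here is measured in the same (x, y) coordinate system as the input points. With image rows that increase downward, the sign of the reported angle would be reversed.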
A second type of method is described in U.S. Pat. No. 5,001,766 entitled “Apparatus and Method for Skew Control of Document Images” to Baird. In this method, a two-dimensional Fourier transform of the original document image is computed, and the result is again projected to polar coordinates as in the Hough Transform. The maximum of the projected values gives the angle of skew. This method has been found to provide high accuracy, as fine as 2 minutes of arc, but again requires considerable processing time and resources.
More recent research involving skew detection has focused on the use of connected component analysis as in U.S. Pat. No. 7,336,813 entitled “System and Method of Determining Image Skew Using Connected Components” to Prakash et al. This approach depends on text and graphic image separation, which is not characteristic of all images, and can be computationally intensive. Still other methods are constrained for use only with binary image input (1 bit/pixel) as in U.S. Pat. No. 6,985,640 entitled “Parallel Non-Iterative Method of Determining and Correcting Image Skew” to Schweid and in U.S. Pat. No. 7,142,727 entitled “Non-Iterative Method of Calculating Image Skew” to Notovitz et al.
Since the work disclosed in the Baird '766 patent, there appears to have been little interest in utilizing Fourier-type transforms to address the problem of skew angle detection. One exception is the work of G. Peake and T. Tan in “A General Algorithm for Document Skew Angle Estimation,” Proc. International Conference on Image Processing (ICIP '97), Volume 2, p. 230 (1997). In the Peake and Tan approach, a method is provided for calculating the skew angle of scanned document images. The method is designed to be insensitive to document layout, line spacing, font, graphics/images and, most importantly, the language or script of the document. This is achieved by examining the Fourier spectra of blocks of the document image for peak pairs corresponding to the angle of skew. From a histogram compiled over all blocks in the document image, the correct skew angle can be determined to within approximately 0.5 degrees, regardless of document script, even when the image contains considerable graphical information.
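The general idea of block-wise spectral voting can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm published by Peake and Tan: the function names `block_peak_angle` and `blockwise_skew`, the use of a single strongest non-DC peak per block, the 1 degree histogram bins, and the naive discrete Fourier transform (usable only for very small blocks) are all hypothetical choices. A practical version would presumably use a fast transform and discard blank or low-energy blocks before voting.

```python
import cmath
import math
from collections import Counter


def block_peak_angle(block):
    """Stripe angle (degrees) implied by the strongest non-DC DFT peak
    of a square block given as a list of rows."""
    n = len(block)
    # Twiddle factors exp(-2*pi*i*k/n), looked up modulo n.
    w = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    best_mag, best_uv = -1.0, (0, 1)
    # Peaks of a real image occur in symmetric pairs, so scanning the
    # half-plane v = 0..n/2 finds one member of each pair.
    for v in range(n // 2 + 1):
        for u in range(n):
            if u == 0 and v == 0:
                continue  # skip the DC term
            coeff = sum(block[y][x] * w[(u * x + v * y) % n]
                        for y in range(n) for x in range(n))
            if abs(coeff) > best_mag:
                best_mag, best_uv = abs(coeff), (u, v)
    u, v = best_uv
    su = u - n if u > n // 2 else u  # signed horizontal frequency
    # Stripes (text lines) run perpendicular to the frequency vector (su, v).
    return math.degrees(math.atan2(-su, v))


def blockwise_skew(image, block_size, bin_deg=1.0):
    """Histogram the per-block angles; return the centre of the fullest bin."""
    rows, cols = len(image), len(image[0])
    votes = Counter()
    for y0 in range(0, rows - block_size + 1, block_size):
        for x0 in range(0, cols - block_size + 1, block_size):
            block = [row[x0:x0 + block_size]
                     for row in image[y0:y0 + block_size]]
            votes[round(block_peak_angle(block) / bin_deg)] += 1
    return votes.most_common(1)[0][0] * bin_deg


# Synthetic 32x32 image of stripes with slope 1/4 (about 14.04 degrees).
image = [[math.cos(2 * math.pi * (-x + 4 * y) / 16) for x in range(32)]
         for y in range(32)]
print(blockwise_skew(image, 16))  # 14.0
```

Each block contributes one vote, so a few blocks dominated by graphics can be outvoted by the blocks that contain text. The cost is one transform per block.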
One problem with the algorithm described in the Peake and Tan article is that for each block of data examined, a Fourier Transform must be computed and peaks must be identified and managed thereafter. In practice, this calculation sequence proves to be computationally prohibitive. For example, for a 4096×4096 pixel input image with a 256×256 pixel block size, 256 individual Fourier transforms must be calculated.
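The block count in this example follows from simple tiling arithmetic, sketched below. The helper name `block_transform_count` is hypothetical, and the sketch assumes square, non-overlapping blocks that tile the image exactly, which is consistent with the figure of 256 given above.

```python
def block_transform_count(image_size, block_size):
    """Number of non-overlapping square blocks, and hence per-block
    transforms, needed to tile a square image of the given side length."""
    if image_size % block_size != 0:
        raise ValueError("block_size must divide image_size exactly")
    per_side = image_size // block_size
    return per_side * per_side


print(block_transform_count(4096, 256))  # 256 (16 blocks per side, 16 * 16)
```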
Thus, there is a need to address the problem of skew angle detection with a method that provides improved accuracy over conventional techniques and that is computationally efficient.