Documents are frequently digitized using a digital scanner, such as a flatbed scanner. The scanned documents can be printed or stored for later viewing, or can be processed with an optical character recognition method to extract textual information. Good quality results can typically be obtained for original documents that are flat, but problems can occur for pages that do not lie flat on the scanner platen. For example, if a page from a book or a magazine is scanned, the page will generally be curved near the bound edge. The curvature of the document page can result in a geometric distortion of the scanned image, where image content that should have been horizontal (e.g., lines of text) may be reproduced as curved lines.
Digital cameras are increasingly being used to digitize documents. For example, a user may capture an image of a document (e.g., a page of a book) using an application on a camera phone. Geometric distortions caused by curvature of the original document can be quite severe in such cases because the positions of the document and the camera are less constrained than they are with a scanner. These geometric distortions will frequently cause horizontal features (e.g., lines of text) in the original document to be reproduced as warped curves. Therefore, there is a need for image rectification methods that can be used to process digital images containing warped textual lines.
Liang et al., in an article entitled “Flattening curved documents in images” (Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2, pp. 338-345, 2005), describe a method for correcting distortion in a document image, including page warping. The method involves modeling the page surface as a developable surface and exploits the parallelism and equal line spacing properties of printed textual content. Local texture flow directions are determined by dividing the image into small blocks and performing projection profile analysis on each block. The method is computationally complex and requires a relatively dense distribution of textual features to enable the determination of texture flow directions throughout the document.
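By way of illustration only, projection profile analysis of a single image block can be sketched as follows. This is a minimal sketch and not the implementation of Liang et al.: the function names, the exhaustive search over candidate angles, and the sum-of-squares sharpness measure are all assumptions introduced here. Foreground (ink) pixels are projected onto the normal of each candidate text direction, and the direction giving the most sharply peaked histogram is taken as the local text direction.

```python
import math

def projection_profile(points, angle_deg, bin_size=1.0):
    """Histogram of foreground pixels projected onto the normal of a
    candidate text direction (angle measured from the x-axis toward the y-axis)."""
    theta = math.radians(angle_deg)
    # Signed distance of each pixel from a line through the origin at angle theta.
    offsets = [y * math.cos(theta) - x * math.sin(theta) for x, y in points]
    lo = min(offsets)
    n_bins = int((max(offsets) - lo) / bin_size) + 1
    hist = [0] * n_bins
    for d in offsets:
        hist[int((d - lo) / bin_size)] += 1
    return hist

def profile_sharpness(hist):
    """Assumed sharpness measure: the sum of squared bin counts, which is
    largest when the pixels concentrate in a few bins (aligned text lines)."""
    return sum(count * count for count in hist)

def estimate_text_angle(points, candidate_angles):
    """Return the candidate angle whose projection profile is sharpest."""
    return max(
        candidate_angles,
        key=lambda a: profile_sharpness(projection_profile(points, a)),
    )
```

For a synthetic block containing three parallel pixel runs slanted at 10 degrees, calling `estimate_text_angle` with candidate angles from -20 to 20 degrees in 5 degree steps selects 10 degrees. Because every block must contain enough text for its profile to show distinct peaks, a sketch of this kind also makes clear why such an approach needs a relatively dense distribution of textual features.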
Shafait et al., in an article entitled “Document image dewarping contest” (2nd International Workshop on Camera-Based Document Analysis and Recognition, pp. 181-188, 2007), compare a number of different methods for dewarping a document image. A first method involves constructing an outer skeleton for text regions using Bezier curves. An image deformation is determined to warp the image based on the determined Bezier curves. A second method involves detecting words, and linking consecutive words to define text lines. Upper and lower baselines are calculated for each word, and transformation factors are determined to rotate and shift the words accordingly. A third method uses a coordinate transform model and document rectification process for book dewarping. The assumption is made that the book surface is a cylinder and a transformation function is formed based on straight lines representing the left and right boundaries of the page and curved lines representing the top and bottom boundaries of the page.
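A boundary-based transformation of the general kind used in the third method can be sketched as follows. This is a minimal illustration under stated assumptions and does not reproduce any contest entry: the top and bottom page boundaries are assumed to be given as quadratic Bezier curves (three control points each), the helper names are hypothetical, and the curve parameter is used directly as the horizontal page coordinate without any arc-length correction. Each point (u, v) of the flat output page, with u and v in the range 0 to 1, is mapped to a location in the warped image by interpolating along a straight line between the two boundary curves, so that the left and right page boundaries (u = 0 and u = 1) are straight lines.

```python
def lerp(p, q, t):
    """Linear interpolation between 2-D points p and q."""
    return (p[0] + (q[0] - p[0]) * t, p[1] + (q[1] - p[1]) * t)

def quadratic_bezier(p0, p1, p2, t):
    """Evaluate a quadratic Bezier curve at parameter t (de Casteljau)."""
    return lerp(lerp(p0, p1, t), lerp(p1, p2, t), t)

def source_point(top, bottom, u, v):
    """Map flat-page coordinates (u, v) in [0, 1] x [0, 1] to a location in
    the warped image. `top` and `bottom` are 3-tuples of control points
    describing the curved page boundaries (an assumed representation)."""
    return lerp(quadratic_bezier(*top, u), quadratic_bezier(*bottom, u), v)

def dewarp_grid(top, bottom, cols, rows):
    """Sampling locations in the warped image for a rows x cols output page
    (cols and rows must each be at least 2)."""
    return [
        [source_point(top, bottom, c / (cols - 1), r / (rows - 1)) for c in range(cols)]
        for r in range(rows)
    ]
```

A rectified image would then be formed by sampling the warped image at each location returned by `dewarp_grid`. A fixed model of this kind only corrects deformations that the assumed surface shape can describe.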
Gatos et al., in an article entitled “Segmentation based recovery of arbitrarily warped document images” (Proc. Int. Conf. on Document Analysis and Recognition, pp. 989-993, 2007), disclose a segmentation-based method for dewarping document images. A horizontal smoothing operation is performed based on a determined average character height. Words are then identified by detecting connected components. Upper and lower boundaries of the identified words are then determined and used to rotate and translate the words to form a de-warped image. The method relies on accurate determination of the orientation of the first word on each text line, which guides the alignment of the entire text line.
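The first two steps of such a segmentation-based approach, horizontal smoothing followed by connected component detection, can be sketched as follows. This is a simplified illustration and not the implementation of Gatos et al.: the image is assumed to be a binary array given as a list of rows (1 for ink, 0 for background), the function names are hypothetical, and the `max_gap` parameter is an assumption standing in for a threshold that would in practice be derived from the average character height.

```python
from collections import deque

def smear_rows(img, max_gap):
    """Fill horizontal background runs of at most max_gap pixels that lie
    between two foreground pixels, merging the characters of a word."""
    out = [row[:] for row in img]
    for row in out:
        last = None  # column of the most recent foreground pixel in this row
        for x, value in enumerate(row):
            if value:
                if last is not None and 0 < x - last - 1 <= max_gap:
                    for k in range(last + 1, x):
                        row[k] = 1
                last = x
    return out

def word_boxes(img, max_gap):
    """Bounding boxes (x0, y0, x1, y1) of the 4-connected components of the
    smeared image; each box is treated as one candidate word."""
    smeared = smear_rows(img, max_gap)
    h, w = len(smeared), len(smeared[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not smeared[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:  # breadth-first flood fill of one component
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and smeared[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes
```

The upper and lower boundaries of each resulting box would then be used to estimate the rotation and translation applied to the corresponding word.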
Tian et al., in an article entitled “Rectification and 3D reconstruction of curved document images” (Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 377-384, 2011), describe a method for rectifying images of curved documents. The method involves tracing text lines using a self-similarity measure. Text orientation is estimated using local stroke statistics. Two-dimensional warping is used to make the text lines horizontal and the text orientation vertical. The process of tracing the text lines is computationally intensive and is sensitive to the size of the search neighborhood. The method is also not adapted to handle extended regions that do not contain text lines.
U.S. Patent Application Publication 2010/0073735 to Hunt et al., entitled “Camera-based document imaging,” describes a method to extract textual information from a warped document image. The method includes detecting typographical features indicating the orientation of text, and fitting curves to the text lines. A grid of quadrilaterals is constructed using vectors that are parallel to the text lines and vectors that are parallel to the direction of the vertical stroke lines. The document is dewarped by stretching the image so that the vectors become orthogonal, and the dewarped document is processed using optical character recognition. The method relies on the accurate identification of each text line.
In general, methods that use a physical deformation model to rectify a deformed document image lack the flexibility to handle the various deformations that arise in different situations. Most of the methods that estimate the deformation directly from the deformed textual information rely heavily on the accurate identification of long text lines, which limits their application to documents of different types that may contain large areas without long text lines. There remains a need for a reliable and efficient method to rectify images of documents that have a wide variety of deformations and that may or may not include long warped textual lines.