More and more documents are stored in image, or pixel, format instead of as ASCII code because storage media, such as CD-ROMs, are becoming less expensive. These imaged documents can be used for reference, searching, or distribution. The stored image of a document is usually captured by an input device such as a scanner or a digital camera. However, image distortion is a problem when the document content in the image is captured by a scanner or, even worse, by a digital camera.
FIG. 1A is a block diagram depicting typical components of a scanner. A scanner is typically used to capture an image of a document 110. The document 110 is placed on the scanner plate 112. A scan head 120, which generally comprises an optical subsystem 122 and a charge-coupled device (“CCD”) 124, is moved across the document 110. Although FIG. 1A depicts only a two-dimensional view, the scan head 120 may move across the document both in the direction illustrated by arrow 114 and in a direction orthogonal to arrow 114. The optical subsystem 122 focuses light reflected from the document 110 onto the CCD 124. The CCD 124 is often implemented as a two-dimensional array of photosensitive capacitive elements. When light is incident on the photosensitive elements of the CCD 124, charge is trapped in a depletion region of the semiconductor elements. The amount of charge associated with each photosensitive capacitive element is related to the intensity of the light incident on that element over a sampling period. Accordingly, the image is captured by sampling the elements to determine the intensity of incident light at the respective photosensitive capacitive elements. The analog information produced by the photosensitive capacitive elements is converted to digital information by an analog-to-digital (A/D) converter 130. The A/D converter 130 may convert the analog information received from the CCD 124 in either a serial or a parallel manner. The converted digital information may be stored in memory 140. The digital information is then processed by a processor 150 according to control software stored in ROM 180. The user may control scanning parameters via user interface 170, and the scanned image is output through output port 160.
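The sampling and A/D conversion described above can be illustrated with a minimal sketch. It assumes an idealized sensor in which accumulated charge equals light intensity multiplied by the sampling period, and an A/D converter that clips to a full-scale value before mapping to integer codes. The function names `expose` and `quantize` and the parameters `full_well` and `bits` are hypothetical and are used for illustration only; they do not describe any particular scanner.

```python
def expose(intensity, sampling_period):
    """Idealized assumption: charge per element = intensity * sampling period."""
    return [[value * sampling_period for value in row] for row in intensity]


def quantize(charges, full_well=1.0, bits=8):
    """Map analog charge readings to integer codes (hypothetical A/D model).

    Readings are clipped to the range [0, full_well] and scaled to the
    range [0, 2**bits - 1], rounding to the nearest code.
    """
    max_code = (1 << bits) - 1
    codes = []
    for row in charges:
        code_row = []
        for charge in row:
            clipped = min(max(charge, 0.0), full_well)
            code_row.append(int(clipped / full_well * max_code + 0.5))
        codes.append(code_row)
    return codes


# A 2x2 array of light intensities sampled for one unit of time.
image = quantize(expose([[0.0, 0.2], [1.0, 2.0]], sampling_period=1.0))
```

In this sketch, an element that receives more light than the full-scale value saturates at the maximum code, which is one reason the controlled lighting of a scanner simplifies capture.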
A block diagram of a digital camera is depicted in FIG. 1B. An optical subsystem 122 of a digital camera may be used to focus light reflected from a document 110 onto a CCD 124, much as in the scanner. In other digital cameras, devices other than a CCD, such as CMOS sensors, are used to capture the light reflected from the image. In a digital camera, unlike in a scanner, the optical subsystem 122 is not moved along the surface of the document. Rather, the optical subsystem 122 is generally stationary with respect to the object, such as a document, to be imaged. In addition to images from digital cameras, photographs captured by film-based cameras may also be digitized.
Cameras offer significant advantages over scanners for capturing document images and other images. For example, cameras are generally more portable than scanners. In addition, because scanners require the object to be imaged to be placed on the scanner plate, cameras are capable of capturing a wider array of images than scanners. However, the use of cameras creates difficulties in image capturing that do not exist when using a scanner. For example, light conditions vary when using a camera, whereas the light conditions are generally controlled in scanners. In addition, use of a camera introduces image distortions, which may depend on variables such as the angle of the camera relative to the image, the lens used by the camera and its distance from the image, whether the imaged document is situated on a flat or a curved surface, and other factors. Because the scanner utilizes a moving scan head at a fixed distance from the document to be imaged, these distortions do not generally occur in scanners.
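The dependence of distortion on camera angle and distance can be illustrated with a minimal sketch under an idealized pinhole camera model, which is an assumption made here for illustration and is not taken from the cited work. A flat page is tilted about its horizontal axis by a hypothetical `tilt_deg`, and its corners are projected onto the image plane; the names `project_page`, `edge_width`, and `PAGE` are likewise hypothetical.

```python
import math


def project_page(corners, tilt_deg, distance, focal=1.0):
    """Project flat-page corners through an idealized pinhole camera.

    The page lies in the z = 0 plane, centered on the optical axis, and is
    tilted about its horizontal (x) axis by tilt_deg. The camera sits at
    the given distance from the page center.
    """
    tilt = math.radians(tilt_deg)
    projected = []
    for x, y in corners:
        depth = distance + y * math.sin(tilt)  # tilted points move nearer or farther
        projected.append((focal * x / depth, focal * y * math.cos(tilt) / depth))
    return projected


def edge_width(p, q):
    """Horizontal extent of the edge between two projected corners."""
    return abs(q[0] - p[0])


# Corners in order: bottom-left, bottom-right, top-right, top-left.
PAGE = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]

flat = project_page(PAGE, tilt_deg=0, distance=5.0)
tilted = project_page(PAGE, tilt_deg=30, distance=5.0)
```

With no tilt, the top and bottom edges of the projected page have equal width, so the page remains rectangular. With a tilt, the farther edge projects narrower than the nearer edge, so the page appears as a trapezoid. A scan head held at a fixed distance from the document corresponds to the zero-tilt case.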
Much research has been done on solving the problem of image distortion. Brown and Seales proposed a general de-skewing algorithm for arbitrarily warped documents based on 3D images. (“Image Restoration of Arbitrarily Warped Documents,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 26, No. 10 (2004).) Zhang, et al. developed a depth-from-shading algorithm to process document images captured by a flatbed scanner. (“Restoration of Curved Document Images Through 3D Shape Modeling,” Proc. of the 6th International Conference on Document Analysis and Recognition, pp. 10-15 (2004).) However, this technique is highly dependent on the lighting conditions and, therefore, is not suitable for images captured with a digital camera.
Recognizing that digital cameras are more convenient input devices than scanners, researchers have developed models to reduce distortion in images captured by digital cameras. For example, Cao, et al. developed a parametric model to estimate the cylinder shape of an opened book. (“Rectifying the Bound Document Image Captured by the Camera: A Model Based Approach,” Proc. of the International Conference on Document Analysis and Recognition, pp. 71-75 (2003).) A major limitation of this technique is that the model works only when the lens plane of the camera lens is parallel to the surface of the imaged book. Liang, et al. used a developable surface to model the page surface of a book and exploited the properties (parallelism and equal line spacing) of the printed textual content on the page to recover the surface shape. (“Flattening Curved Documents in Images,” International Conference on Computer Vision and Pattern Recognition, pp. 338-345 (June 2005).) With this technique, the lens plane of the camera lens is no longer required to be parallel to the surface of the book. However, the models used by both Cao and Liang to correct the distortion in an imaged document are based on text line information. In other words, these models are highly dependent on the existence of text lines in the imaged book. If a page of a book has many pictures or equations instead of text lines, the Cao and Liang models will not work well.
Therefore, a need continues to exist for an improved apparatus and method for capturing images of documents that utilizes the advantages of cameras over scanners, yet reduces the distortion that arises when document images are captured by a camera rather than a scanner. Preferably, the apparatus and method should be capable of reducing distortion in a captured image independent of whether text lines are present on the imaged document, thereby allowing for the correction of distortion in a captured image of a document with pictures and equations. In addition, the apparatus and method preferably should not be restricted to images that are generated when the lens plane of a camera lens is parallel to the surface of a book.