With the introduction of the Internet, digital content services are becoming increasingly popular. Companies digitize documents, books, official records, and so on, and make them available to subscribing customers. Thus, scanning and digitizing documents has become an important industry.
Several problems exist with digital documents. For example, digitized documents are sometimes difficult to read. Reasons for this include poor source documents (e.g., the documents may be too dark, too light (faded), old and yellowed, fragile, or deteriorating) and poor scanning equipment or processes. Persons with less-than-ideal eyesight often struggle to read digitized documents, and eye strain, even in people with excellent vision, may reduce the effectiveness of reading digitized documents. Many digitized documents are also difficult for automated processes, such as optical character recognition (OCR), to read or interpret. Further, digitized documents are often large in size and require substantial resources to store and significant time to download.
Others have attempted to solve these problems in several ways. For example, some have used “thresholding” to improve the quality of digitized images. Thresholding is used to convert a gray-scale or color document to a black and white (bitonal) image. The resulting image has very high contrast and is highly compressible. Choosing an appropriate threshold that results in a readable document, however, is an exceptionally difficult problem. Many techniques have been suggested. Some of these techniques are adaptive; that is, they analyze each area of the document independently to determine an appropriate threshold for that area. None of these techniques, however, is reliable enough (especially with handwritten documents) to provide significant confidence that the original document data will not be lost or significantly degraded. Also, adaptive techniques are often computationally expensive, prohibiting their use in real-time or semi-real-time situations.
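By way of illustration only, the difference between a single global threshold and an adaptive threshold may be sketched as follows. The sketch is not drawn from any particular prior technique: the function names, the tiny 2x4 sample page, the fixed cutoff of 128, and the use of a per-tile mean as the local threshold are all assumptions chosen for brevity.

```python
def global_threshold(image, threshold=128):
    """Map each gray value (0-255) to 0 (black) or 255 (white) using one cutoff."""
    return [[255 if px >= threshold else 0 for px in row] for row in image]


def block_mean_threshold(image, block=2):
    """Adaptive variant (assumed for illustration): threshold each
    block x block tile at the mean gray value of that tile."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for top in range(0, height, block):
        for left in range(0, width, block):
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            tile = [image[r][c] for r in rows for c in cols]
            cutoff = sum(tile) / len(tile)
            for r in rows:
                for c in cols:
                    out[r][c] = 255 if image[r][c] >= cutoff else 0
    return out


# Hypothetical 2x4 page: the left half is well lit, the right half is in shadow.
# Each half holds one "ink" pixel that is darker than its surroundings.
page = [
    [200, 60, 100, 100],
    [200, 200, 100, 40],
]

global_result = global_threshold(page)
adaptive_result = block_mean_threshold(page, block=2)

print(global_result)    # the shadowed half collapses to solid black
print(adaptive_result)  # one ink pixel survives in each half
```

In this sample, the global cutoff turns the entire shadowed half black, so its ink pixel is lost, whereas the per-tile cutoff keeps the ink pixel in each half distinct from its background. The per-tile version also visits every tile separately, which hints at the added computational cost of adaptive approaches.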
Others have attempted to use “leveling,” which involves choosing the black point and white point (and sometimes the mid-tone) of an image and then interpolating the values of the image based on those points. Auto-leveling analyzes the image to determine the black point and white point automatically, typically based on the histogram of the image (noting where a significant number of values start and end in the histogram). The problem with both leveling and auto-leveling is that some parts of the image may be darker or lighter than others, so global image leveling may improve some parts of the image without improving others. Also, because the black point and white point are chosen from the value range of the entire image, the effect of the leveling may be weakened, and the resulting image may not have as much contrast as is desired.
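A minimal sketch of leveling and auto-leveling, under stated assumptions, is given below. The linear interpolation, the optional `clip_fraction` parameter (the share of pixels skipped at each end of the histogram), the function names, and the sample page are hypothetical choices made for illustration, not a description of any specific prior implementation.

```python
def level(image, black_point, white_point):
    """Linearly stretch gray values so black_point maps to 0 and white_point to 255."""
    span = max(white_point - black_point, 1)  # guard against division by zero
    return [
        [min(255, max(0, round((px - black_point) * 255 / span))) for px in row]
        for row in image
    ]


def auto_level_points(image, clip_fraction=0.0):
    """Choose black and white points from the histogram of the whole image,
    skipping clip_fraction of the pixels at each end."""
    histogram = [0] * 256
    for row in image:
        for px in row:
            histogram[px] += 1
    skip = int(sum(histogram) * clip_fraction)

    seen, black_point = 0, 0
    for value in range(256):  # walk up from the dark end
        seen += histogram[value]
        if seen > skip:
            black_point = value
            break

    seen, white_point = 0, 255
    for value in range(255, -1, -1):  # walk down from the light end
        seen += histogram[value]
        if seen > skip:
            white_point = value
            break
    return black_point, white_point


# Hypothetical 2x4 page: left half well lit, right half in shadow.
page = [
    [200, 60, 100, 100],
    [200, 200, 100, 40],
]

black_point, white_point = auto_level_points(page)
leveled = level(page, black_point, white_point)
print(black_point, white_point)
print(leveled)
```

Because the two points are taken from the whole image, the shadowed background (gray value 100) in this sample stays dark after leveling, which illustrates the limitation described above: one global mapping cannot brighten the shadowed half without also affecting the well-lit half.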
Still others have used brightness and/or contrast adjustments. Adjusting the brightness or contrast of a document, whether manually or automatically, has disadvantages similar to those of leveling: the adjustment is typically applied globally across an entire image and therefore may not give the results required in any particular region of the image.
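The global nature of such an adjustment may be sketched as follows. The formula used here (a contrast gain about mid-gray 128 plus a brightness offset), the function name `adjust`, and the sample page are assumptions for illustration; actual tools may define brightness and contrast differently.

```python
def adjust(image, brightness=0, contrast=1.0):
    """Apply the same brightness offset and contrast gain to every pixel."""
    def clamp(value):
        return min(255, max(0, round(value)))

    return [
        [clamp((px - 128) * contrast + 128 + brightness) for px in row]
        for row in image
    ]


# Hypothetical 2x4 page: left half well lit, right half in shadow.
page = [
    [200, 60, 100, 100],
    [200, 200, 100, 40],
]

# Brighten enough to lift the shadowed half toward a light background.
brightened = adjust(page, brightness=100)
print(brightened)
```

In this sample, the offset that lifts the shadowed background from 100 to 200 also pushes the well-lit background past the top of the range, where it is clipped at 255, so the gap between ink and background in the well-lit half shrinks from 140 to 95 gray levels. A single global setting thus trades one region against another.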
For at least the foregoing reasons, improved systems and methods are needed for enhancing digitized images.