1. Field of the Invention
The present invention relates to image processing. More specifically, the present invention relates to a method and apparatus for facilitating the removal of noise from a digital image.
2. Related Art
As businesses and other organizations become more computerized, it is becoming increasingly common to store and maintain electronic versions of paper documents on computer systems. The process of storing a paper document on a computer system typically involves a “document-imaging” process, which converts a copy of the paper document into an electronic document. This document-imaging process typically begins with an imaging step, wherein document page-images are generated using a scanner, a copier, a camera, or any other imaging device. These page-images are typically analyzed and enhanced using an image-processing program before being assembled into a document container, such as a Portable Document Format (PDF) file.
Often, applications need to recognize text from the scanned page-images to facilitate subsequent document-processing operations. This is typically accomplished through an optical character recognition (OCR) process.
Unfortunately, the performance of the OCR process is often significantly degraded by the presence of noise in scanned images. Many types of noise and noise-like artifacts arise from the printing and imaging processes. Examples include quantization noise from the imaging light sensors, dirt on imaging device optics, ink spatters, and toner smudges.
Because of this problem, noise-removal operations are commonly applied to images prior to the OCR process. For example, a common noise-removal operation removes all blobs that are smaller than a threshold number of pixels. However, this may cause small characters, such as a period, to be removed, or may cause a particularly large noise artifact to be retained. A fixed threshold value is rarely optimal for all character sizes. Consequently, either too much noise is left behind during the noise-removal process, or portions of a scanned image are improperly removed.
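The fixed-threshold operation described above can be illustrated with a minimal sketch. This is not the method of the present invention; it shows only the conventional approach and its two failure modes. The function name remove_small_blobs, the parameter min_pixels, the use of 4-connectivity, and the sample page are all assumptions made for illustration.

```python
from collections import deque

def remove_small_blobs(image, min_pixels):
    """Return a copy of a binary image (lists of 0/1) in which every
    4-connected blob with fewer than min_pixels pixels is erased.
    Hypothetical helper; 4-connectivity is an assumption."""
    rows, cols = len(image), len(image[0])
    result = [row[:] for row in image]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill collects the pixels of one blob.
            blob = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # The single fixed threshold decides the fate of the whole blob.
            if len(blob) < min_pixels:
                for y, x in blob:
                    result[y][x] = 0
    return result

# Illustrative page: an 8-pixel character stroke (columns 0-1),
# a 1-pixel speck at (0, 4), a 1-pixel period at (3, 3),
# and a 4-pixel toner smudge (rows 2-3, columns 6-7).
PAGE = [
    [1, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 1, 1],
    [1, 1, 0, 1, 0, 0, 1, 1],
]

cleaned = remove_small_blobs(PAGE, min_pixels=3)
```

With a threshold of 3 pixels in this sketch, the speck is removed as intended, but the period is removed along with it and the larger smudge is retained, which mirrors the shortcoming of a fixed threshold noted above.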
Hence, what is needed is a method and apparatus for removing noise from an image without the above-mentioned problems.