The invention relates generally to a system and method for processing scanned documents to identify regions that may be processed in different manners, and more particularly, to a method, system, and article of manufacture for processing scanned documents using zero-crossing region filtering in order to identify background and foreground areas of a scanned image for separate processing.
Scanning documents to generate images that represent the contents of these documents is becoming increasingly common. Processing these images to extract useful information and data that may be manipulated using any number of application programs continues to be an area in need of advancement if this type of processing is to become more prevalent. The processing of image data generated through the scanning of documents encounters several different types of challenges. First, the sheer size of the image data requires significant amounts of data storage to maintain the data. Second, the size of the data implies that significant computational resources are required to process any given image. Finally, the complexity of images containing multiple types of data increases the likelihood that the data identification and extraction process may require processing in addition to simple character and vector graphics recognition.
Current data identification and data extraction processes work on images that are known to contain a single type of data. For example, a scanned image containing type-written text may be processed by an optical character recognition application to generate a text file that may be edited. Similarly, graphics data within scanned images that represents vectored graphics may be processed to generate usable data. When, however, these types of data are combined, or when these types of data are superimposed upon complex bit-mapped graphical data such as digital photographs, these applications are not nearly as successful at extracting the desired data.
Similarly, large data files, such as ones generated when images are scanned, may be compressed using a large number of compression processes. Each of these compression processes possesses different characteristics regarding the amount of data compression achieved when it is applied to various types of data, and each also possesses different characteristics regarding the degree to which the original data may be reconstructed from the compressed data. These facts give rise to the use of different compression algorithms to compress different types of data depending upon whether one needs to maximize compression or to minimize any differences between the original data and the data reconstructed from its compressed version.
Most scanned documents and images, however, are constructed using some image elements that may be compressed in a manner that maximizes compression of the data and other image elements that should be reconstructed from the compressed data as accurately as possible. This fact is best understood by realizing that most images can be considered to be made up of background elements and of elements that may be more important, such as foreground elements. Background elements may be compressed in a manner that maximizes data compression, as these elements are not characterized as the most important set of elements in the image. The more important elements may, at the same time, be characterized as foreground elements to allow this presumably smaller number of elements to be compressed more accurately, at the cost of requiring additional data to represent this foreground data. When text is present within an image, the text-related data may need to be separated from the other data in order to permit an optical character recognition (OCR) process to recognize the text from the scanned data. In this situation, the text-related image elements correspond to foreground data and the non-text image elements correspond to background data.
At present, scanning systems do not possess processes for identifying elements that correspond to both foreground image elements and background image elements within large classes of complex image data. For the reasons discussed above, such a process is useful in a large class of image processing applications such as OCR processing and efficient data compression.
The present invention relates to a method, system, and article of manufacture for processing scanned documents using zero-crossing region filtering in order to identify background and foreground areas of a scanned image for separate processing.
A system in accordance with the principles of the present invention includes a computing system for identifying image elements from pixels within a digital input image to permit the input image to be separated into two or more images. The computing system has an image memory block for storing digital images, a filtering module for filtering the digital input image to generate a filtered image, a contrast module for computing a local contrast value for each pixel within the filtered image to generate a local contrast image, a zero crossing module for generating a zero-crossing image using the filtered image and the local contrast image, and a connected component module for identifying regions of connected component pixels, the regions being formed from contiguous pixels having filtered values greater than zero, filtered values less than zero, and filtered values equal to zero.
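By way of illustration only, the grouping performed by the connected component module may be sketched as follows. The sketch assumes that the filtered image is available as a list of rows of signed numbers and that contiguity means 4-connectivity; the function names `sign` and `label_sign_regions` are hypothetical and do not appear in the specification, which does not prescribe a particular labeling algorithm.

```python
from collections import deque


def sign(value):
    """Return +1, -1, or 0 according to the sign of a filtered pixel value."""
    return (value > 0) - (value < 0)


def label_sign_regions(filtered):
    """Group contiguous pixels whose filtered values share the same sign.

    Assumption: 4-connectivity (up, down, left, right). Returns a label
    image and a list giving the sign of each labeled region.
    """
    rows, cols = len(filtered), len(filtered[0])
    labels = [[-1] * cols for _ in range(rows)]
    region_signs = []
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # pixel already belongs to a region
            s = sign(filtered[r][c])
            label = len(region_signs)
            region_signs.append(s)
            labels[r][c] = label
            queue = deque([(r, c)])
            # Breadth-first flood fill over neighbors with the same sign.
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == -1
                            and sign(filtered[ny][nx]) == s):
                        labels[ny][nx] = label
                        queue.append((ny, nx))
    return labels, region_signs


# Hypothetical 3x3 filtered image used only to exercise the sketch.
example = [[2, 3, 0],
           [-1, 0, 0],
           [-4, -2, 5]]
example_labels, example_signs = label_sign_regions(example)
```

In this sketch each region carries one of three signs, matching the three classes of filtered values named above; how the regions are subsequently classified is left open here.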
Other embodiments of a system in accordance with the principles of the invention may include alternative or optional additional aspects. One such aspect of the present invention is a method and computer data product encoding instructions for identifying image elements from pixels within a digital input image to permit the input image to be separated into two or more images. The method filters the input image to generate a filtered image; thresholds the filtered image at zero to generate a zero crossing image; generates a local contrast image of the filtered image; generates a local contrast image mask using a pre-determined threshold value, the local contrast image mask having pixel values equal to 1 if the pixel values within the local contrast image are greater than the pre-determined threshold and pixel values equal to 0 if the pixel values within the local contrast image are less than the pre-determined threshold; generates a processed zero-crossing image corresponding to the zero crossing image having filtered pixel values of a large size using the filtered image and the local contrast image mask; identifies connected component regions from contiguous pixels having filtered values greater than zero, filtered values less than zero, and filtered values equal to zero; and classifies the connected component regions as corresponding to foreground image elements and background image elements. The zero crossing image has filtered values greater than zero, filtered values less than zero, and filtered values equal to zero. The value of each pixel in the local contrast image is calculated by determining the maximum absolute value of the difference between a pixel in the filtered image and one or more of its neighboring pixel values.
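By way of illustration only, the local contrast, mask, and processed zero-crossing steps of the method may be sketched as follows. Several points are assumptions made for the sketch and are not stated in the summary: the neighborhood is taken to be the eight adjacent pixels, a contrast value exactly equal to the threshold is mapped to 0, and the processed zero-crossing image is formed by keeping the sign of the filtered value where the mask is 1 and writing 0 elsewhere. The function names are hypothetical.

```python
def local_contrast(filtered):
    """Maximum absolute difference between each pixel and its neighbors.

    Assumption: the neighbors are the (up to) eight adjacent pixels.
    """
    rows, cols = len(filtered), len(filtered[0])
    contrast = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                        diff = abs(filtered[r][c] - filtered[nr][nc])
                        best = max(best, diff)
            contrast[r][c] = best
    return contrast


def contrast_mask(contrast, threshold):
    """1 where local contrast exceeds the threshold, otherwise 0.

    Assumption: a value equal to the threshold maps to 0.
    """
    return [[1 if v > threshold else 0 for v in row] for row in contrast]


def processed_zero_crossing(filtered, mask):
    """Sign of the filtered value (+1, -1, 0), zeroed where the mask is 0.

    Assumption: this is one plausible way to combine the zero crossing
    image with the local contrast image mask.
    """
    return [[((v > 0) - (v < 0)) * m for v, m in zip(f_row, m_row)]
            for f_row, m_row in zip(filtered, mask)]


# Hypothetical filtered image: a flat area next to a strong edge.
example = [[1, 1, 1, 9],
           [1, 1, 1, -9],
           [1, 1, 1, 9]]
example_contrast = local_contrast(example)
example_mask = contrast_mask(example_contrast, threshold=5)
example_result = processed_zero_crossing(example, example_mask)
```

Under these assumptions, low-contrast pixels fall into the class of values equal to zero, so that only pixels near strong transitions retain a positive or negative sign for the later connected component step.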
These and various other advantages and features of novelty which characterize the invention are pointed out with particularity in the claims annexed hereto and form a part hereof. However, for a better understanding of the invention, its advantages, and the objects obtained by its use, reference should be made to the drawings which form a further part hereof, and to accompanying descriptive matter, in which there are illustrated and described specific examples of an apparatus in accordance with the invention.