The present invention relates generally to the document processing art, and more particularly to a method and apparatus for classifying a scanned image or image region as either a synthetic graphic or a natural picture based on texture information obtained from the image or region, and will be described with particular reference thereto.
A document processing system (DPS) refers to a set of devices that construct, produce, print, translate, store, and archive documents and their constituent elements. Such devices include printers, copiers, scanners, fax machines, electronic libraries, and the like. The present invention addresses situations particularly relevant to printing systems and discusses them as the prime example of a DPS, but the present invention should not be construed to be limited to any such particular printing application. Any DPS is intended to benefit from the advantages of this invention.
Natural pictures differ from synthetic graphics or graphical images in many aspects, both in terms of visual perception and image statistics. Synthetic graphics are usually very smooth (i.e., relatively uniform or linear pixel values) within a given image region or neighborhood, and the edges or boundaries that separate different synthetic graphical regions are typically very sharp (i.e., a relatively large difference in local pixel values). Further, synthetic graphics contain textures only in rare cases. In contrast, natural pictures have regions that are often noisier and richer in texture (i.e., relatively large variation in local pixel values), and generally transition more slowly from region to region within a natural image.
Information about the origin of a scanned image is usually unavailable to the document processing system. However, in processing scanned images, it is sometimes beneficial to distinguish images from different origins, e.g. synthetic graphics versus natural pictures. For example, in color correction or enhancement, more emphasis is placed on vividness for a synthetic graphical original, while for a natural picture, the focus is more on the naturalness of the image.
Accordingly, it is considered desirable to develop a new and improved method and apparatus for classifying a scanned image or image region as either a synthetic graphic or a natural picture based on texture information, that meets the above-stated needs and overcomes the foregoing difficulties and others while providing better and more advantageous results.
More particularly, the present invention provides a new and improved method and apparatus for classifying a scanned image or region of an image as either a synthetic graphic or a natural picture, based on texture information for the subject image or region. The present invention analyzes texture features to determine whether a scanned image was originally a synthetic graphic or a natural picture. A classifier is then generated based on the measurement of texture energy.
In accordance with one aspect of the present invention, a method for classifying an input image or region thereof as either a synthetic graphic or a natural picture is disclosed. The method includes a) low-pass filtering image data representative of the input image or region thereof to produce low-pass filtered pixel values; b) determining a smoothness value for each of a plurality of low-pass filtered pixel values; c) generating histogram data from the smoothness values; d) determining a texture metric for the input image or region thereof from a subset of the histogram data; and e) thresholding the texture metric to classify the input image as either a synthetic graphic or a natural picture.
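By way of illustration only, steps a) through e) might be sketched in Python with NumPy as follows. Everything specific in this sketch is an assumption made for the example and is not taken from the present description: the 3x3 box filter used as the low-pass filter, the absolute second difference used as the smoothness value (chosen because it is zero for uniform or linearly varying pixel values), the histogram bin range `[lo, hi)` used as the subset, the threshold of 0.2, and the function names `box_filter3`, `smoothness`, `texture_metric`, and `classify`.

```python
import numpy as np

def box_filter3(img):
    """Step a): 3x3 mean filter with edge replication (an assumed low-pass filter)."""
    h, w = img.shape
    padded = np.pad(img.astype(np.float64), 1, mode="edge")
    acc = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / 9.0

def smoothness(lp):
    """Step b): assumed smoothness value for each interior pixel.

    Uses the larger of the absolute horizontal and vertical second
    differences, which is zero where pixel values are uniform or linear.
    """
    center = lp[1:-1, 1:-1]
    d2x = np.abs(lp[1:-1, :-2] - 2.0 * center + lp[1:-1, 2:])
    d2y = np.abs(lp[:-2, 1:-1] - 2.0 * center + lp[2:, 1:-1])
    return np.maximum(d2x, d2y)

def texture_metric(img, lo=2, hi=16, bins=256):
    """Steps c) and d): histogram the smoothness values, then take the
    fraction of pixels that fall in bins lo..hi-1 (an assumed subset that
    skips near-zero smooth areas and very large values from sharp edges).
    """
    s = np.clip(smoothness(box_filter3(img)), 0, bins - 1)
    hist, _ = np.histogram(s, bins=bins, range=(0, bins))
    return hist[lo:hi].sum() / hist.sum()

def classify(img, threshold=0.2, **kwargs):
    """Step e): threshold the texture metric (the threshold value is assumed)."""
    if texture_metric(img, **kwargs) > threshold:
        return "natural picture"
    return "synthetic graphic"
```

Under these assumptions, a region made of two flat tones separated by a sharp vertical edge yields a metric of zero, because its smoothness values are either zero or far above the selected bins, while a gentle ramp with moderate pixel noise places most of its smoothness values inside the selected bins. In practice the bin range and threshold would have to be tuned on representative scanned images.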
In accordance with another aspect of the present invention, a document processing system for classifying an input image or region thereof as either a synthetic graphic or a natural picture is disclosed. The system includes an image input subsystem, a processing subsystem for processing image data provided by the image input subsystem, and software/firmware means operative on the processing subsystem for a) low-pass filtering image data representative of the input image or region thereof to produce low-pass filtered pixel values; b) determining a smoothness value for each of a plurality of low-pass filtered pixel values; c) generating histogram data from the smoothness values; d) determining a texture metric for the input image or region thereof from a subset of the histogram data; and e) thresholding the texture metric to classify the input image as either a synthetic graphic or a natural picture.
An advantage of the present invention is the provision of a document processing system and a method that classifies an input image or region thereof as either a synthetic graphic or a natural picture for use in downstream image processing.
Still further advantages of the present invention will become apparent to those of ordinary skill in the art upon reading and understanding the following detailed description of the preferred embodiments.