The present invention relates generally to a system for processing document images, and more particularly, to an improved method of image processing the document images utilizing a fuzzy logic classification process.
In the reproduction of images from an original document or images from video image data, and more particularly, in the rendering of image data representing an original document that has been electronically scanned, one is faced with limited reflectance domain resolution capabilities because most output devices are binary or require compression to binary for storage efficiency. This is particularly evident when attempting to reproduce halftones, lines, and continuous tone (contone) images.
An image data processing system may be tailored so as to offset the limited reflectance domain resolution capabilities of the rendering apparatus, but this tailoring is difficult due to the divergent processing needs required by different types of images which may be encountered by the rendering device. In this respect, it should be understood that the image content of the original document may consist of multiple image types, including halftones of various frequencies, continuous tones (contones), line copy, error diffused images, etc. or a combination of any of the above, and some unknown degree of some or all of the above or additional image types.
In view of the situation, optimizing the image processing system for one image type in an effort to offset the limitations in the resolution and the depth capability of the rendering apparatus may not be possible, requiring a compromised choice which may not produce acceptable results. Thus, for example, where one optimizes the system for low frequency halftones, it is often at the expense of degraded rendering of high frequency halftones, or of line copy, and vice versa.
To address this particular situation, "prior art" devices have utilized automatic image segmentation to serve as a tool to identify different image types or imagery. For example, in one such system, image segmentation was addressed by applying a function to the video, the output of which was used to instruct the image processing system as to the type of image data present so that it could be processed appropriately. In particular, an auto-correlation function was applied to the stream of pixel data to detect the existence and estimate the frequency of halftone image data. Such a method automatically processes a stream of image pixels representing unknown combinations of high and low frequency halftones, contones, and/or lines. The auto-correlation function was applied to the stream of image pixels, and for the portions of the stream that contain high frequency halftone image data, the function produced a large number of closely spaced peaks in the resultant signal.
In another auto-segmentation process, an auto-correlation function is calculated for the stream of halftone image data at selected time delays which are predicted to be indicative of the image frequency characteristics, without prior thresholding. Valleys in the resulting auto-correlated function are detected to determine whether a high frequency halftone image is present.
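By way of illustration only, the auto-correlation idea described above might be sketched as follows. This is a minimal sketch and not the circuit of any cited system: the function names, the normalization, and the 0.5 peak-strength cutoff are assumptions introduced here. It estimates a halftone period as the first lag at which the normalized auto-correlation of a scanline shows a strong local peak.

```python
def autocorrelation(scanline, max_lag):
    """Normalized, mean-removed auto-correlation for lags 0..max_lag."""
    n = len(scanline)
    mean = sum(scanline) / n
    centered = [value - mean for value in scanline]
    energy = sum(c * c for c in centered)
    if energy == 0:
        # A perfectly flat scanline carries no periodic structure.
        return [0.0] * (max_lag + 1)
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / energy
        for lag in range(max_lag + 1)
    ]


def estimate_period(scanline, max_lag, min_strength=0.5):
    """Return the first lag with a strong local peak, or None if there is none.

    min_strength is a hypothetical cutoff chosen for this sketch.
    """
    r = autocorrelation(scanline, max_lag)
    for lag in range(1, max_lag):
        if r[lag] > r[lag - 1] and r[lag] >= r[lag + 1] and r[lag] > min_strength:
            return lag
    return None


# A synthetic scanline that repeats every four pixels, standing in for a halftone.
halftone_like = [200, 128, 50, 128] * 16
print(estimate_period(halftone_like, max_lag=8))
```

A short estimated period corresponds to closely spaced peaks, which is the signature the text associates with high frequency halftones; a flat or slowly varying scanline yields no strong peak.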
An example of a "prior art" automatic segmentation circuit is illustrated in FIG. 6. The basic system as shown in FIG. 6 is made up of three modules. Input information stored in a data buffer 10 is simultaneously directed to an image property classifying section 20, the first module, and an image processing section 30, the second module. The image property classifying section 20 is made up of any number of submodules (e.g. auto-correlator 21 and discriminator 22), which determine whether a block of image pixels stored in the data buffer 10 is one type of imagery or another (e.g. halftone, line/text, or contone). In parallel with the image property classifying section 20, the image processing section 30 is made up of any number of sub-processing sections (e.g. high frequency halftone processor 31, low frequency halftone processor 32, line/text processor 33, or contone processor 34), which perform image processing operations on the same block of image pixels as section 20. Each image sub-processing section performs image processing operations that are adapted to improve the image quality of a distinct class of imagery. The third module, control section 41, uses the information derived from the image classifying section 20 to control the image processing section 30. In other words, the control section 41 acts like a multiplexer and selects the proper processed image data according to the image classification determined by the image classifying section 20.
The decision as to what class of imagery image data belongs to is typically binary in nature. For example, in a conventional image segmentation scheme, image property classifying section 20 classifies image data as one of three classes of imagery (high frequency halftone, low frequency halftone, or contone). Depending on that classification, the image data is processed according to the properties of the selected class of imagery (either low pass filtering and re-screening if it is a high frequency halftone, thresholding with a random threshold if it is a low frequency halftone, etc.). Also, assuming that the decision as to which of the three classes of imagery image data belongs is based on a single image property, the peak count of the input image data, the resulting image classification decision is made by thresholding the peak count into three classes of imagery.
Consequently, the control section 40 decides the type of image processing the image data requires depending on the decision made by the classification section 20. Thus, the output of classification section 20 is quantized to one of three possibilities. The control section 40 selects the output from one of the three image sub-processing sections based upon this classification.
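The hard, three-way decision described above can be sketched roughly as follows. The threshold values and all names are hypothetical, chosen only to make the example concrete; the source does not state actual thresholds.

```python
def classify_by_peak_count(peak_count, low_threshold=4, high_threshold=12):
    """Quantize a peak count into exactly one of three classes.

    Both thresholds are illustrative assumptions, not values from the source.
    """
    if peak_count < low_threshold:
        return "contone"
    if peak_count < high_threshold:
        return "low_frequency_halftone"
    return "high_frequency_halftone"


def select_output(peak_count, processed_outputs):
    """Act like a multiplexer: pass through the output of one sub-processor."""
    return processed_outputs[classify_by_peak_count(peak_count)]


# Neighbouring peak counts on either side of a threshold get different classes.
for count in (11, 12):
    print(count, classify_by_peak_count(count))
```

In this sketch a change of a single count across a threshold switches the entire block to a different processing path, which is the kind of abrupt transition the following paragraphs identify as a source of visible artifacts.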
Based on the nature of conventional image classification systems, the classifying information, gathered over a context of many pixels, changes gradually. But in the process of comparing this classifying information with a classification threshold, one can create an abrupt change in the classes. This abrupt decision making, which produces a forced choice among several distinct alternative choices, is a primary reason for the formation of visible artifacts in the resulting output image. Most transition points or thresholds are selected so that an image can be classified as one class of imagery with a high degree of certainty; however, those classes of imagery that cannot be classified with such certainty have multiple transition points or a transition zone.
Using only one point to define a transition zone results in the formation of visible artifacts in the resulting output image if the image spans the transition zone. Although it is possible to shift the transition zone or make it narrower so that there is less chance that an image falls into the zone, there exist limitations on how narrow the zone can be made. Narrowing the transition zone means decreasing the noise and/or variation in the information used to classify so as to narrow the area over which classification is not "certain", resulting in less switching between classifications.
Moreover, the classification of real images covers a continuum from well below to well above thresholds between classifications. This means that there are areas of an image which are, for example, just above a threshold. Variations in the gathered (lowpass filtered) information due to "flaws" in the input video, or ripple due to interactions between the area of the image being used for the classification process and periodic structures in the input video, result in areas falling below the threshold. With discrete classification, this results in a drastically different classification, thereby resulting in artifacts in the rendered image.
Thus, it is desirable to classify image data in a fuzzy manner, slowly sliding the classification from one classification to the other, reflecting the information that has been gathered. Artifacts in the resulting rendered image will now be soft and follow the contours of the image, and so the artifacts will not be objectionable.
In general, the "prior art" describes the control section 40 as essentially having a switch. Since the image processing steps performed for each class of imagery are different depending on the classification given to each block of input image pixels, the switch or multiplexer allows data residing at the output of the image processor 30 to be directed to an output buffer 50 depending on the decisions made by the imagery classifying section 20, which are received as signals on lines 23 and 24. This type of binary decision making is rigid and results in image segmentation decisions that do not fail gracefully and consequently form visible artifacts in the output image.
To address this forming of visible artifacts in the rendered output image, it has been proposed to utilize a probabilistic segmentation process to allow the image processing system to fail more gracefully when incorrect segmentation decisions are made. An example of such a probabilistic segmentation system is illustrated in FIG. 2.
FIG. 2 shows a block diagram of a conventional image processing system which incorporates a probabilistic classification system. As illustrated in FIG. 2, the conventional system receives input image data derived from any number of sources, including a raster input scanner, a graphics workstation, an electronic memory, or other storage elements, etc. In general, the image processing system shown in FIG. 2 includes probabilistic classifier 25, image processing section 30, and an image processing and control mixer 41.
Input image data is made available to the image processing system along data bus 15, which is sequentially processed in parallel by probabilistic classifier 25 and image processing section 30. Probabilistic classifier 25 classifies the image data as a ratio of a number of predetermined classes of imagery. The ratio is defined by a set of probability values that predict the likelihood the image data is made up of a predetermined number of classes of imagery. The probabilities 27, one for each predetermined class of imagery, are input to the image processing mixer or control unit 41 along with image output data from image processing section 30.
Image processing section 30 includes units 31, 32, and 34 that generate output data from the image data in accordance with methods unique to each predetermined class of imagery. Subsequently, mixer 41 combines a percentage of each class of output image data from units 31, 32, and 34 according to the ratio of the probabilities 27 determined by classifier 25. The resulting output image data for mixer 41 is stored in output buffer 50 before subsequent transmission to an image output terminal such as a printer or display.
Initially, the stream of image pixels from an image input terminal (IIT) is fed to data buffer 10. The image data stored in buffer 10 is in raw grey format, for example, 6 to 8 bits per pixel. A suitable block size is 16 pixels at 400 spots per inch, or 12 pixels at 300 spots per inch. Too large a sample size results in the inability to properly switch classification in narrow channels between fine structures in the image, or to switch soon enough when moving from one classification to another. An example of this problem is small text forming a title for a halftone image. Given a font size which is large enough to read, and a good layout practice of leaving white space of at least half a line between the text and the image, a one millimeter block turns out to be a good compromise with most documents. Thus, too large a sample size results in classification transitions at the edge of objects being larger than the white space between the objects, resulting in inappropriate classification and rendering.
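The relationship between the stated block sizes and the one millimeter figure can be checked with a short calculation. The helper name below is an assumption introduced for this sketch; the only inputs are the pixel counts and resolutions given above and the standard 25.4 mm per inch.

```python
MM_PER_INCH = 25.4


def block_width_mm(pixels, spots_per_inch):
    """Physical width, in millimeters, of a run of pixels at a given resolution."""
    return pixels / spots_per_inch * MM_PER_INCH


for pixels, resolution in ((16, 400), (12, 300)):
    width = block_width_mm(pixels, resolution)
    print(pixels, "pixels at", resolution, "spots per inch:", round(width, 3), "mm")
```

Both block sizes cover 0.04 inch, or about 1.016 mm, which is consistent with the one millimeter block described in the text.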
With reference to FIG. 3, the conventional probabilistic classifier 25 is shown in detail. The block of image pixels stored in buffer 10 is transmitted to a characteristic calculator 28 through data buffers 15. Calculator 28 provides an output value that characterizes a property of the image data transmitted from buffer 10, such as its peak count. In one embodiment, a characteristic value is determined by calculator 28 that represents the peak count of the block of image data. The peak count is determined by counting those pixels whose values are the non-trivial local area maximum or minimum in the block of image data. First local area maximum or minimum pixel values are selected depending on whether the average value of all the pixels in the block of image data is lower or higher than the median value of the number of levels of each pixel.
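One possible reading of the peak count described above is sketched below for a one-dimensional block. Several details are assumptions made for illustration and are not stated in the source: that maxima are counted when the block average lies below the mid-level and minima otherwise, that "non-trivial" means the pixel differs from both neighbors by at least a small margin (`min_swing`), and that the two end pixels of the block are not tested.

```python
def peak_count(block, levels=256, min_swing=8):
    """Count non-trivial local extrema in a 1-D block of grey values.

    Assumptions for this sketch: dark blocks (average below mid-level) count
    maxima, light blocks count minima, and an extremum is non-trivial when it
    differs from both neighbours by at least min_swing grey levels.
    """
    mid_level = (levels - 1) / 2
    average = sum(block) / len(block)
    count_maxima = average < mid_level
    count = 0
    for i in range(1, len(block) - 1):
        left, centre, right = block[i - 1], block[i], block[i + 1]
        if count_maxima:
            is_peak = centre - left >= min_swing and centre - right >= min_swing
        else:
            is_peak = left - centre >= min_swing and right - centre >= min_swing
        if is_peak:
            count += 1
    return count


halftone_like = [20, 120] * 8     # 16 pixels of strongly alternating values
noisy_contone = [100, 102] * 8    # 16 pixels of small ripple around one level
print(peak_count(halftone_like), peak_count(noisy_contone))
```

Under these assumptions a strongly periodic block produces a high count, while small ripple on a smooth region produces none, which is what allows the count to separate halftones from contones.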
After calculator 28 evaluates the peak count of the image data, probability classifier 29 determines three probability values 27 that correspond to each image type associated with the peak count as expressed by the characteristic function stored in memory 26. The characteristic function, determined with a priori image data, represents a plurality of probability distributions that are determined using a population of images. Each probability distribution depicts the probability that a block of image data is a certain type given the occurrence of an image property, a peak count.
For example, the characteristic function stored in memory 26 can be represented by the graph shown in FIG. 4, which relates the probability distributions for a contone 1, low frequency halftone 2, and high frequency halftone 3 to the occurrence of a particular image characteristic, which in this example is a peak count. The characteristic function stored in memory 26 can be adjusted using input control 18. Using control 18, the resulting output image stored in buffer 50 can be altered by modifying the characteristic function representing the different classes of imagery evaluated by the image processing system 30.
Subsequently, probability classifier 29 determines each probability value by evaluating the probability distribution of each image type represented by the characteristic function stored in memory 26. After determining the probability values, classifier 29 outputs these results to image processing mixer or control 41.
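The lookup performed by the probability classifier can be sketched as follows. FIG. 4 is not reproduced here, so the piecewise-linear curves below are purely hypothetical stand-ins for the stored characteristic function, and all names are assumptions made for this sketch.

```python
def interpolate(points, x):
    """Piecewise-linear lookup in a list of (x, y) points sorted by x."""
    if x <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]


# Hypothetical curves relating peak count to the likelihood of each class.
CHARACTERISTIC_FUNCTION = {
    "contone": [(0, 1.0), (4, 0.0)],
    "low_frequency_halftone": [(0, 0.0), (4, 1.0), (8, 1.0), (12, 0.0)],
    "high_frequency_halftone": [(8, 0.0), (12, 1.0)],
}


def classify(peak_count, characteristic_function=CHARACTERISTIC_FUNCTION):
    """Return one probability per class, normalized so that they sum to one."""
    raw = {
        name: interpolate(curve, peak_count)
        for name, curve in characteristic_function.items()
    }
    total = sum(raw.values())
    if total == 0:
        raise ValueError("characteristic function assigns no class to this peak count")
    return {name: value / total for name, value in raw.items()}


print(classify(10))
```

With these illustrative curves, a peak count of 10 falls between the low and high frequency halftone regions and is reported as an even split between the two, rather than being forced into one class.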
The image processing section of FIG. 2 operates concurrently with the probabilistic classifier 25 on the image data stored in buffer 10. Image processing section 30 includes a high frequency halftone processing unit 31, a low frequency halftone processing unit 32, and a contone processing unit 34. Each processing unit processes all image data in accordance with a particular image type. Each of the processing units 31, 32, and 34 generates output blocks of unquantized video data.
Image processing control 41 mixes the data output blocks to form a composite block of output image signals that is stored in output buffer 50. The manner in which the output blocks are mixed is characterized by a ratio defined by the probability determined by the probabilistic classifier 25.
FIG. 5 shows the conventional image processing mixer 41 in detail. Mixer 41 multiplies the output blocks with the probability, using multipliers 42, 43, 44. The resulting output from each multiplier is representative of a percentage or ratio of each output block, the sum of which defines a composite block of output image signals. The composite block of output image signals is formed by adding the output of the multipliers using adder 45 and by subsequently quantizing the sum of adder 45 using quantizer 47. The resulting image block output by quantizer 47 is stored in output buffer 50 before subsequent transmission for output to an image output terminal having limited resolution or depth.
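The multiply, add, and quantize sequence of the mixer might be sketched as below. The names, the sample values, and the use of a fixed binary threshold for the quantizer are assumptions made for illustration; the source does not specify how quantizer 47 operates.

```python
def mix(output_blocks, probabilities):
    """Weight each processor's output block by its probability and sum them."""
    names = list(output_blocks)
    length = len(output_blocks[names[0]])
    return [
        sum(probabilities[name] * output_blocks[name][i] for name in names)
        for i in range(length)
    ]


def quantize(composite, threshold=128):
    """Reduce the composite block to binary (fixed threshold assumed here)."""
    return [1 if value >= threshold else 0 for value in composite]


# Hypothetical unquantized outputs of the three processing units for two pixels.
blocks = {
    "high_frequency_halftone": [200, 40],
    "low_frequency_halftone": [100, 100],
    "contone": [0, 0],
}
weights = {
    "high_frequency_halftone": 0.5,
    "low_frequency_halftone": 0.5,
    "contone": 0.0,
}
composite = mix(blocks, weights)
print(composite, quantize(composite))
```

Because every processor contributes in proportion to its probability, a small change in the probabilities produces only a small change in the composite block before quantization.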
The above-described image classification system utilizes a probabilistic approach to classify the image data. Such an approach presents problems in that the classification of the image data is mutually exclusive; the image data is classified as a particular type in absolute terms even though the probability of the decision being correct is just over 50%. This results in difficulties in trying to design an image processing system which will process the image data without visible artifacts in the rendered image when the decision on the image type does not have a high confidence.
Not only is image classification important to a digital reprographic system, rendering based on this classification is also important. One such component of the rendering system is digital filtering. The digital filtering process should be both efficient and low cost. Moreover, the filter design should have some non-separable and/or time-varying characteristics so that the filter can be used in a fuzzy segmentation system. However, trying to achieve one goal can adversely impact the other. Various approaches have been devised for the implementation of digital filtering techniques which try to minimize the adverse impacts. These techniques will be discussed briefly below.
In a "prior art" digital filtering technique, a two-dimensional finite impulse response filter has a plurality of filter portions of essentially identical construction arranged in a parallel configuration. A de-multiplexer separates an input data signal comprising consecutive digital words and supplies each digital word in sequence to a separate filter portion. Subsequently, a multiplexer, coupled to the output of the filter portions, selectively outputs the filtered data from each filter portion in a sequence corresponding to the order of separation of the input data, thereby resulting in a filtered version of the original input data.
The systems described above all have limitations with respect to either speed or cost. In view of these limitations, it has been proposed to provide a plurality of one-dimensional transform units that may be selectively combined with an additional one-dimensional transform unit to produce a plurality of distinct two-dimensional filters, any one of which is selectable on a pixel by pixel basis. Moreover, this proposed conventional system has the added advantage of providing two-dimensional finite impulse response filters without employing multiple, identically constructed two-dimensional filters arranged in a parallel fashion, thereby substantially reducing the complexity and cost of the filter hardware. To provide a better understanding of this conventional system, it is described below.
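The idea of building a two-dimensional filter from one-dimensional units can be illustrated with a separable filter, in which a vertical one-dimensional pass is followed by a horizontal one-dimensional pass. This is a generic sketch of separable filtering and not the hardware described here: the function names, the edge-replication policy, and the sample kernels are assumptions, and per-pixel selection among several vertical units is not shown.

```python
def filter_1d(values, kernel):
    """Apply a 1-D FIR kernel with edge replication; output length is unchanged.

    The kernel is applied without flipping, which is equivalent to convolution
    for the symmetric kernels used in this sketch.
    """
    half = len(kernel) // 2
    last = len(values) - 1
    return [
        sum(k * values[min(max(i + j - half, 0), last)] for j, k in enumerate(kernel))
        for i in range(len(values))
    ]


def separable_filter_2d(image, vertical_kernel, horizontal_kernel):
    """Filter every column with one 1-D kernel, then every row with another."""
    columns = [filter_1d(list(column), vertical_kernel) for column in zip(*image)]
    rows = [list(row) for row in zip(*columns)]
    return [filter_1d(row, horizontal_kernel) for row in rows]


impulse = [
    [0, 0, 0],
    [0, 16, 0],
    [0, 0, 0],
]
for row in separable_filter_2d(impulse, [1, 2, 1], [1, 2, 1]):
    print(row)
```

The response to the single impulse is the outer product of the two one-dimensional kernels, so pairing a different vertical kernel with the same horizontal kernel yields a different two-dimensional filter without any dedicated two-dimensional filter unit.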
The conventional system, as illustrated in FIG. 1, includes image processing module 20 which generally receives offset and gain corrected video through input line 22. Subsequently, the image processing module 20 processes the input video data according to control signals from CPU 24 to produce the output video signals on line 26. As illustrated in FIG. 1, the image processing module 20 may include an optional segmentation block 30 which has an associated line buffer 32, two-dimensional filters 34, and a one-dimensional rendering block 36. Also included in image processing module 20 is line buffer memory 38 for storing the context of incoming scanlines.
Segmentation block 30, in conjunction with the associated scanline buffer 32, automatically determines those areas of the image which are representative of a halftone input region. Output from the segmentation block (video class) is used to implement subsequent image processing effects in accordance with a type or class of video signals identified by the segmentation block. For example, the segmentation block may identify a region containing data representative of an input high frequency halftone image, in which case a lowpass filter would be used to remove screen patterns; otherwise, a remaining text portion of the input video image may be processed with an edge enhancement filter to improve fine line and character reproduction when thresholded.
Two-dimensional filter block 34 is intended to process the incoming, corrected video in accordance with the predetermined filtering selection. Prior to establishment of the required scanline context, the input video bypasses the filter by using a bypass channel within the two-dimensional filter hardware. This bypass is necessary to avoid deleterious effects to the video stream that may result from filtering of the input video prior to establishing the proper context.
Subsequent to two-dimensional filtering, the one-dimensional rendering block is used to alter the filtered, or possibly unfiltered, video data in accordance with selected one-dimensional video effects. One-dimensional video effects include, for example, thresholding, screening, inversion, tonal reproduction curve (TRC), pixel masking, one-dimensional scaling, and other effects which may be applied one-dimensionally to the stream of video signals. As in the two-dimensional filter, the one-dimensional rendering block also includes a bypass channel where no additional effects would be applied to the video, thereby enabling the received video to be passed through as an output video.
Therefore, it is desirable to implement an image classification system which provides a truer classification of the image type and in which the image types are not necessarily mutually exclusive. Such a system would incorporate fuzzy logic, thereby allowing image data to be classified as being a member of more than one image class. This feature is critical in areas where the image goes from one image type to another. Moreover, it is desirable to implement an image processing and rendering system which takes advantage of the fuzzy classification system.
One aspect of the present invention is a method for detecting peaks in a video stream of image data. The method determines if a pixel value v(i,j) from the video stream is greater than pixel values adjacent to the pixel; determines if the pixel value v(i,j) from the video stream is greater than pixel values adjacent to the pixel in a particular direction; and classifies the pixel value v(i,j) as a peak value when the pixel value v(i,j) from the video stream is greater than pixel values adjacent to the pixel and the pixel value v(i,j) from the video stream is greater than pixel values adjacent to the pixel in a particular direction.
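A minimal sketch of the two-part peak test is given below. The summary does not say which pixels count as "adjacent" or what "a particular direction" is, so this sketch makes two loudly labeled assumptions: the first test uses the four edge-sharing neighbors, and the second test uses the two neighbors along a caller-selected direction (diagonal by default). All names are hypothetical, and border pixels are simply treated as non-peaks.

```python
# Assumption: "adjacent" means the four edge-sharing neighbours.
FOUR_NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))

# Assumption: "a particular direction" means one of these neighbour pairs.
DIRECTIONS = {
    "horizontal": ((0, -1), (0, 1)),
    "vertical": ((-1, 0), (1, 0)),
    "diagonal": ((-1, -1), (1, 1)),
    "anti_diagonal": ((-1, 1), (1, -1)),
}


def is_peak(v, i, j, direction="diagonal"):
    """Classify v[i][j] as a peak only when both comparisons succeed."""
    if not (0 < i < len(v) - 1 and 0 < j < len(v[0]) - 1):
        return False  # border pixels lack a full neighbourhood in this sketch
    centre = v[i][j]
    exceeds_adjacent = all(centre > v[i + di][j + dj] for di, dj in FOUR_NEIGHBOURS)
    exceeds_directional = all(
        centre > v[i + di][j + dj] for di, dj in DIRECTIONS[direction]
    )
    return exceeds_adjacent and exceeds_directional


video = [
    [1, 2, 3],
    [4, 9, 5],
    [6, 7, 8],
]
print(is_peak(video, 1, 1))
```

Requiring both tests to pass means a pixel that exceeds its immediate neighbors but is matched along the chosen direction, for example on a diagonal ridge, is not counted as a peak.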
A second aspect of the present invention is a system for rendering a pixel in a video stream of image data. The system includes peak detection means for determining if the pixel having a pixel value v(i,j) is a peak value; classification means for classifying the pixel as a particular image type based on classification information received from said peak detection means; processing means for image processing the pixel value v(i,j) based on the image type of the pixel; and print means for rendering the processed pixel value on a recording medium. The peak detection means includes first means for determining if the pixel value v(i,j) is greater than pixel values adjacent to the pixel, second means for determining if the pixel value v(i,j) is greater than pixel values adjacent to the pixel in a particular direction, and third means for classifying the pixel value v(i,j) as a peak value when the first and second means make positive determinations.
Further objects and advantages of the present invention will become apparent from the following descriptions of the various features of the present invention.