1. Field of the Invention
The present invention relates to an image processing apparatus that separates and extracts a photographic area or a half-tone image area, and a character area, from input image data, and to an image processing method therefor.
2. Related Background Art
For copying machines, OCR equipment, and the like, methods have hitherto been proposed for separating a photographic area or a half-tone image area from a character area by transforming an image into data on the spatial frequency axis. Such conventional methods are outlined below.
(1) Preprint 93-01-02 of a study group of Gazou Denshi Gakkai (or the Video Electronics Institute) discloses a method for separating image areas that focuses on the difference between the frequency characteristics of a character image and those of a half-tone image. In this method, image data is first divided into small blocks of 8×8, and the discrete cosine transformation (DCT) is then performed on each block. The DCT is widely used in image encoding methods such as the JPEG (Joint Photographic Experts Group) standard, and transforms image data into data on the frequency axis. As a result of this transformation, the coefficient in the first row and first column of each block represents the direct current component of the block, the column direction corresponds to the horizontal frequency, and the row direction corresponds to the vertical frequency. In each direction, the larger the row (column) number, the higher the frequency. After the DCT, a zigzag scanning process is executed to transform the two-dimensional block data into one-dimensional data. This process is also adopted in the JPEG standard. As shown in FIG. 11, scanning is performed diagonally from the lower frequency range to the higher frequency range. In the step that follows, the "zigzag rate" is worked out in accordance with the following formula:
ZigZag_Rate[i] = ZigZag[i] × 2 - ZigZag[i-1] - ZigZag[i+1] (i: 1 to 63)
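The steps described in (1), namely the 8×8 DCT, the zigzag scan, and the zigzag rate, can be sketched as follows. This is only an illustrative sketch and not the implementation of the cited preprint. The function names are hypothetical, the DCT uses the orthonormal scaling of JPEG, raw (signed) coefficients are used for ZigZag[i], and ZigZag[64], which the formula needs at i = 63, is assumed to be 0 because the text does not define it.

```python
import math

N = 8  # block size used in the text


def dct_2d(block):
    """Naive 2-D DCT-II of an N x N block (orthonormal scaling, as in JPEG)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):        # row index: vertical frequency
        for v in range(N):    # column index: horizontal frequency
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def zigzag_order(n=N):
    """Return (row, col) pairs in JPEG zigzag order, low to high frequency."""
    order = []
    for s in range(2 * n - 1):  # s = row + col selects one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def zigzag_scan(coeffs):
    """Flatten a 2-D coefficient block into a 1-D list in zigzag order."""
    return [coeffs[r][c] for r, c in zigzag_order(len(coeffs))]


def zigzag_rate(zz):
    """ZigZag_Rate[i] = ZigZag[i]*2 - ZigZag[i-1] - ZigZag[i+1] for i = 1..63.

    The returned list holds the rate for i = k + 1 at position k.
    Assumption: ZigZag[64] is treated as 0.
    """
    padded = list(zz) + [0.0]
    return [2 * padded[i] - padded[i - 1] - padded[i + 1]
            for i in range(1, len(zz))]


# Usage: a flat block has only a direct current component.
flat = [[10] * N for _ in range(N)]
rates = zigzag_rate(zigzag_scan(dct_2d(flat)))
```

For a real block, the rates would then be evaluated separately over the lower and higher frequency ranges; where those ranges begin and end is not stated in the text.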
Next, the zigzag rates in the lower frequency range and in the higher frequency range are evaluated and defined as ZZ_Rate_moji and ZZ_Rate_HT, respectively. If the discriminative condition of formula (1) given below is satisfied, the block is determined to represent a character image. If that of formula (2) given below is satisfied, the block is determined to represent a half-tone image. This process utilizes a characteristic of the zigzag rate: a character image has a larger value in the lower frequency range, while a half-tone image has a larger value in the higher frequency range.
ZZ_Rate_moji + key ≥ k1 (1)
ZZ_Rate_HT + key ≥ k2 (2)
Here, the constants k1 and k2 are determined experimentally. The key is obtained from the determination results for the four surrounding blocks in accordance with the following formula, in which each flag is a function that takes a negative value if the determination result indicates a character image and a positive value if it indicates a half-tone image:
key = 0.25 (flag(top) + flag(left)) + 0.125 (flag(second from the left) + flag(top aslant))
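The decision rule of formulas (1) and (2), together with the key computed from the neighbouring blocks, can be sketched as below. The function names are hypothetical. The flag magnitudes of -1 and +1 are assumptions (the text gives only their signs), as is the choice to test formula (1) before formula (2) when both would hold. The thresholds in the usage lines are placeholders, not values from the cited preprint.

```python
def neighbour_key(flag_top, flag_left, flag_second_left, flag_top_aslant):
    """key = 0.25*(flag(top) + flag(left))
           + 0.125*(flag(second from the left) + flag(top aslant))."""
    return (0.25 * (flag_top + flag_left)
            + 0.125 * (flag_second_left + flag_top_aslant))


def classify_block(zz_rate_moji, zz_rate_ht, key, k1, k2):
    """Apply formula (1), then formula (2).

    Assumption: formula (1) takes priority if both conditions hold, and a
    block satisfying neither is reported as "undetermined".
    """
    if zz_rate_moji + key >= k1:   # formula (1): character image
        return "character"
    if zz_rate_ht + key >= k2:     # formula (2): half-tone image
        return "half-tone"
    return "undetermined"


# Usage with assumed flag values: -1 for a character result, +1 for half-tone.
CHARACTER, HALFTONE = -1, +1
key_all_char = neighbour_key(CHARACTER, CHARACTER, CHARACTER, CHARACTER)
key_all_ht = neighbour_key(HALFTONE, HALFTONE, HALFTONE, HALFTONE)
result_a = classify_block(10.0, 0.0, key_all_char, k1=5.0, k2=5.0)
result_b = classify_block(0.0, 10.0, key_all_ht, k1=5.0, k2=5.0)
```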
(2) An article titled "a DCT encoding method using adaptive quantization" in Vol. 20, No. 5 of the journal of Gazou Denshi Gakkai (or the Video Electronics Institute) discloses a method that separates character images from half-tone images and switches the quantization tables used for image compression accordingly, in order to prevent deterioration of character images while enhancing the compression rate of half-tone image areas. In this method, image data is first divided into blocks of 8×8, and DCT is then performed. Subsequently, the sum of the absolute values of the coefficients contained in each of the areas 100 to 104 shown in FIGS. 12A to 12E is calculated. If the maximum of the sums for areas 101 to 104 is greater than the sum for area 100, and is also greater than a given threshold value A, the block is determined to represent a half-tone image. Also, if the sum of the absolute values of the coefficients contained in the area 105 shown in FIG. 12F is greater than a threshold value B and the block has not been determined to represent a half-tone image, the block is determined to represent a character image.
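The area-sum test of (2) can be sketched as follows. The shapes of areas 100 to 105 are defined only in FIGS. 12A to 12F, so the masks in EXAMPLE_AREAS are hypothetical placeholders and do not reproduce those figures. The function names and the "other" label are likewise assumptions, and the character rule follows one reading of the text: a block that is not half-tone is a character block when the sum for area 105 exceeds threshold B.

```python
def area_sum(coeffs, area):
    """Sum of absolute DCT coefficients over a list of (row, col) positions."""
    return sum(abs(coeffs[r][c]) for r, c in area)


def classify_by_areas(coeffs, areas, threshold_a, threshold_b):
    """Half-tone / character decision based on area sums."""
    s100 = area_sum(coeffs, areas[100])
    peak = max(area_sum(coeffs, areas[k]) for k in (101, 102, 103, 104))
    if peak > s100 and peak > threshold_a:
        return "half-tone"
    if area_sum(coeffs, areas[105]) > threshold_b:
        return "character"
    return "other"


# Placeholder masks for illustration only; NOT the areas of FIGS. 12A to 12F.
EXAMPLE_AREAS = {
    100: [(r, c) for r in range(4) for c in range(4) if (r, c) != (0, 0)],
    101: [(r, c) for r in range(4, 8) for c in range(4)],
    102: [(r, c) for r in range(4) for c in range(4, 8)],
    103: [(r, c) for r in range(4, 8) for c in range(4, 8)],
    104: [(r, c) for r in range(2, 6) for c in range(2, 6)],
    105: [(r, c) for r in range(8) for c in range(8) if (r, c) != (0, 0)],
}

# Usage: one block with energy at high frequency, one at low frequency.
high = [[0.0] * 8 for _ in range(8)]
high[6][6] = 50.0
low = [[0.0] * 8 for _ in range(8)]
low[0][1] = 50.0
label_high = classify_by_areas(high, EXAMPLE_AREAS, threshold_a=20.0, threshold_b=30.0)
label_low = classify_by_areas(low, EXAMPLE_AREAS, threshold_a=20.0, threshold_b=30.0)
```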
(3) A "facsimile apparatus" specified in Japanese Patent Laid-Open Application No. 2-202771 discloses a method for separating a binary image area from a half-tone image area. In this apparatus, an image area determination parameter unit divides image data into small blocks of 4×4 and executes a two-dimensional Hadamard transformation on each block. Given Yij as the coefficients of the Hadamard transformation, the image separation parameter L is worked out by the following formula:
L = ΣΣ Yij² (i + j = 3, 4, 5, 6)
Then, the slice level of binary coding is determined in accordance with the value of L. This relies on the premise that "the transformation result hypothetically defined for a binary image area has a greater energy with respect to the higher spatial frequency"; that is, the value L is greater in a binary image area and smaller in a half-tone image area.
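The image separation parameter L of (3) can be sketched as below. Two points are assumptions not stated in the text: the indices i and j are taken as 0-based (so that i + j ranges up to 6 for a 4×4 block), and the Hadamard matrix is taken in sequency order, so that a larger index corresponds to a higher spatial frequency. The transform is left unnormalized, and the function names are hypothetical.

```python
# 4 x 4 Hadamard matrix in sequency order: row k has k sign changes.
H4 = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
]


def hadamard_2d(block):
    """Unnormalized 2-D Hadamard transform Y = H * X * H^T of a 4 x 4 block."""
    hx = [[sum(H4[i][k] * block[k][j] for k in range(4)) for j in range(4)]
          for i in range(4)]
    return [[sum(hx[i][k] * H4[j][k] for k in range(4)) for j in range(4)]
            for i in range(4)]


def separation_parameter(block):
    """L = sum of Yij^2 over i + j = 3, 4, 5, 6 (0-based indices assumed)."""
    y = hadamard_2d(block)
    return sum(y[i][j] ** 2
               for i in range(4) for j in range(4) if i + j >= 3)


# Usage: a flat block has no high-frequency energy, a checkerboard has a lot.
flat = [[5] * 4 for _ in range(4)]
checker = [[1 if (r + c) % 2 == 0 else -1 for c in range(4)] for r in range(4)]
l_flat = separation_parameter(flat)
l_checker = separation_parameter(checker)
```

In this sketch the checkerboard block concentrates all of its energy in the coefficient Y33, which is why its L is large while that of the flat block is zero.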
However, the conventional methods described above have a drawback: the character extraction rate is lowered when an image that has undergone non-reversible compression is mixed in one representation with an image that has not. A non-reversible compression process compresses an image by quantizing its high frequency components and discarding them. As a result, the frequency distribution of the expanded image differs from that of the original one.
Nevertheless, in the conventional methods, these images are all judged by means of threshold values that are set uniformly, so erroneous sampling often results. In particular, when the image separation parameter L of the binary image area of an image having compression hysteresis takes the same value as the image separation parameter L of the half-tone image area of the original image, it becomes impossible to set any appropriate threshold value.
Meanwhile, in an apparatus structured to require its user to perform this setup procedure, the user must establish each time whether or not a target image has any compression hysteresis. The operability of such an apparatus is therefore extremely poor.
FIG. 13 is a view showing one example of the threshold values set up for each image. In FIG. 13, column 110 indicates the averaged value of the image area separation parameter L over the binary image areas of each image. Column 111 indicates the averaged value of L over the half-tone image areas. Column 112 indicates the threshold value used for separation. Column 113 indicates the sampling rate. The sampling result 114 represents an example in which an original image having no compression hysteresis is processed. Sampling results 115 and 116 represent examples in which images having compression hysteresis are processed. In the sampling result 114, the determination threshold value is set at the average of the L values of the binary image area and the half-tone image area of the exemplified original image, and a sampling rate of 90% is obtained. In the sampling result 115, the determination threshold value is set likewise for the image having compression hysteresis, and a sampling rate of 90% is again obtained. However, in a case where an image having compression hysteresis and an original image are mixed in a representation, all areas are judged by applying the threshold value of the original image. In that case, almost all the binary image areas of the image having compression hysteresis are determined to be half-tone image areas, as indicated by the sampling result 116, and the sampling rate becomes extremely poor.
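The effect described for FIG. 13 can be illustrated with a short sketch. All L values below are hypothetical numbers chosen only to show the mechanism; they are not the values of FIG. 13 and do not reproduce its 90% figures. The threshold is taken as the midpoint of the two averaged L values, which is one reading of the text, and the function names are assumptions.

```python
def midpoint_threshold(mean_binary, mean_halftone):
    """Threshold placed midway between the two averaged L values."""
    return (mean_binary + mean_halftone) / 2.0


def is_binary(l_value, threshold):
    """A block is judged binary when its L is at or above the threshold."""
    return l_value >= threshold


def sampling_rate(binary_l_values, threshold):
    """Fraction of true binary-area blocks that are judged binary."""
    hits = sum(1 for v in binary_l_values if is_binary(v, threshold))
    return hits / len(binary_l_values)


def mean(values):
    return sum(values) / len(values)


# Hypothetical L values of binary-area blocks (NOT taken from FIG. 13).
original_binary = [900.0, 1000.0, 1100.0, 950.0, 1050.0]
compressed_binary = [280.0, 300.0, 320.0, 310.0, 290.0]  # high frequencies lost
# Hypothetical averaged L values of the half-tone areas.
original_halftone_mean = 200.0
compressed_halftone_mean = 60.0

t_original = midpoint_threshold(mean(original_binary), original_halftone_mean)
t_compressed = midpoint_threshold(mean(compressed_binary), compressed_halftone_mean)

# Each image judged with its own threshold: binary areas are found.
rate_own = sampling_rate(compressed_binary, t_compressed)
# Compressed image judged with the original image's threshold: they are missed.
rate_mixed = sampling_rate(compressed_binary, t_original)
```

With these placeholder numbers, the binary areas of the compressed image fall entirely below the threshold derived from the original image, which corresponds to the situation of the sampling result 116.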