The present invention relates to a color image data processing apparatus, and in particular to one that can correct the color blur of a chromatic color arising when color image data is read, so as to accurately distinguish a character written on a color document, such as an OCR (optical character recognition) slip, from a colored frame.
Recent developments in office automation have brought about the handling of a colored document as document information. For instance, a character frame of a slip is printed in a different color, according to the kind of slip.
A format such as the character frame of an OCR slip for reading handwritten characters is printed in a special color called a dropout color. This format is optically eliminated at reading time so that only the character is recognized.
In other words, the character frame of an OCR slip that sets the position of the written character is printed in a dropout color that can be clearly identified by the human eye, but cannot be distinguished from the background color by an OCR sensor. The optical filter determined by the spectral characteristics of the written character color and those of the dropout color is set in front of the OCR sensor. The character frame is thereby optically eliminated when the original is read, so that only the character to be recognized is read.
However, such an existing OCR has the problem that the characteristics of the dropout color and of the optical filter must be selected according to the means used to write the character. Furthermore, it completely fails to use the information inherent in the character frame, since the character frame printed in the dropout color is eliminated from the image data at reading time. For example, if the character frame were also recognized, the cutout of individual characters necessary for character identification could be executed quite easily. Yet the existing OCR cannot use the character frame information at all. This problem becomes especially serious when an OCR slip is printed in multiple dropout colors so that it carries further useful information.
Therefore, as disclosed in Japanese patent application No. 1988-220067, this applicant proposed finding the color information of each picture element of the image data of the original in terms of hue, saturation and value, separating achromatic colors from chromatic colors according to saturation, and then extracting a character according to the value of the achromatic color and multiple-color information according to the hue of the chromatic color.
The method in this proposal is an HSV type identification method, in which an original such as a slip is read as an RGB (red, green and blue) image signal that is converted to an HSV (hue, saturation, value) signal. After an achromatic color and a chromatic color are separated by a predetermined saturation threshold, the value of the achromatic color is used for a character identification and the chromatic color is outputted as a two value image, or a multiple value image corresponding to the number of hues separated by a plurality of predetermined thresholds.
In the HSV coordinate region shown in FIGS. 1A and 1B, a background, a character, and a character frame are indicated. That is, as shown in FIG. 1A, since the background and the character are each of an achromatic color, they come into an inner solid cylindrical region of the HSV coordinate system. Furthermore, since the character frame is of a chromatic color, it comes into an outer hollow cylindrical region. After the HSV conversion, these characteristics are used first to separate the background and the character, each being of an achromatic color, from the character frame of a chromatic color by a saturation threshold. Then the background and the character are separated from the separated achromatic color by a value threshold. Meanwhile, the chromatic color is identified as a multiple value image of a predetermined color number by the predetermined thresholds for separating the hue of the representative color used on a slip, as shown in FIG. 1B.
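The two-stage separation described above can be illustrated in software. The following is a minimal sketch only, not the circuit of the proposal: it assumes 8-bit RGB input, uses the Python standard library `colorsys` conversion as a stand-in for the HSV conversion, and the threshold values `T_S` and `T_V` are hypothetical, since no numeric values are given here.

```python
import colorsys

# Hypothetical thresholds on a 0..1 scale; the text gives no numeric values.
T_S = 0.25  # saturation: below this a picture element is treated as achromatic
T_V = 0.50  # value: below this an achromatic picture element is a character


def classify_pixel(r, g, b, t_s=T_S, t_v=T_V):
    """Classify an 8-bit RGB picture element as background, character or frame."""
    _h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if s >= t_s:
        # Chromatic color: outer hollow cylindrical region of the HSV space.
        return "frame"
    # Achromatic color: inner cylinder, split by value.
    return "character" if v < t_v else "background"


print(classify_pixel(250, 250, 245))  # near-white paper
print(classify_pixel(20, 20, 20))     # dark ink
print(classify_pixel(230, 60, 60))    # red frame line
```

In this sketch the saturation test is applied first, so a dark but strongly colored picture element is still treated as part of the frame rather than as a character.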
Hence, an image reading part IR for reading an original OR such as a colored slip and a color identification part CI are conventionally set as shown in FIG. 2. The image reading part IR comprises lenses L1, L2 and L3; a red color filter RF for passing a red light through; a green color filter GF for passing a green light through; a blue color filter BF for passing a blue light through; photoelectric elements of a picture element unit, such as charge coupled devices (CCDs) C1, C2 and C3, for detecting the red, green and blue light, respectively; amplifiers A1, A2 and A3; and A/D converters ADC1, ADC2 and ADC3 for converting an analog signal to a digital signal. The color identification part CI comprises an HSV conversion circuit CONV; comparison circuits COM1, COM2 and COM3; and multiplexers MUX1 and MUX2.
In FIG. 2, the lenses L1, L2 and L3; the red color filter RF; the green color filter GF; the blue color filter BF; the CCDs C1, C2 and C3; the amplifiers A1, A2 and A3; and the A/D converters ADC1, ADC2 and ADC3 of the image reading part IR configure three optical signal systems, one each for a red signal, a green signal and a blue signal. These systems read the original OR in an image signal line unit for three colors, convert the image signal in the line to a digital signal in one picture element unit, and transmit the red, green and blue signals to the HSV conversion circuit CONV via signal lines R1, G1 and B1. The HSV conversion circuit CONV converts the red, green and blue signals to S, V and H signals and outputs them.
Next, the S signal is inputted to a comparison circuit COM1 and compared with a predetermined threshold T.sub.S. As shown in FIG. 1A, this threshold T.sub.S is for separating an achromatic color from a chromatic color. If the S signal is smaller than this threshold T.sub.S, it is recognized as a background or a character; if the S signal is equal to or greater than this threshold T.sub.S, it is recognized as a character frame. The comparison circuit COM1 outputs e.g. "0" for an achromatic color if S<T.sub.S and "1" for a chromatic color if S≥T.sub.S.
Meanwhile, the V signal is inputted to a comparison circuit COM2 and compared with a predetermined threshold T.sub.V. This threshold T.sub.V is for distinguishing the value of the background from that of the character, both being of an achromatic color. If the V signal is smaller than this threshold T.sub.V, it is recognized as a black part, i.e. a character part; if the V signal is equal to or greater than this threshold T.sub.V, it is recognized as the background. The comparison circuit COM2 outputs e.g. "1" for a character if V<T.sub.V and "0" for a background if V≥T.sub.V. As described above, the achromatic color part of the output signal from this comparison circuit COM2 constitutes the background and the character. If S<T.sub.S, this part is selected by multiplexer MUX1, which is controlled by the output from the comparison circuit COM1, and is outputted "as is". The chromatic color part, however, constitutes the character frame. If S≥T.sub.S, the multiplexer MUX1 outputs "0" for this part, so that it is eliminated, much as if it had been read by using a dropout color. An output signal S.sub.4 from the multiplexer MUX1 is transmitted to a character recognition part at a later stage (not shown in the drawing).
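The character path through COM1, COM2 and MUX1 described above can be modeled as plain logic. This is a hedged sketch only: the function names are hypothetical, the S and V signals are assumed to be normalized to the range 0 to 1, and the thresholds used in the example calls are illustrative values, not values from the proposal.

```python
def com1(s, t_s):
    """Saturation comparison: 0 for an achromatic color, 1 for a chromatic color."""
    return 1 if s >= t_s else 0


def com2(v, t_v):
    """Value comparison: 1 for a character (dark), 0 for the background."""
    return 1 if v < t_v else 0


def mux1(chromatic_flag, character_bit):
    """Pass the character bit for achromatic input; force 0 for chromatic input."""
    return character_bit if chromatic_flag == 0 else 0


def character_signal(s, v, t_s, t_v):
    """Model of the output S.sub.4 sent on to character recognition."""
    return mux1(com1(s, t_s), com2(v, t_v))


# Illustrative thresholds (assumptions): T_S = 0.25, T_V = 0.5
print(character_signal(0.00, 0.10, 0.25, 0.5))  # dark achromatic: character
print(character_signal(0.02, 0.98, 0.25, 0.5))  # bright achromatic: background
print(character_signal(0.74, 0.30, 0.25, 0.5))  # dark chromatic: frame, dropped
```

The third call shows the dropout-like behavior: a chromatic picture element is forced to "0" even when its value alone would have marked it as a character.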
The H signal is inputted to a comparison circuit COM3 and compared with a predetermined series of thresholds T.sub.H. As shown in FIG. 1B, this series of thresholds T.sub.H comprises T.sub.1 through T.sub.5, which distinguish red from yellow, yellow from green, green from cyan, cyan from blue, and blue from magenta, respectively. If the H signal is smaller than threshold T.sub.1, it is recognized as red; if the H signal is equal to or greater than threshold T.sub.5, it is recognized as magenta. Accordingly, as inferred from FIG. 1B, if the character frame is of one color, namely, if a format is of one color, no distinction has to be made. However, if it is of two colors, e.g. when two kinds of forms need to be distinguished, one threshold is required. If it is of three colors, two thresholds are required. Similarly, if six colors need to be distinguished, five thresholds are required. The number and values of the thresholds depend on the number of colors used in the slip.
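The relation between the number of colors and the number of thresholds can be sketched as a sorted-threshold lookup. The hue values below (in degrees) and the function name are assumptions made for illustration; the text does not state numeric values for T.sub.1 through T.sub.5. The sketch follows the linear ordering of the text and does not handle the wraparound of hue at 360 degrees.

```python
from bisect import bisect_right

SIX_COLORS = ["red", "yellow", "green", "cyan", "blue", "magenta"]
# Hypothetical values for T1..T5 in degrees of hue (not taken from the source).
SIX_THRESHOLDS = [30, 90, 150, 210, 270]


def hue_class(h_deg, thresholds=SIX_THRESHOLDS, names=SIX_COLORS):
    """Return the color class of a hue; n classes need n - 1 sorted thresholds."""
    if len(names) != len(thresholds) + 1:
        raise ValueError("need exactly one more class name than thresholds")
    # bisect_right counts the thresholds that are <= h_deg, so a hue equal to
    # a threshold falls into the upper class ("equal to or greater than").
    return names[bisect_right(thresholds, h_deg)]


print(hue_class(10))    # below T1
print(hue_class(30))    # exactly T1
print(hue_class(300))   # at or above T5
# A two-color slip needs only one threshold:
print(hue_class(240, thresholds=[180], names=["red", "blue"]))
```

A one-color slip corresponds to an empty threshold list with a single class name, in which case every chromatic hue maps to that one class.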
Unlike multiplexer MUX1, multiplexer MUX2 outputs "as is" the output from the comparison circuit COM3, i.e. the character frame information, when "1" is transmitted from the comparison circuit COM1, and outputs "0" when "0" is transmitted from the comparison circuit COM1, i.e. in the case of an achromatic color where S<T.sub.S.
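The frame path through MUX2 mirrors the character path with the selection inverted. In the sketch below, the function names are hypothetical, and it is assumed that the hue codes from COM3 start at 1 so that "0" can stand for "no frame"; the text does not specify the encoding of the COM3 output.

```python
def mux2(chromatic_flag, hue_code):
    """Pass the hue code for chromatic input; force 0 for achromatic input."""
    return hue_code if chromatic_flag == 1 else 0


def frame_signal(s, hue_code, t_s):
    """Model of the frame output: hue code if S >= T_S, otherwise 0."""
    chromatic_flag = 1 if s >= t_s else 0  # same decision as COM1
    return mux2(chromatic_flag, hue_code)


# Illustrative threshold (assumption): T_S = 0.25
print(frame_signal(0.80, 3, 0.25))  # chromatic: hue code 3 passes through
print(frame_signal(0.05, 3, 0.25))  # achromatic: suppressed to 0
```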
When the image reading part IR actually reads the original OR in color with a flat bed type scanner, a position difference of about two picture elements or two lines ordinarily arises among the three colors (e.g. red, green and blue). For instance, where the RGB gradations are almost equal, the picture element is of an achromatic color; but at a character edge, for example, the color becomes a chromatic color because the color balance is lost. A chromatic color, on the other hand, does not change to an achromatic color, although its color may change e.g. at a character edge. Hence, character frame information separated as a chromatic color by the saturation threshold according to the method shown in FIGS. 1A and 1B is contaminated by the noise of another chromatic color, e.g. at a character edge, and is hard to use "as is" as character cutout information. Also, due to the color blur, it is difficult to extract a form of a specific color.
That is, in the processing method of the above proposal, as the original moves during reading by a flat bed scanner or the like, a difference of one picture element or of a few picture elements arises among the colors in the main scanning line direction and the sub scanning line direction. This causes a loss of color balance at an edge part of an image, resulting in problems such as an edge part of an achromatic color being changed to a chromatic color or an edge part of a chromatic color being changed to another chromatic color. Hence, a character and a background are not properly separated, and character frames of different colors are not properly separated.
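The edge effect described above can be reproduced with a small simulation. This is an illustrative sketch under stated assumptions, not a model of any particular scanner: one scan line of a black stroke on a white background is used, only the red channel is displaced, the displacement is two picture elements, and the saturation threshold of 0.25 is hypothetical.

```python
import colorsys

WHITE, BLACK = (255, 255, 255), (0, 0, 0)


def shift_channel(line_rgb, channel, offset):
    """Displace one color channel by `offset` picture elements (edges repeated)."""
    n = len(line_rgb)
    shifted = []
    for i in range(n):
        src = min(max(i - offset, 0), n - 1)  # clamp at the ends of the line
        px = list(line_rgb[i])
        px[channel] = line_rgb[src][channel]
        shifted.append(tuple(px))
    return shifted


def chromatic_positions(line_rgb, t_s=0.25):
    """Indices whose saturation is at or above the (hypothetical) threshold."""
    return [
        i
        for i, (r, g, b) in enumerate(line_rgb)
        if colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)[1] >= t_s
    ]


# One scan line: white background, a black stroke 4 picture elements wide.
line = [WHITE] * 4 + [BLACK] * 4 + [WHITE] * 4
misregistered = shift_channel(line, channel=0, offset=2)

print(chromatic_positions(line))           # perfectly registered line
print(chromatic_positions(misregistered))  # red channel displaced by 2
```

With the red channel displaced, the leading edge of the stroke reads as red and the trailing edge as cyan, so purely achromatic content produces chromatic picture elements that a saturation threshold alone would pass on as frame information.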
Therefore, this applicant disclosed a new method of processing color image data for eliminating a chromatic color occurring at an achromatic color edge part in color image data, in the previously filed Japanese patent application No. 1988-275405.
The present invention is made based on the above background.