1. Field of the Invention
The present invention relates to a method, a device and a computer readable medium storing a program for processing input data concerning generation of an electronic file.
2. Description of the Prior Art
Recently, MFPs (Multi Function Peripherals) have shifted from models that process only monochrome images to models capable of color image processing (color MFPs).
Such a color MFP usually has a function of attaching image data of an original read (scanned) by its scanner to an electronic mail message and transmitting the message directly from the MFP.
However, if an A4 original is scanned as 300 dpi full-color image data, the quantity of data is approximately 25 MB, which is too large for transmission by electronic mail.
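The figure of approximately 25 MB can be checked with a short calculation. The following is a minimal sketch that assumes an uncompressed image at 3 bytes per pixel (8 bits each for R, G and B) and the standard A4 dimensions of 210 mm x 297 mm; the function name is hypothetical.

```python
MM_PER_INCH = 25.4

def uncompressed_size_bytes(width_mm, height_mm, dpi, bytes_per_pixel):
    """Size of an uncompressed raster scan of a sheet at the given resolution."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bytes_per_pixel

# A4 (210 mm x 297 mm) at 300 dpi, 24-bit color (assumed 3 bytes per pixel).
size_bytes = uncompressed_size_bytes(210, 297, 300, 3)
size_mib = size_bytes / 2**20

print(size_bytes)          # 26099520 bytes (2480 x 3508 pixels x 3 bytes)
print(round(size_mib, 1))  # 24.9, i.e. roughly the 25 MB stated above
```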
Therefore, the image data of the scanned original are usually compressed before being transmitted. However, lossy compression is necessary in order to compress the image data at a compression ratio high enough to yield a quantity of data suitable for transmission by electronic mail. If the compression is performed at a high compression ratio so as to reduce the quantity of data, some characters in the image may become too blurred to read. Avoiding this requires lowering the compression ratio, but then it is difficult to reduce the quantity of data sufficiently.
Therefore, a function called high compression PDF (compact PDF) is conventionally used. In this function, instead of using the same compression method for the whole image data, different compression methods are used for different areas to be processed. Thus, it is possible to generate a PDF file having a small quantity of data (file size) while the legibility of characters is secured.
Specifically, areas (objects) are extracted word by word or line by line, in accordance with a predetermined rule, from image data of an original scanned by an MFP or the like, and each area is determined to be either a character area (character object) including characters or a non-character area (non-character object) including no characters.
For every character object, one typical color is set based on, for example, the color of the characters included in the character object. Then, character objects having similar typical colors are integrated into one character object. Alternatively, character objects are integrated based on the distance between them or on the increase in unnecessary blank pixels that would be generated if the character objects were integrated.
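The integration by typical color can be illustrated as follows. This is only a sketch under stated assumptions: the greedy grouping order, the Euclidean distance in RGB, the threshold value and all names (CharObject, integrate_by_color, max_color_distance) are hypothetical and are not specified by the conventional method described here.

```python
from dataclasses import dataclass
from math import dist

@dataclass
class CharObject:
    box: tuple    # bounding box as (left, top, right, bottom)
    color: tuple  # typical color as (R, G, B)

def merge_boxes(a, b):
    """Smallest bounding box that contains both input boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def integrate_by_color(objects, max_color_distance=30.0):
    """Greedily merge objects whose typical colors are close (assumed rule)."""
    merged = []
    for obj in objects:
        for i, group in enumerate(merged):
            if dist(obj.color, group.color) <= max_color_distance:
                # Keep the group's typical color and grow its bounding box.
                merged[i] = CharObject(merge_boxes(group.box, obj.box), group.color)
                break
        else:
            merged.append(CharObject(obj.box, obj.color))
    return merged

objects = [
    CharObject((0, 0, 10, 10), (0, 0, 0)),     # black text
    CharObject((20, 0, 30, 10), (5, 5, 5)),    # nearly black text
    CharObject((0, 20, 10, 30), (255, 0, 0)),  # red text
]
result = integrate_by_color(objects)
print(len(result))  # 2: the two near-black objects were integrated
```

Note that the number of objects left after such an integration depends entirely on the colors present in the input, not on any fixed limit.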
Further, the image data of the part corresponding to the character object are binarized while a high resolution is maintained. The non-character object is not binarized, so that its gradation is preserved; instead, its resolution is reduced, and it is then compressed at a high compression ratio.
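The different treatment of the two kinds of objects can be sketched as below, using grayscale pixel values in nested lists. The fixed threshold of 128 and the 2x2 averaging used to halve the resolution are assumptions made for illustration; the actual binarization rule and resolution reduction are not specified here, and the subsequent compression step is omitted.

```python
def binarize(gray, threshold=128):
    """Character object: keep full resolution, reduce each pixel to 1 bit.

    Pixels darker than the (assumed) threshold become 1 (ink), others 0.
    """
    return [[1 if p < threshold else 0 for p in row] for row in gray]

def downsample_2x(gray):
    """Non-character object: keep gradation, halve resolution by 2x2 averaging."""
    height = len(gray) // 2 * 2
    width = len(gray[0]) // 2 * 2
    return [
        [
            (gray[y][x] + gray[y][x + 1] + gray[y + 1][x] + gray[y + 1][x + 1]) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]

char_area = [[0, 200], [130, 100]]
photo_area = [[0, 100], [200, 100]]

print(binarize(char_area))        # [[1, 0], [0, 1]]
print(downsample_2x(photo_area))  # [[100]]
```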
In addition, there is another method for reducing a file size as described below.
A color space is divided, in advance, into a predetermined number of blocks, and each character object is assigned to the block having a similar color. Thus, the number of objects is limited to at most the predetermined number, so that an increase of the file size can be suppressed.
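A minimal sketch of this block assignment follows. It assumes an RGB color space divided uniformly into 4 levels per channel (64 blocks); the uniform grid, the level count and the names block_index and assign_to_blocks are hypothetical choices for illustration.

```python
def block_index(color, levels=4):
    """Index of the block containing an (R, G, B) color, 0 .. levels**3 - 1."""
    step = 256 // levels
    r, g, b = (min(c // step, levels - 1) for c in color)
    return (r * levels + g) * levels + b

def assign_to_blocks(typical_colors, levels=4):
    """Group character objects (given by typical color) by color-space block."""
    blocks = {}
    for color in typical_colors:
        blocks.setdefault(block_index(color, levels), []).append(color)
    return blocks

colors = [(0, 0, 0), (10, 20, 30), (250, 250, 250), (255, 255, 255)]
blocks = assign_to_blocks(colors)
print(sorted(blocks))  # [0, 63]: four objects fall into two blocks
```

However many objects are supplied, the number of resulting groups can never exceed levels**3, which is what bounds the file size.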
In addition, it is also possible to reduce the file size by reducing the number of colors that are used in the image data using conventional methods described below.
According to the first method, described in Japanese unexamined patent publication No. 10-74248, colors are reduced by using a look-up table storing color conversion information so as to assign the color of each pixel of an input image to one of a limited set of colors.
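The general idea of look-up-table color reduction can be sketched as follows. This is not the procedure of the cited publication, whose details are not given here; the four-color palette, the 3-bit-per-channel table key and the nearest-color rule used to fill the table are all assumptions for illustration.

```python
# Hypothetical limited palette: black, white, red, blue.
PALETTE = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]

def nearest(color, palette):
    """Palette entry with the smallest squared RGB distance to the color."""
    return min(palette, key=lambda p: sum((a - b) ** 2 for a, b in zip(color, p)))

def build_lut(palette, bits=3):
    """Precompute a table from coarse (r, g, b) keys to palette colors."""
    shift = 8 - bits
    half = (1 << shift) // 2
    lut = {}
    for r in range(1 << bits):
        for g in range(1 << bits):
            for b in range(1 << bits):
                # Use the center of each coarse cell as its representative.
                center = tuple((v << shift) + half for v in (r, g, b))
                lut[(r, g, b)] = nearest(center, palette)
    return lut

def reduce_colors(pixels, lut, bits=3):
    """Replace every pixel by the limited color stored in the table."""
    shift = 8 - bits
    return [lut[(r >> shift, g >> shift, b >> shift)] for r, g, b in pixels]

lut = build_lut(PALETTE)
reduced = reduce_colors([(10, 10, 10), (250, 250, 250), (200, 20, 30)], lut)
print(reduced)  # [(0, 0, 0), (255, 255, 255), (255, 0, 0)]
```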
According to the second method, described in Japanese unexamined patent publication No. 6-175633, the ratio of each color is calculated from the R, G and B color data of multicolor input image data, the data are divided into groups corresponding to the desired number of colors, and a typical color is determined. A color area having a plurality of typical colors is regarded as an unknown area, and whether or not the unknown area is a pattern is estimated from, for example, a difference between the color of an edge or a center portion of the unknown area and the color of a peripheral portion. If it is a pattern, it is decided to be a different color from the peripheral portion. If it is not a pattern, it is decided to be the same color as the peripheral portion.
As described above, the conventional method integrates objects in accordance with the typical color of each object, the distance between objects, and the increase in unnecessary blank pixels that would be generated by the integration. The objects that remain after the integration are then used for generating a file. In this method, the number of objects may not be reduced sufficiently, because the number of objects that remain after the integration depends on the type of the original or the like. In this case, the file size may be increased.
In addition, when the objects are integrated, the conventional method compares an attribute, such as the typical color, of every object with that of each of the other objects. Therefore, if the number of objects is huge, the comparison process may take a lot of time.
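The cost of comparing every object with each of the others can be made concrete: for n objects there are n(n-1)/2 distinct pairs, so the work grows quadratically. The short sketch below, with a hypothetical function name, counts those pairs.

```python
from math import comb

def pairwise_comparisons(num_objects):
    """Number of distinct object pairs, n * (n - 1) / 2."""
    return comb(num_objects, 2)

for n in (10, 1000, 10000):
    print(n, pairwise_comparisons(n))
# 10 objects -> 45 pairs, 1000 -> 499500, 10000 -> 49995000
```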
In addition, the method of compressing the non-character object at a high compression ratio while maintaining a high resolution of the image data corresponding to the character object requires that the character portions be extracted from the image data of the scanned original. In other words, an area including characters must be determined to be a character area. If the determination is incorrect, the image data of the area are compressed at a high compression ratio after the resolution thereof is reduced, so the characters in the area may become unreadable. In practice, because distinguishing a character area from a non-character area is difficult, there is a possibility that the characters cannot be read.
According to the method of dividing the color space into a predetermined number of blocks in advance, if an object includes a character whose color lies on or near a boundary between blocks, the color of the character is split between the blocks on either side of the boundary, so unevenness of colors may be generated.
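The boundary effect can be reproduced with a small sketch. It assumes a uniform division into 4 levels per RGB channel and that each block is rendered with its center color; both assumptions, and the function names, are hypothetical.

```python
def block_center(value, levels=4):
    """Center value of the block that contains one 8-bit channel value."""
    step = 256 // levels
    index = min(value // step, levels - 1)
    return index * step + step // 2

def quantize(color, levels=4):
    """Replace an (R, G, B) color by the center color of its block."""
    return tuple(block_center(c, levels) for c in color)

# Two pixels of the same character whose colors differ by 1 per channel
# but lie on opposite sides of a block boundary (63 | 64).
print(quantize((63, 63, 63)))  # (32, 32, 32)
print(quantize((64, 64, 64)))  # (96, 96, 96)
```

A difference of 1 per channel in the input becomes a difference of 64 per channel in the output, which appears as uneven color within a single character.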
According to the first method described above, the process of reducing colors is realized by assigning one of the limited colors to each pixel. In this method too, unevenness of colors may be generated at or near a boundary between a group of pixels to which a certain limited color is assigned and another group of pixels to which another limited color is assigned. Even if the second method described above is used, unevenness of colors may be generated for the same reason.