Conventionally, still image data is often compressed by a method using discrete cosine transform or a method using Wavelet transform. Encoding of this type is variable-length encoding, and hence the code amount changes for each image to be encoded.
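The dependence of the code amount on image content can be illustrated with a minimal sketch. It uses DEFLATE from Python's standard zlib module as a stand-in for a variable-length encoder, not the DCT-based or wavelet-based methods mentioned above, and the image size and pixel contents are hypothetical values chosen for illustration only.

```python
import random
import zlib

# Hypothetical 8-bit grayscale image dimensions (assumption for illustration).
WIDTH, HEIGHT = 64, 64

# Two images with the same number of pixels but very different content.
flat_image = bytes(WIDTH * HEIGHT)  # every pixel is 0
rng = random.Random(0)              # fixed seed so the run is repeatable
noisy_image = bytes(rng.randrange(256) for _ in range(WIDTH * HEIGHT))

# DEFLATE is a variable-length code: output size depends on the input content.
flat_code = zlib.compress(flat_image)
noisy_code = zlib.compress(noisy_image)

print("flat :", len(flat_image), "->", len(flat_code), "bytes")
print("noisy:", len(noisy_image), "->", len(noisy_code), "bytes")
```

Both inputs occupy the same number of bytes before encoding, yet the encoded sizes differ widely, which is why the code amount of a page cannot be known before it is encoded.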
According to JPEG encoding, which is an international standardization scheme, only one quantization matrix can be defined for an image, and it is difficult to make the encoded data of one image (document) fall within a target code amount without a prescan. When JPEG encoding is used in a system which stores data in a limited memory, a memory overflow may occur.
In order to prevent this, conventional schemes used, for example, a method of re-reading the same document upon changing the compression ratio parameter when the actual code amount exceeds an expected code amount, or a method of estimating a code amount in advance by prescan and re-setting quantization parameters to adjust the code amount.
As described above, a prescan and an actual scan are generally executed, so a document must be read at least twice, which is inefficient. In particular, when a copying machine encodes a document of a plurality of sheets (pages) while successively reading it page by page with an ADF (Auto Document Feeder), reading the same document twice is impractical in terms of processing time.
The assignee of the present application has proposed a technique which eliminates the need for two separate operations, prescan and actual scan, and encodes one entire image using a common encoding parameter so that the encoded data falls within a target encoded data amount (e.g., Japanese Patent Laid-Open No. 2003-8903). According to this technique, encoded data are sequentially stored in two memories during one image input operation (for one page). When the amount of encoded data in a predetermined one of the memories exceeds a predetermined size during this operation, the data in that memory is discarded, the current encoding parameter is updated to a new encoding parameter which increases the compression ratio, and encoding of the not-yet-encoded part of the image data continues (encoded data obtained at this time is defined as the first encoded data). Meanwhile, the encoded data obtained before the compression ratio was increased remain stored in the other memory, and these encoded data are re-encoded in accordance with the updated parameter. As a result, encoded data identical to those obtained by encoding the data with the new parameter from the beginning can be attained (encoded data obtained by re-encoding is defined as the second encoded data). The first and second encoded data are concatenated to obtain data (complying with JPEG encoding) which is encoded with a common encoding parameter (the updated encoding parameter) for one entire image (of one page). In addition, the encoded data amount can be kept within the target encoded data amount.
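The control flow described above can be sketched as follows. This is a simplified, sequential simulation under stated assumptions and not the implementation of the cited publication: the toy codec (integer division of non-negative samples by a step q), the size model (sum of coefficient bit lengths), the doubling of q on overflow, and all function names are hypothetical. With this toy codec, re-quantizing stored codes by 2 yields exactly the codes that encoding from the beginning at step 2q would produce, which mirrors the property stated in the text.

```python
def encode_block(block, q):
    """Toy lossy encoder (assumption): quantize non-negative samples with step q."""
    return [v // q for v in block]


def reencode_block(code, factor):
    """Re-quantize already encoded data; the source image is not read again."""
    return [c // factor for c in code]


def code_size(codes):
    """Toy size model (assumption): bits needed for every stored coefficient."""
    return sum(c.bit_length() for code in codes for c in code)


def encode_page(blocks, target_size, q=1):
    """Encode one page in a single pass; target_size must be >= 0."""
    counted = []   # memory whose size is monitored against the target
    retained = []  # other memory, kept so earlier data can be re-encoded
    for block in blocks:
        code = encode_block(block, q)
        counted.append(code)
        retained.append(code)
        while code_size(counted) > target_size:
            counted.clear()                       # discard the overflowing data
            q *= 2                                # raise the compression ratio
            retained = [reencode_block(c, 2) for c in retained]
            counted = list(retained)              # re-encoded data rejoin the output
    return q, counted


# Hypothetical page made of three small blocks of 8-bit samples.
blocks = [[10, 200, 3], [255, 7, 64], [90, 90, 90]]
final_q, codes = encode_page(blocks, target_size=12)
print(final_q, codes, code_size(codes))
```

In this simulation the re-encoding happens immediately, so the concatenation of the first and second encoded data collapses into a single list. The result is the same as encoding every block with the final parameter from the start, while the source blocks are read only once.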
Compression encoding in the conventional technique adopts only a lossy compression technique such as JPEG.
According to this code amount control technique, when the encoded data amount generated during encoding of a 1-page image exceeds a predetermined size, a process equivalent to uniformly increasing the compression ratio for the entire page is executed, and the image quality may partially degrade more than expected. This problem becomes more serious when compressing an image containing a character-line image.
A lossless encoding technique called “JPEG-LS” is known. Although its name is prefixed with “JPEG”, its encoding algorithm is completely different from that of general lossy JPEG. JPEG-LS is known to achieve a lower compression ratio than JPEG for natural images, but to be able to losslessly encode character-line images and computer graphics at a higher compression ratio.
Considering this, when a document image containing both photographic and character-line images in one page is to be compressed, lossless compression should be applied to a character-line image part as much as possible.