Conventionally, still image data is often compressed by a method using the discrete cosine transform or a method using the wavelet transform. Encoding of this type is variable-length encoding, and hence the encoded data amount changes for each image to be encoded.
According to JPEG encoding, which is an international standard, only one quantization matrix can be defined for an image, and it is difficult to make the encoded data of one image (document) fall within a target encoded data amount without a pre-scan. When JPEG encoding is used in a system which stores data in a limited memory, a memory overflow may occur.
In order to prevent this, there is a method of re-reading the same document after changing the compression ratio parameter when the actual encoded data amount exceeds an expected encoded data amount. A method has also been proposed in which the encoded data amount is estimated in advance by a pre-scan and the quantization parameters are re-set to adjust the encoded data amount.
As described above, a pre-scan and an actual scan are generally executed, which means that a document must be read at least twice, and this is inefficient. Especially when a copying machine encodes a document of a plurality of sheets (pages) while successively reading it page by page with an ADF (Auto Document Feeder), reading the same document twice is not feasible in terms of processing time.
There is known a technique that eliminates the need for separate pre-scan and actual scan operations and encodes one entire image using a common encoding parameter so that the encoded data fall within a target encoded data amount (Japanese Patent Laid-Open No. 2003-8903 (corresponding US Pre-Grant Publication No. AA2003002743); to be referred to as reference 1 hereinafter). According to this technique, encoded data are sequentially stored in two memories during one image input operation (for one page). When the amount of encoded data in a predetermined one of the memories exceeds a predetermined size during this operation, the data in that memory are discarded, and the current encoding parameter is updated to a new encoding parameter that increases the compression ratio. Encoding of the not-yet-encoded part of the image data then continues with the updated encoding parameter (the encoded data obtained at this time are defined as the first encoded data). Meanwhile, the encoded data obtained before the compression ratio was increased remain stored in the other memory. These encoded data are re-encoded in accordance with the updated parameter, yielding encoded data identical to those that would have been obtained by encoding with the new parameter from the beginning (the encoded data obtained by re-encoding are defined as the second encoded data). The first and second encoded data are concatenated. As a result, data (complying with JPEG encoding) encoded with a common encoding parameter (the updated encoding parameter) for one entire image (of one page) can be obtained. In addition, the encoded data amount can be kept within the target encoded data amount.
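The control flow described for reference 1 can be illustrated with a minimal sketch. It is not the implementation of reference 1: the toy quantizer (integer division by a step), the bit-count model in `code_size`, the doubling of the step on overflow, and all function and parameter names are assumptions made purely for illustration. Input samples are assumed to be non-negative integers, and the two memories are processed sequentially rather than in parallel.

```python
def code_size(symbols):
    """Toy variable-length cost model (an assumption): bits per symbol."""
    return sum(s.bit_length() + 1 for s in symbols)


def encode(samples, step):
    """Toy lossy encoder: quantize non-negative samples by `step`."""
    return [x // step for x in samples]


def reencode(symbols, factor):
    """Re-quantize already encoded symbols by an integer factor.

    For non-negative integers, (x // a) // b == x // (a * b), so the
    result matches encoding the original samples at the coarser step.
    """
    return [s // factor for s in symbols]


def encode_page(samples, target_bits, step=1, block=4, max_step=1 << 16):
    """Encode one page in a single pass while keeping the code size
    within `target_bits`, using a monitored memory and a backup memory."""
    first = []   # monitored memory, discarded on overflow
    second = []  # backup memory, source for re-encoding
    for i in range(0, len(samples), block):
        coded = encode(samples[i:i + block], step)
        first.extend(coded)
        second.extend(coded)
        while code_size(first) > target_bits:
            if step >= max_step:
                raise ValueError("target amount cannot be reached")
            first.clear()                 # discard the overflowing data
            step *= 2                     # raise the compression ratio
            second = reencode(second, 2)  # re-encode the earlier data
            first.extend(second)          # restore at the new parameter
    return step, first


if __name__ == "__main__":
    page = [200, 13, 255, 7, 90, 180, 33, 64]
    final_step, data = encode_page(page, target_bits=40)
    print(final_step, data, code_size(data))
```

In this sketch the whole page ends up encoded with one common step, and the result is identical to encoding the page at that step from the beginning, which is the property the paragraph above attributes to the concatenated first and second encoded data.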
Unlike the technique of executing re-encoding in accordance with the encoded data amount in the process of encoding, as described in reference 1, there is also known a technique of keeping an entire image at a predetermined encoded data amount while selectively applying either of lossless encoding and lossy encoding to a plurality of areas in an image (e.g., Japanese Patent Laid-Open No. 10-224640 (corresponding U.S. Pat. No. 6,067,382)).
Compression encoding in reference 1 adopts only a lossy compression technique such as JPEG.
According to the encoded data amount control technique in reference 1, when the encoded data amount generated during encoding of a 1-page image exceeds a predetermined size, a process equivalent to uniformly increasing the compression ratio for the entire page is executed. This may degrade the image quality in some parts more than expected. The problem becomes more serious when compressing an image containing a character/line image.
There is known a lossless encoding technique called “JPEG-LS”. Although its name is prefixed with “JPEG”, its encoding algorithm is completely different from that of general lossy JPEG. JPEG-LS is known to achieve a lower compression ratio than JPEG for natural images, but to be able to losslessly encode character/line images and computer graphics at a higher compression ratio.
Considering this, when a document image containing both photographic and character images in one page is to be compressed, lossless compression should be applied to a character/line image part as much as possible.