1. Field of the Invention
The present invention relates to a color facsimile machine that uses JPEG, based on ITU-T standards, as its encoding system, and to a contents distribution service having a color facsimile transmission function.
2. Description of the Prior Art
In general, facsimile transmission has the advantage of providing necessary information urgently and rapidly. To make this advantage more useful, sender information indicating the information source and the time at which the information was provided is attached to the original being sent.
A facsimile machine that uses a sheet feed scanner and has no memory for storing the whole image transmits data while reading the original and creating its image data. Therefore, the sender information is added before encoding, and the image data and the sender information are then encoded integrally and sent.
There are some cases where image data should once be stored in the memory and the stored data should be sent some time later, such as cases for predetermined time transmission, re-dialing transmission, contents distribution service and the like. Transmission of this type will be called ‘memory transmission’. There are four ways of memory transmission as follows:
(Method 1)
This method is implemented by first scanning the image data and storing it directly into the memory without coding (compressing). When memory transmission is actually performed, the sender information is added to the image data, and the image data is then encoded integrally with the sender information and transmitted.
(Method 2)
This method is implemented by scanning the image data, encoding it beforehand and storing the coded data into the memory, and decoding the coded data stored in the memory to restore the original data when memory transmission is actually started, then adding the sender information at the head of the restored data and again encoding the integrated data and transmitting it.
(Method 3)
This method is implemented by scanning the image data, adding the sender information to the scanned data immediately after scanning on the transmission side, then compressing the combined data, storing it in the memory and later transmitting it.
(Method 4)
This method is implemented by scanning image data, encoding it beforehand and storing the coded data into the memory, and then also encoding the sender information when memory transmission is actually started, then merging the two sets of coded data and transmitting it.
However, Methods 1, 2 and 3 above have the following problems:
The problem with Method 1: this method needs a high-capacity memory for temporarily storing uncompressed image data.
The problem with Method 2: this method takes a long time because the data must be coded, decoded and coded again.
The problem with Method 3: the time of transmission included in the sender information is attached at the time of image scanning and therefore represents a past time, so that it is impossible to transmit the exact time of transmission.
In contrast, merging the coded data of the scanned image and the coded data of the sender information according to Method 4 avoids the above three problems. However, because the image data and the sender information are coded independently, the image to be transmitted second cannot be encoded by making use of its correlation with the image to be transmitted first. Therefore, encoding based on Method 4 is subject to some limitations.
In connection with the above, the coding methods available for monochrome transmission include MH (Modified Huffman), MR (Modified Read), MMR (Modified Modified Read) and JBIG (Joint Bi-level Image Experts Group).
MH is a coding scheme in which the run-lengths of 'white' runs and 'black' runs in each line are coded by Huffman codes and a line synchronizing signal EOL is added at the end of each coded line.
MR is an improved MH coding, in which data is coded by using the correlation with the previous line in order to enhance the compressibility. That is, the first line is coded based on MH, the data from the second to K-th line is coded making use of the correlation with previous lines. Then, the data on the (K+1)-th line is coded by MH, and this cycle is repeated. This number ‘K’ is called the ‘K-parameter’. The MR coding also uses the line synchronizing signal EOL.
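The K-parameter cycle described above can be illustrated with a minimal sketch. The function name `mr_line_modes` and the labels "MH" and "2D" are hypothetical and chosen only for this illustration; the sketch shows which lines are coded independently and is not an implementation of MR coding itself.

```python
def mr_line_modes(num_lines, k):
    """Return the coding mode of each line under MR with K-parameter k.

    "MH" marks a line coded independently; "2D" marks a line coded
    using the correlation with the previous line.
    """
    # The first line of every group of k lines is coded by MH;
    # the remaining k - 1 lines refer to the line before them.
    return ["MH" if i % k == 0 else "2D" for i in range(num_lines)]


# With K = 2, every other line can be decoded on its own.
print(mr_line_modes(6, 2))
```

Each "MH" line is a point at which decoding can resume independently, which is why separately coded data can be joined there.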
Since MH codes each line independently and MR codes one line independently in every K lines, part of the image can be encoded and decoded independently. Because these schemes were originally intended for use on lines without error correction, a transmission error causes only partial irregularities in the decoded image and does not affect the whole image.
In contrast, because MMR is an encoding scheme having an infinite K (K=∞) and JBIG is a Markov model coding scheme, these schemes need reference pixels for encoding. Accordingly, it is impossible for MMR and JBIG to code and decode part of an image independently. Therefore, in these coding schemes, the error correction mode (ECM) is essential.
As understood from the above description, among the encoding schemes for monochrome transmission, only MH and MR can realize memory transmission with sender information attached based on Method 4.
In the field of color facsimile technologies, JPEG (Joint Photographic Experts Group) has been adopted as a standard coding scheme. As far as the recommendations are concerned, the coding scheme for color facsimile is treated in the same way as that for monochrome facsimile, so JPEG is also one of the coding scheme options. In JPEG, the DC component and the AC components are coded in the order mentioned. Since the DC component of a pixel block has a strong correlation with that of the previous pixel block, the difference between them is encoded. As a countermeasure against transmission failure due to image disturbance resulting from transmission errors, and to allow random access, restart markers for initializing the DC component are provided and can be used.
Also in JPEG, if data coding is effected by inserting a restart marker immediately after the right end block of the image, the image can be separated vertically into two parts, so that it becomes possible to merge sets of coded data in the same manner as in the MH and MR schemes for monochrome.
As understood from the above, attachment of sender information upon memory transmission in color facsimile may be performed based on a JPEG scheme using restart markers. This has been disclosed in Japanese Patent Application Laid-Open Hei 11 No.313210.
Attachment of sender information at the leading side of the image using the restart marker, however, presents the problem of lowering the encoding efficiency. In order to clarify why this problem occurs, the JPEG algorithm and the JPEG data format will be described first.
A color image is composed of three components, each having tones, so that the amount of data is bulky compared to that of a monochrome binary image of the same size. For example, a color image of 256 tones is composed of a data amount 24(=3×8) times as large as that of a monochrome binary image. Therefore, an efficient compression scheme is desired for transmission.
JPEG which is most prevalent as a compression method for natural color images is used in many applications such as for digital cameras, personal computers, the internet, etc.
In the field of color facsimile, JPEG is used as the standard coding scheme though the CIELAB color space is used while other applications use the YCbCr color space which permits linear transformation from the RGB space.
Further, in facsimile transmission there are cases where the number of lines (the line count) of the input image is not known beforehand because the original is input through a sheet feed type scanner. Therefore, the line count is permitted to be placed at the end of the compressed data.
JPEG is short for the Group ‘Joint Photographic Experts Group’, for jointly producing standards for coding still images, working on both ISO and CCITT(now ITU-T) standards. At present, however, it mainly indicates the coding scheme and coded file format defined by this group.
There are a number of JPEG algorithms. The one that is adopted for color facsimile is the same as that used in many other applications and is called the JPEG baseline algorithm.
FIG. 1 is a schematic diagram showing the coding and decoding scheme based on the JPEG baseline algorithm. Referring next to FIG. 1, the coding and decoding of the JPEG baseline algorithm and the functions of the blocks will be briefly described.
On the coding side 600, the original image data(CIELAB) 601 obtained from an original image after color transformation is subjected to subsampling at a subsampling portion 602. Then, each 8×8 block for luminance and chrominance is transformed by the DCT at a DCT portion 603. Then, the DCT coefficients determined by the DCT portion 603 are quantized at a quantizer 604. Huffman codes are assigned to the quantized DC component and AC components at a Huffman encoder 605. Other than the compressed data, the JPEG data includes parameters required for decoding such as information etc., for creating a quantization table T1 and a Huffman table T2. Therefore, a control code adder 606 is provided to add the parameters of quantization table T1 and Huffman table T2 having been used during data compression in quantizer 604 and Huffman encoder 605, whereby JPEG data 607 to be transmitted is produced.
The data processing on the decoding side is basically performed by reverse operations of the coding. First, at a control code adder 606b, necessary parameters such as quantization table T1, Huffman table T2 and the like are extracted from JPEG data 607 so as to make a preparation for decoding of the compressed data. Then, the compressed data is subjected to the Huffman decoding, inverse quantization, inverse DCT and interpolation, on the basis of the parameters, through a Huffman decoder 605b, inverse quantizer 604b, inverse DCT portion 603b and interpolation portion 608 in the order mentioned, whereby a decoded image 601b is obtained.
It should be noted that in JPEG, since some information losses will take place through quantization, the decoded image does not completely agree with the original image. This kind of process is called irreversible coding.
Once the data based on JPEG is damaged, it is impossible to restore the data. Therefore, ECM is essential for facsimile transmission and reception.
The reason subsampling is performed at subsampling portion 602 is that the human eye is less sensitive to spatial variations in chrominance than to spatial variations in luminance. That is, only the chrominance resolution is reduced, while the luminance resolution is left as is. This process not only compresses the data but also reduces the amount of computation, because fewer blocks are subjected to the DCT process described below.
In color facsimile, subsampling of the chrominance data a*, b* with a ratio of 4:1:1 is basically performed by taking the average of four pixels to reduce the vertical and horizontal resolutions to half. Therefore, four blocks of luminance data L* correspond to one block of chrominance data a* and one block of chrominance data b*.
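The four-pixel averaging can be sketched as follows. This is a minimal illustration that assumes a chrominance plane held as a list of rows with even width and height; the function name `subsample_2x2` is hypothetical.

```python
def subsample_2x2(plane):
    """Average each 2x2 group of samples, halving both dimensions.

    Assumes len(plane) and len(plane[0]) are both even.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


# A 16x16 chrominance area shrinks to a single 8x8 block, while the same
# area of luminance still occupies four 8x8 blocks.
a_star = [[128] * 16 for _ in range(16)]
reduced = subsample_2x2(a_star)
print(len(reduced), len(reduced[0]))
```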
The DCT at DCT portion 603 is a kind of orthogonal transform. As shown in FIG. 2, output from one 8×8 block P(x,y) for one component of an original image is one 8×8 DCT coefficient block F(u,v). The value at the upper left in a DCT coefficient block F(u,v), i.e., F(0,0) indicates its DC component and other values in the block represent AC components.
The specific equation of the transformation in eight bit mode in the JPEG baseline algorithm is expressed as follows:
$$f(x,y) = P(x,y) - 128$$

$$F(u,v) = \frac{1}{4}\,C(u)\,C(v)\sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y)\,\cos\frac{(2x+1)u\pi}{16}\,\cos\frac{(2y+1)v\pi}{16} \qquad (u, v, x, y = 0 \text{ to } 7)$$

$$C(0) = \frac{1}{\sqrt{2}}, \qquad C(n) = 1 \quad (n \neq 0) \qquad \text{(Formula 1)}$$
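Formula 1 can be evaluated directly, as in the following sketch. It assumes the standard JPEG forward DCT, in which the second cosine term pairs y with v and C(0) equals 1/√2; the function name `dct_8x8` and the indexing `block[x][y]` are choices made for this illustration only. The direct quadruple loop is for clarity, not speed.

```python
import math


def dct_8x8(block):
    """Forward 8x8 DCT of Formula 1; block[x][y] holds P(x,y) in 0..255."""

    def c(n):
        return 1 / math.sqrt(2) if n == 0 else 1.0

    # Level shift: f(x,y) = P(x,y) - 128
    f = [[block[x][y] - 128 for y in range(8)] for x in range(8)]
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (f[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


# A uniform block has energy only in the DC component F(0,0).
flat = [[136] * 8 for _ in range(8)]
coeffs = dct_8x8(flat)
print(round(coeffs[0][0], 6))
```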
The quantization at quantizer 604 is expressed as the following formula:
G(u,v) = [F(u,v) / Q(u,v)]   (u, v = 0 to 7)
where F(u,v) represents the DCT coefficients before quantization, Q(u,v) represents quantization table T1, G(u,v) represents the coefficients after quantization, and [ ] denotes rounding.
A different quantization table Q(u,v) may be used for each of the color components L*, a* and b*, but generally, for most cases, one table is used for luminance and another one for chrominance. Usually, for high frequency components, the values in the table are made large so as to roughen the quantization. This can be justified because the human visual sensitivity becomes lower for the higher frequency components. That is, if the information of the higher frequency components is roughly quantized, image degradation is hardly perceived. Nevertheless, since the image appearance depends on the image size, the viewpoint distance, the resolution and other factors, the optimal tables differ from one another depending upon the application used.
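The quantization formula and its inverse can be sketched as below. The function names and the small 2×2 example values are hypothetical; rounding is implemented here as round-half-up, which is one reasonable reading of "rounding" and is an assumption of this sketch.

```python
import math


def quantize(F, Q):
    """G(u,v) = [F(u,v) / Q(u,v)], with [ ] taken as round-half-up."""
    return [[math.floor(f / q + 0.5) for f, q in zip(f_row, q_row)]
            for f_row, q_row in zip(F, Q)]


def dequantize(G, Q):
    """Approximate F(u,v) again by multiplying back with the table."""
    return [[g * q for g, q in zip(g_row, q_row)]
            for g_row, q_row in zip(G, Q)]


# Illustrative values only, not taken from any standard table.
F = [[64.0, -7.0], [3.0, 1.0]]
Q = [[16, 11], [12, 14]]
G = quantize(F, Q)
print(G)
print(dequantize(G, Q))
```

Small coefficients divided by large table values collapse to zero and cannot be recovered by the inverse step, which is where the information loss of this coding arises.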
Upon code assignment of the DCT coefficients after quantization at Huffman encoder 605, since the DC component has a strong correlation with that of the previous block, a Huffman code is assigned to its difference from that of the previous block, as shown in FIG. 3.
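The differential treatment of the DC component can be sketched as follows. The sketch assumes the predictor starts at zero for the first block and omits the Huffman code assignment itself; the function names are hypothetical.

```python
def dc_differences(dc_values):
    """Replace each DC value by its difference from the previous block."""
    previous = 0  # assumed initial predictor
    diffs = []
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def dc_restore(diffs):
    """Inverse operation performed on the decoding side."""
    previous = 0
    values = []
    for d in diffs:
        previous += d
        values.append(previous)
    return values


dc = [50, 52, 51, 55]
print(dc_differences(dc))
```

After the first block the values to be coded are small, so short codes can be assigned to them.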
For the AC components, the values are read by scanning diagonally in the zigzag order shown in FIG. 4, and coding is performed in the following steps shown in the flowchart in FIG. 5.
(1) Scan the AC components in the zigzag order as shown in FIG. 4 (Step S1)
(2) If the observed component is not equal to zero, perform grouping (Steps S2 and S3)
(3) If the observed component is equal to zero, count the run-length (Steps S2 and S4)
(4) If all the components scanned in the zigzag order are zero until the end, stop the operation.
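The zigzag scan and zero run counting in the steps above can be sketched as follows. This is a simplified illustration: the grouping and Huffman code assignment are omitted, the special handling of very long zero runs in actual JPEG is not modeled, and the names `zigzag_order`, `ac_run_lengths` and the "EOB" label are chosen for this sketch.

```python
def zigzag_order(n=8):
    """Return (row, column) positions of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + column of one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals run from bottom-left up
        order.extend((r, s - r) for r in rows)
    return order


def ac_run_lengths(block):
    """Return (zero_run, value) pairs for the AC coefficients of a block."""
    ac = [block[r][c] for r, c in zigzag_order(len(block))[1:]]
    trimmed = False
    while ac and ac[-1] == 0:  # step (4): only zeros remain until the end
        ac.pop()
        trimmed = True
    pairs, run = [], 0
    for value in ac:
        if value == 0:
            run += 1           # step (3): count the run of zeros
        else:
            pairs.append((run, value))  # step (2): non-zero value found
            run = 0
    if trimmed:
        pairs.append("EOB")    # end-of-block indication
    return pairs


block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0] = 10, 5, -3
print(ac_run_lengths(block))
```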
Since the higher frequency AC components after quantization are usually divided by greater values, most of the values will converge to zero. Therefore, the amount of codes can be markedly reduced by the procedures (3) and (4).
Since the optimal Huffman coding differs depending upon the image and quantization table T1 used, the Huffman table T2 is allowed to be selected in JPEG (Step S5).
The JPEG algorithm has been described above. In order to decode the coded data reliably under different circumstances, it is necessary to send various parameters in a common format in addition to the compressed data produced by the coding algorithm. This is why the exchange format of coded data is specified in the JPEG standard. Next, the data structure of the exchange format of JPEG data will be described with reference to FIG. 6.
It is assumed for convenience sake that JPEG data is roughly divided into three parts, namely leading marker code portion TM, image information portion II and the end marker code portion EM.
Leading marker code portion TM necessarily starts with a SOI area, followed by an area for marker group M including various parameters for decoding.
The marker group M includes marker segments APP1, COM, DHT, DQT, SOF0, DRI and SOS.
The marker segment APP1 is introduced for color facsimile and contains a facsimile identifier and resolution information.
The marker segment COM holds comments such as a product name etc., having no effect on compression.
The marker segment DHT includes the information for generating Huffman table T2 and the marker segment DQT includes quantization table T1.
The marker segment SOF0 is the frame header for JPEG baseline and includes the line count in the image and the image width.
The marker segment SOS should be located immediately before the compressed data, while the marker segments SOF0, DHT and DQT may be positioned at any place between the area SOI and marker segment SOS. As to marker segments DHT and DQT, all tables may be included in a single marker segment, or multiple marker segments may be used, each including one table only. The information as to assignment of the tables to different color components is included in the marker segment SOS.
The marker segment DRI stores the value of the restart interval, which designates the interval between the restart markers RM described later.
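The ordering constraint on the leading marker code portion TM can be sketched as a simple check. The two-byte marker codes below are those assigned in the JPEG standard; the function `header_is_well_ordered` is a hypothetical helper that tests only the placement rule stated above and nothing else about a real file.

```python
# Marker codes of the segments named in the text (JPEG standard values).
MARKER_CODES = {
    "SOI": 0xFFD8,
    "APP1": 0xFFE1,
    "COM": 0xFFFE,
    "DHT": 0xFFC4,
    "DQT": 0xFFDB,
    "SOF0": 0xFFC0,
    "DRI": 0xFFDD,
    "SOS": 0xFFDA,
}


def header_is_well_ordered(names):
    """Check that SOI comes first and a single SOS comes last."""
    known = all(name in MARKER_CODES for name in names)
    return (known and len(names) >= 2
            and names[0] == "SOI"
            and names[-1] == "SOS"
            and names.count("SOS") == 1)


print(header_is_well_ordered(
    ["SOI", "APP1", "DQT", "DHT", "SOF0", "DRI", "SOS"]))
```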
The end marker code portion EM(FIG. 6) is comprised of marker segments DNL and EOI.
The marker segment DNL holds an NL parameter which designates the line count.
The DNL is provided assuming a case where the encoding side has no memory for storing the entire image and cannot determine the line count at the time of coding. That is, this marker segment is used to designate the line count in the image after compression. In this case, the line count stored in the marker segment SOF0 is a dummy line count, and the NL parameter of marker segment DNL indicates the true line count. Here, the relation: the line count Y>NL or Y=0 should hold.
For color facsimile data, a decoder must be able to decode JPEG data whose line count is designated by the marker segment DNL. Use of the marker segment DNL is defined in the JPEG standard but is not found in other applications, being unique to facsimile data.
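The DRI and DNL marker segments each carry a single 16-bit parameter, so their byte layout can be sketched briefly. The marker codes 0xFFDD (DRI) and 0xFFDC (DNL) and the segment length of 4 follow the JPEG standard; the helper names are hypothetical, and the sketch handles isolated segments only, not a complete data stream.

```python
import struct

DRI = 0xFFDD  # define restart interval
DNL = 0xFFDC  # define number of lines


def make_u16_segment(marker, value):
    """Build marker + length (4) + one 16-bit big-endian parameter."""
    return struct.pack(">HHH", marker, 4, value)


def parse_u16_segment(data, expected_marker):
    """Read back the parameter of a DRI or DNL segment."""
    marker, length, value = struct.unpack(">HHH", data[:6])
    if marker != expected_marker or length != 4:
        raise ValueError("unexpected marker segment")
    return value


# The true line count becomes known only after scanning, so it is
# written after the compressed data as the NL parameter.
dnl = make_u16_segment(DNL, 2300)
print(dnl.hex(), parse_u16_segment(dnl, DNL))
```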
The image information portion II is comprised of compressed data portions CD and restart markers RM as shown in FIG. 6. In the JPEG algorithm, one unit of one color component is encoded then another unit of a next color component is encoded. The cyclic unit is called a MCU. For the case of the 4:1:1 subsampling, four blocks of the luminance component and one block for each of the two components of chrominance form one MCU.
The JPEG standard requires that restart markers RM, if used, be inserted at MCU boundaries at a fixed interval counted in MCUs. This interval is called the restart interval and is designated by the marker segment DRI (FIG. 7) in the leading marker code portion TM.
For color facsimile communication, while the JPEG algorithm described heretofore is adopted as the standard coding scheme, use of restart markers RM upon attachment of sender information after encoding in memory transmission has the following problems.
Since the DC component value corresponds to the total value of the pixels in the block, it varies depending on the location in the image but the differential values between DC components inherently converge to around zero. Therefore, it is possible to improve the encoding efficiency by allotting short codes to small differential values.
However, code assignment immediately after restart markers RM is made not to the differential values between DC components but to the DC components themselves. Therefore, the more the restart markers RM are, the more the encoding efficiency decreases.
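The effect of restart markers on the values to be coded can be sketched as follows. For simplicity the sketch treats each MCU as a single block with one DC value and uses the sum of magnitudes as a rough stand-in for code length; both are assumptions of this illustration, and the function name `dc_symbols` is hypothetical.

```python
def dc_symbols(dc_values, restart_interval=0):
    """Return the values that would be Huffman-coded for the DC component.

    restart_interval counts blocks here; 0 means no restart markers.
    """
    previous = 0
    symbols = []
    for index, dc in enumerate(dc_values):
        if restart_interval and index % restart_interval == 0:
            previous = 0  # predictor is initialized after a restart marker
        symbols.append(dc - previous)
        previous = dc
    return symbols


dc = [50, 52, 51, 55, 54, 53]
without_rm = dc_symbols(dc)
with_rm = dc_symbols(dc, restart_interval=2)
print(without_rm, sum(abs(s) for s in without_rm))
print(with_rm, sum(abs(s) for s in with_rm))
```

With a restart every two blocks, half of the coded values are full DC values rather than small differences.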
Text for the sender information can be represented with about 32 lines at 200 dpi. Since one block has 8×8 pixels, 1 MCU is made up of 16×16 pixels in a 4:1:1 subsampling configuration. The sender information attached to an A4 sized image having a width of 1728 pixels is constituted by 216(1728/16*32/16) MCUs. Therefore, to add the sender information after coding, the interval at which restart markers RM are inserted should be set at 216 MCUs.
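The MCU count above can be reproduced with a short calculation. The function name `mcus_for_band` is hypothetical, and the 16×16 pixel MCU size is the one stated for the 4:1:1 configuration.

```python
def mcus_for_band(width_px, band_lines, mcu_px=16):
    """Number of MCUs covering a band of the given width and height."""
    columns = -(-width_px // mcu_px)   # ceiling division
    rows = -(-band_lines // mcu_px)
    return columns * rows


# 1728 / 16 = 108 MCUs per row, 32 / 16 = 2 rows of MCUs.
print(mcus_for_band(1728, 32))
```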
However, the interval between restart markers RM cannot be varied according to the JPEG standard. Therefore, if the sender information is added at the leading side of the image, a number of restart markers RM must be inserted within the image data at intervals of the same distance, even though all of them but one, which should be inserted at the connection where the sender information and the image are merged, are not actually needed.
As a result, compressed data CD is forced to be divided by restart markers into rectangular blocks of 32 lines each, matching the height defined for the sender information 4. FIG. 8 shows the relationship between the JPEG data and the image. FIG. 8(a) is an example of an original image 1 with sender information 4 added at its leading end. FIG. 8(b) is a schematic diagram showing the relationship between the JPEG data and the blocks into which the image in (a) is divided by the lines shown. A reference numeral C1 designates a piece of compressed data corresponding to one block in original image 1 and C4 designates the compressed data of sender information 4.
Since restart marker RM initializes the DC component, the correlation between DC components can be exploited less as the restart markers increase in number, which lowers the encoding efficiency. Since an A4 sized image is about 2300 lines in height, about 71 (=2300/32) restart markers are inserted in one page of the image, resulting in a marked degradation of encoding efficiency.
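The number of restart markers on a page can be estimated with the following sketch. It assumes the 2300 lines cover the whole transmitted page, a width of 1728 pixels, 16×16 pixel MCUs and a restart interval of 216 MCUs; the function names are hypothetical. The marker codes cycle through 0xFFD0 to 0xFFD7 as defined in the JPEG standard.

```python
import math


def restart_marker_count(width_px, height_px, interval_mcus, mcu_px=16):
    """Restart markers needed between consecutive restart intervals."""
    mcu_columns = math.ceil(width_px / mcu_px)
    mcu_rows = math.ceil(height_px / mcu_px)
    total_mcus = mcu_columns * mcu_rows
    intervals = math.ceil(total_mcus / interval_mcus)
    return intervals - 1  # no marker is needed after the last interval


def restart_marker_code(n):
    """Code of the n-th restart marker (RST0 to RST7, repeating)."""
    return 0xFFD0 + (n % 8)


print(restart_marker_count(1728, 2300, 216))
print(hex(restart_marker_code(0)), hex(restart_marker_code(9)))
```

Only the first of these markers, at the junction between the sender information and the image, is actually needed for merging.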