The present invention relates to an efficient image data encoding system used in a digital transmission of TV telephone and TV conference data by an NTSC method, or in a digital transmission of high-definition TV (HDTV) data, and more specifically to an image encoding and transmitting system used in an image transmission system for dividing input frame image data into a plurality of blocks comprising a plurality of picture elements, encoding them in block units, and transmitting the encoded data in cell units.
An asynchronous transfer mode (ATM) has been widely accepted and is being standardized to realize a broadband ISDN, a new-generation network. In the ATM, data are put into a single format, the "cell" (a fixed-length packet), and transmitted by a label multiplexing technology. In the image communication field, the ATM is expected to improve the utilization of lines and the regenerated image quality through variable bit rate transmission. However, in an efficient encoding system based on an inter-frame encoding process, the quality of images deteriorates badly if cell-discard causes an error in the image data during transmission. Therefore, an efficient encoding system having a cell-discard compensation function is strongly required.
Recently, various image data processors have been digitized, and new systems for transmitting digital image data signals have been extensively developed in many application fields such as TV telephones. Generally, image data involve a large volume of information when transmitted. For example, when TV data are digitized, they require a transmission capacity of 100 Mb/s, about 1,500 times that required for voice data. In a TV broadcast, the screen is switched 30 times per second, that is, a frame is switched every 1/30 second. Normally, picture elements are arranged in a two-dimensional array in one screen corresponding to one frame. Therefore, the image data of one frame involve a large volume of information even though the density of one picture element can be represented by 8 bits.
However, the contents of an image in each frame may not vary greatly even when a frame is switched every 1/30 second. That is, a large part of an image may be static, for instance a sky scene or a scene in a background. Only a small part of it varies in each frame. Therefore, an inter-frame encoding system has been developed to compress and encode image data.
In the inter-frame encoding system, for example, the difference in density data is obtained between corresponding picture elements in two time-series frames. The difference is encoded by a Huffman code, for example, and transmitted to the transmission line. If the density does not vary for most picture elements, the difference between picture elements results in "0" and can be represented by about 2 bits. A large difference between picture elements requires 8 or more bits. However, since large differences arise for only a small part of the picture elements, the total transmission data can be greatly compressed. By contrast, a system of encoding the data in one frame "as is" is referred to as an "intra-frame encoding system."
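The idea of inter-frame differencing can be illustrated with a minimal sketch. It assumes frames are given as lists of rows of integer density values; the function names `inter_frame_difference` and `reconstruct` are hypothetical and are not taken from the system described here. The Huffman coding step is omitted.

```python
def inter_frame_difference(prev_frame, cur_frame):
    """Return the per-picture-element difference between two equal-sized frames."""
    return [
        [cur - prev for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(prev_frame, cur_frame)
    ]


def reconstruct(prev_frame, diff):
    """Rebuild the present frame from the preceding frame and the difference."""
    return [
        [prev + d for prev, d in zip(prev_row, diff_row)]
        for prev_row, diff_row in zip(prev_frame, diff)
    ]


# A mostly static scene: only one picture element changes between frames.
prev_frame = [[10, 10, 10], [10, 10, 10]]
cur_frame = [[10, 10, 10], [10, 42, 10]]

diff = inter_frame_difference(prev_frame, cur_frame)
zero_count = sum(row.count(0) for row in diff)
print(diff, zero_count)
```

In this toy case five of the six differences are zero, which is the property a variable length code can exploit.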
In the inter- and intra-frame encoding systems, two basic operations, "sampling" and "quantization," are required for representing image signals digitally. There are two methods of sampling an image. In the first, an image is represented by density values at a discrete array of points corresponding to picture elements. In the second, a density value function (image function) defined on the XY plane corresponding to a frame plane is expanded orthogonally, and the expansion coefficients are used as the sampled values.
FIG. 1 is a block diagram for explaining the general configuration of the image transmission system using an orthogonal transform. In FIG. 1, a preprocessor 1 removes noise from the inputted image data through a filter and divides the picture element data into block units. Rather than performing the orthogonal transform collectively on all K×N picture elements of the two-dimensional array in one frame, the data are divided into blocks so that a transformation coefficient is obtained for each block comprising 4×4 to 16×16 picture elements, which shortens the transformation time.
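The block division performed by the preprocessor can be sketched as follows. This is only an illustration: it assumes the frame dimensions are exact multiples of the block size, and the name `split_into_blocks` and the dictionary layout keyed by block position are hypothetical.

```python
def split_into_blocks(frame, block_size):
    """Split a 2-D frame (list of rows) into block_size x block_size blocks.

    Assumes the frame height and width are multiples of block_size.
    Returns a dict mapping (block_row, block_col) to the block's rows.
    """
    rows, cols = len(frame), len(frame[0])
    blocks = {}
    for top in range(0, rows, block_size):
        for left in range(0, cols, block_size):
            blocks[(top // block_size, left // block_size)] = [
                row[left:left + block_size]
                for row in frame[top:top + block_size]
            ]
    return blocks


# An 8x8 test frame whose element values encode their own position.
frame = [[r * 8 + c for c in range(8)] for r in range(8)]
blocks = split_into_blocks(frame, 4)
print(len(blocks), blocks[(0, 1)][0])
```

Each block can then be transformed and encoded independently of the others.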
In a source encoder 2, data are quantized to reduce the number of bits after the inter-frame encoding, the transformation-encoding, etc., which are encoding processes for removing redundancy in the space and time directions. The orthogonal transforms performed in this technology are, for example, the Hadamard transform, the cosine transform, the Karhunen-Loève transform, etc. Among them, the discrete cosine transform (DCT), the discrete form of the cosine transform, is the most commonly used.
In such orthogonal transformation-encoding systems, the correlation within a small area is exploited, and an orthogonal transform is performed using the picture elements of the small area as a numerical string. Each resultant transformation coefficient corresponds to a frequency component, ranging from a low frequency to a high frequency. Since an image signal normally has predominantly low frequency components rather than high frequency components, a larger number of bits is assigned to the low frequency components and a smaller number of bits to the high frequency components in the quantization. Thus, the volume of transmission data can be reduced greatly.
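The following sketch illustrates the transform-and-quantize idea on one block. It uses an orthonormal DCT-II written out directly. The quantization rule, in which the step size grows with the frequency indices, is only an assumed stand-in for a real bit assignment, and the names `dct_1d`, `dct_2d`, `quantize` and `base_step` are hypothetical.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        ))
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # result[u][v]


def quantize(coeffs, base_step):
    """Assumed rule: a coarser step (fewer levels) for higher frequencies."""
    return [
        [round(c / (base_step * (1 + u + v))) for v, c in enumerate(row)]
        for u, row in enumerate(coeffs)
    ]


# A 4x4 block with a smooth horizontal ramp and no vertical variation.
block = [[0, 10, 20, 30] for _ in range(4)]
coeffs = dct_2d(block)
quantized = quantize(coeffs, base_step=4)
print(quantized)
```

For this smooth block, all coefficients with a non-zero vertical frequency index quantize to zero, so only the first row of the coefficient array carries information.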
FIG. 1 shows a variable length encoder 3 for encoding data by a Huffman code, run-length code, etc. to eliminate statistical redundancy, and a cell assembler 4 for assigning side information for detecting cell-discard and a header to a cell. Thus, an assembled cell is outputted through a network.
On the receiving side, a cell disassembler 5 detects cell-discard in the cells inputted through the network and disassembles each cell into data. Then, the received bit string is decoded by a variable length decoder 6, and image data are decoded by a source decoder 7 by performing the source encoding process in reverse. Then, a post-processor 8 performs an unblocking process, a noise removal process using a filter, etc., and the data are outputted.
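One way side information can reveal cell-discard is a sequence number carried in every cell. The sketch below is an assumption for illustration only, not the format used by the cell assembler 4: it reserves one byte of a 48-byte ATM payload for a sequence number, omits the 5-byte cell header, and assumes fewer than 256 cells and a known cell count. The names `assemble_cells` and `find_discarded` are hypothetical.

```python
PAYLOAD_BYTES = 47  # assumed: 48-byte payload minus 1 byte of sequence number


def assemble_cells(data):
    """Cut a byte string into fixed-length cells, each prefixed by a sequence number."""
    cells = []
    for seq, start in enumerate(range(0, len(data), PAYLOAD_BYTES)):
        chunk = data[start:start + PAYLOAD_BYTES].ljust(PAYLOAD_BYTES, b"\x00")
        cells.append(bytes([seq % 256]) + chunk)
    return cells


def find_discarded(cells, expected_count):
    """Return the sequence numbers missing from the received cells."""
    received = {cell[0] for cell in cells}
    return [seq for seq in range(expected_count) if seq not in received]


data = bytes(range(200))
sent = assemble_cells(data)

# Simulate the network discarding the third cell (sequence number 2).
received = [cell for cell in sent if cell[0] != 2]
missing = find_discarded(received, expected_count=len(sent))
print(len(sent), missing)
```

The receiver can then mark the blocks carried by the missing cells as not received.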
FIG. 2 is a block diagram for explaining the configuration of the inter-frame DCT encoding system as the first example of a typical conventional encoding system. In FIG. 2, a subtracter 11 on the sending side calculates the difference between the inputted image data, divided into block data, and the data obtained by multiplying the block data stored in a frame memory 9 at the same block position in the preceding frame by the leak coefficient α provided by a leak coefficient unit 10. The difference and the inputted data are applied to an intra-frame/inter-frame determiner 12, which determines which is more effective: encoding the data in the present frame as is (intra-frame encoding), or encoding the difference between the data in the present frame and the data at the corresponding position in the preceding frame (inter-frame encoding). The data determined to be more effective are applied to a discrete cosine transform (DCT) unit 13, and quantized by a quantizer 14.
The output of the quantizer 14 and the intra-frame/inter-frame determination result indicated by the broken line shown in FIG. 2 are outputted to the variable length encoder 3 shown in FIG. 1. Simultaneously, the output of the quantizer 14 is inversely transformed again to the picture element domain by an inverse discrete cosine transform (IDCT) unit 15, added by an adder 16 to the product obtained by multiplying the data stored in the frame memory 9 by α from the leak coefficient unit 10, and then stored again in the frame memory 9. However, when an intra-frame data encoding is selected, the adder 16 does not add the data in the prior frame according to the input of a determination signal as shown by the broken line, but stores the output of the IDCT unit 15 as is in the frame memory 9.
In FIG. 2, on the receiving side, the input data are inversely transformed to the picture element domain by an IDCT unit 17. The result is added by an adder 20 to the data obtained by multiplying the data in the preceding frame stored in a frame memory 18 by α from a leak coefficient unit 19, and then outputted to the post-processor 8 shown in FIG. 1 and stored again in the frame memory 18. When an intra-frame data encoding is selected, the data in the prior frame are not added. For a block which does not receive data due to cell-discard, the prior frame data are added to the input 0, and the result is stored in the frame memory 18. That is, at the cell-discard, the prior frame data are stored as is in the frame memory 18.
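The sending and receiving loops of FIG. 2 can be imitated with a small simulation. The sketch below is a simplification under stated assumptions: it works directly in the picture element domain, omits the DCT, the quantizer and the intra-frame mode, and keeps the prior frame data "as is" (not multiplied by α) when a cell is discarded. The value 0.9 for the leak coefficient and the names `encode_block`, `decode_block` and `simulate` are hypothetical.

```python
ALPHA = 0.9  # assumed leak coefficient, between 0 and 1


def encode_block(block, sender_memory, alpha=ALPHA):
    """Sending side: return the difference and update the frame memory in place."""
    prediction = [alpha * m for m in sender_memory]
    difference = [x - p for x, p in zip(block, prediction)]
    # Local decoding; with no quantization loss this equals the input block.
    sender_memory[:] = [p + d for p, d in zip(prediction, difference)]
    return difference


def decode_block(difference, receiver_memory, alpha=ALPHA):
    """Receiving side: difference is None when the cell was discarded."""
    if difference is None:
        decoded = list(receiver_memory)  # prior frame data kept as is
    else:
        decoded = [alpha * m + d for m, d in zip(receiver_memory, difference)]
    receiver_memory[:] = decoded
    return decoded


def simulate(frames, discarded):
    """Return the largest sender/receiver memory mismatch after each frame."""
    sender_memory = [0.0] * len(frames[0])
    receiver_memory = [0.0] * len(frames[0])
    errors = []
    for index, block in enumerate(frames):
        difference = encode_block(block, sender_memory)
        decode_block(None if index in discarded else difference, receiver_memory)
        errors.append(max(abs(s - r) for s, r in zip(sender_memory, receiver_memory)))
    return errors


# The block changes at frame 1, and the cell carrying frame 1 is discarded.
frames = [[100.0, 100.0]] + [[120.0, 80.0]] * 5
errors = simulate(frames, discarded={1})
print(errors)
```

In this run the mismatch appears at the discarded frame and is then multiplied by the leak coefficient at every subsequent frame.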
In FIG. 2, both on the sending and the receiving sides, the leak coefficient unit exists between the output of the frame memory and the adder. The product obtained by multiplying the output of the frame memory by the leak coefficient α is used as a value of the data in the preceding frame. If the leak coefficient α is set to a value between 0 and 1 and an error E arises between the contents stored in the frame memories on both the sending and the receiving sides, the error results in E × α^n after n frames. The error gradually gets smaller with time, and finally converges to zero. Therefore, the quality of a regenerated image can be restored when the error value has decreased down to near zero even though an error E has arisen on the sending and the receiving sides due to cell-discard and has incurred the deterioration in the quality of a regenerated image.
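The decay of the error can be made concrete with a short calculation. The sketch below assumes that "near zero" is expressed as a threshold chosen by the reader; the threshold and the names `residual_error` and `frames_until_below` are hypothetical.

```python
def residual_error(initial_error, alpha, frames):
    """Error remaining after the given number of frames: E multiplied by alpha to the n."""
    return initial_error * alpha ** frames


def frames_until_below(initial_error, alpha, threshold):
    """Smallest number of frames after which the error is at or below threshold."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if threshold <= 0.0:
        raise ValueError("threshold must be positive")
    frames = 0
    error = abs(initial_error)
    while error > threshold:
        error *= alpha
        frames += 1
    return frames


# The same initial error takes far longer to fade with a leak coefficient near 1.
print(frames_until_below(20.0, 0.5, 1.0))
print(frames_until_below(20.0, 0.9, 1.0))
```

A leak coefficient close to 1 preserves prediction efficiency but lengthens the recovery time, while a larger initial error also needs more frames to fall below the same threshold.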
FIG. 3 is a block diagram for explaining the configuration for realizing the DCT inter-frame encoding system as the second example of the conventional encoding system. The configuration of this system is similar to the system shown in FIG. 2 except for the following points: the discrete cosine transform (DCT) unit 13 is located in the source encoder 2, the input image data are transformed by the DCT unit 13 to frequency domain coefficients, and the inter-frame or intra-frame data encoding system is selected for the transformed coefficient data.
On the receiving side, data in the frequency domain are decoded using the frame memory 18 depending on the selection of the inter-frame or intra-frame encoding system. The result is transformed by the IDCT unit 17 to the picture element domain, and outputted to the post-processor 8.
FIG. 5 is a configurational block diagram for explaining a second example of the motion compensation inter-frame DCT encoding system of the prior art technology. It is similar in its configuration to FIG. 2 showing the first example of the prior art technology. However, since the intra-frame encoding system is not used in the operation shown in FIG. 5, the intra-frame/inter-frame determiner 12 is not provided on the sending side. Instead, a motion vector detector 100 for detecting a motion vector of each block according to inputted data and the data in the preceding frame stored in the frame memory 9 and a variable delay unit 101 for delaying the contents stored in the frame memory 9 and outputting them to the leak coefficient unit 10 are added to the sending side.
Also on the receiving side, a variable delay unit 102 for delaying by the necessary time period the data in the preceding frame stored in the frame memory 18 according to a motion vector received from the sending side and outputting them to the leak coefficient unit 19 is added.
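The text does not state how the motion vector detector 100 finds a motion vector. A common approach, used here purely as an assumption, is exhaustive block matching with the sum of absolute differences (SAD) as the cost. The names `sad` and `detect_motion_vector` and the `search_range` parameter are hypothetical.

```python
def sad(prev_frame, block, top, left):
    """Sum of absolute differences between block and prev_frame at (top, left)."""
    return sum(
        abs(block[r][c] - prev_frame[top + r][left + c])
        for r in range(len(block))
        for c in range(len(block[0]))
    )


def detect_motion_vector(prev_frame, block, top, left, search_range):
    """Search the preceding frame around (top, left) for the best match.

    Returns ((dy, dx), cost), where (dy, dx) is the displacement into the
    preceding frame that minimizes the SAD.
    """
    height, width = len(block), len(block[0])
    best, best_cost = (0, 0), sad(prev_frame, block, top, left)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0:
                continue  # candidate falls outside the frame
            if y + height > len(prev_frame) or x + width > len(prev_frame[0]):
                continue
            cost = sad(prev_frame, block, y, x)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, best_cost


# A bright 2x2 patch sits at rows 1-2, columns 1-2 of the preceding frame
# and has moved to rows 2-3, columns 3-4 in the present frame.
prev_frame = [[0] * 6 for _ in range(6)]
for r in (1, 2):
    for c in (1, 2):
        prev_frame[r][c] = 9
block = [[9, 9], [9, 9]]

vector, cost = detect_motion_vector(prev_frame, block, top=2, left=3, search_range=2)
print(vector, cost)
```

Reading the frame memory at the displaced position plays the role that the variable delay units 101 and 102 play in FIG. 5.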
In the conventional encoding system shown in FIGS. 2 and 3, the image quality can be automatically restored with time using a leak coefficient α even though the quality of an image has deteriorated due to cell-discard. For a block in which the difference between the previous frame and the present frame is comparatively small, the error in the image data between the sender and the receiver is small, and can rapidly converge to zero using the leak coefficient. Therefore, the deterioration of the image quality can be limited to a visually permissible level. However, for a block in which the difference is comparatively large, the error between these image data becomes large, and the deterioration in the image quality becomes highly conspicuous when the cell-discard rate is high, because the error takes considerable time to converge using a leak coefficient.