(1) Field of the Invention
The present invention relates to a picture data coding apparatus for coding and decoding motion picture data which are used in video conference, video phones, and the like.
(2) Description of the Related Art
Various arts for processing digitized motion picture data have been realized in recent years. Since picture data are generally large in volume, it is necessary to compress or reduce them in volume in order to make them suitable for recording or transmission.
ITU (International Telecommunication Union) Recommendation H.261 "Video Codec for Audiovisual Services" is known as a picture data compression coding method for transmitting motion picture data at a low bit rate of between 64 Kbps and 2 Mbps.
According to the method, a frame containing picture data to be coded (hereinafter a target frame) is divided into several blocks. One block in the target frame is then compared with block-sized areas in another frame which precedes the target frame, and as a result, a motion vector between them is detected. The motion vector indicates the one-block area in the preceding frame that has the closest correlation with the block to be coded in the target frame (hereinafter a target block), so that not only the prediction error between the target block and the corresponding one-block area in the preceding frame but also the motion vector is coded and transmitted.
FIG. 1 shows the construction of a conventional picture data coding apparatus which is based on the picture data compression coding method H.261. The apparatus has a pre-process unit 15, a coding unit 6, a transmission control unit 7, and an unillustrated decoding unit.
The pre-process unit 15 is provided with a YC separation and A/D conversion unit 12, an NTSC/CIF (Common Intermediate Format) conversion unit 13, and a pre-process filter 14. The YC separation and A/D conversion unit 12 divides an NTSC signal into a luminosity signal and a color difference signal and then applies A/D conversion to them. The NTSC/CIF conversion unit 13 converts an NTSC signal into a CIF signal. The use of the intermediate format allows communication among any video codecs, regardless of the differences in their TV systems.
There are two types of intermediate formats defined in H.261. One of them is CIF, where a frame is coded into a luminosity component (Y) and two color difference components (CB and CR). The codes indicating these components are defined in CCIR Recommendation 601. The luminosity component has 288 horizontal lines per frame, each line consisting of 352 pixels. Each of the two color difference components has 144 horizontal lines per frame, each line consisting of 176 pixels.
The other type is QCIF, where the numbers of pixels and lines are reduced to half of those of CIF. The selection between CIF and QCIF is made by an external unit such as one based on the TTC standard H.221. In the case shown in FIG. 1, CIF is selected through the NTSC/CIF conversion unit 13.
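For illustration only, the CIF and QCIF dimensions described above can be summarized in a short sketch. The names and structure below are assumptions for illustration and do not form part of H.261 or of the apparatus described:

```python
# CIF dimensions per component: (pixels per line, lines per frame).
# Y is the luminosity component; CB and CR are the color difference components.
CIF = {"Y": (352, 288), "CB": (176, 144), "CR": (176, 144)}

# QCIF halves both the pixel count and the line count of CIF.
QCIF = {comp: (w // 2, h // 2) for comp, (w, h) in CIF.items()}

def samples_per_frame(fmt):
    """Total number of samples (luminosity plus color difference) in one frame."""
    return sum(w * h for w, h in fmt.values())
```

For example, `samples_per_frame(CIF)` counts 352 × 288 luminosity samples plus two 176 × 144 color difference planes.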
The pre-process filter 14 is a digital filter for removing noise from input signals.
The coding unit 6 is provided with a coding sub unit 4 and a coding control unit 5. The coding control unit 5 checks the presence of an available storage area in a transmission buffer 27, which will be described later. When there is a certain amount of available storage area, the coding control unit 5 makes the coding sub unit 4 perform a coding operation and output coded picture data to the transmission buffer 27.
The coding sub unit 4 is provided with a motion compensation frame prediction unit 16, a DCT (Discrete Cosine Transform) unit 17, a quantization unit 18, a first variable length coding unit 19, a second variable length coding unit 20, and a multiplexer 21.
The motion compensation frame prediction unit 16 divides each frame into blocks consisting of 16×16 pixels and detects, from the latest-transmitted frame, the block that has the smallest difference from a target block, the detection being carried out in a range between -15 pixels and +15 pixels of the target block. Then, the motion compensation frame prediction unit 16 finds the motion vector which indicates the positional relation between these blocks. The frame against which the target frame is compared in detecting this difference is called a reference frame.
The motion compensation frame prediction unit 16 further detects the block that has the smallest difference from the target block, from the latest-transmitted frame, for each of the luminosity component and the color difference components, and finds the difference between the target block and the detected block. As a result, a prediction error between these frames is outputted.
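The block matching performed by the motion compensation frame prediction unit can be sketched, for illustration only, as a full search over the ±15-pixel range using the sum of absolute differences as the measure of difference. The function name, the SAD criterion, and the variable names are assumptions for illustration, not part of H.261:

```python
import numpy as np

def find_motion_vector(ref, target, bx, by, block=16, search=15):
    """Full-search block matching: for the 16x16 target block at (bx, by),
    search the reference frame within +/-15 pixels and return the motion
    vector (dx, dy) of the candidate with the smallest sum of absolute
    differences (SAD), together with that SAD."""
    h, w = ref.shape
    tb = target[by:by + block, bx:bx + block].astype(int)
    best_sad, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + block > h or x + block > w:
                continue  # candidate block must lie inside the reference frame
            cand = ref[y:y + block, x:x + block].astype(int)
            sad = int(np.abs(tb - cand).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dx, dy)
    return best_mv, best_sad
```

The prediction error would then be the pixel-wise difference between the target block and the block indicated by the returned vector.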
The DCT 17 orthogonally transforms the luminosity component and color difference components of the difference between the block detected by the motion compensation frame prediction unit 16 and the target block, thereby converting space coordinate data into frequency coordinate data. The quantization unit 18 linear-quantizes the conversion coefficients obtained by the transform. The first variable length coding unit 19 Huffman-codes the quantized conversion coefficients, and the second variable length coding unit 20 Huffman-codes the motion vector used for motion compensation. The multiplexer 21 multiplexes the prediction error coded in the first variable length coding unit 19 with the motion vector coded in the second variable length coding unit 20, and consequently forms a frame to be transmitted to the transmission buffer 27.
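The transform-and-quantize step described above may be sketched, for illustration only, as a 2-D DCT of an 8×8 block followed by linear quantization with a fixed step size. The block size of 8×8 follows common DCT practice; the function names and the single quantizer step are assumptions for illustration, not the text of H.261:

```python
import math

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an NxN block (space -> frequency)."""
    n = len(block)
    def c(k):
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = sum(block[y][x]
                    * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                    * math.cos((2 * x + 1) * v * math.pi / (2 * n))
                    for y in range(n) for x in range(n))
            out[u][v] = c(u) * c(v) * s
    return out

def linear_quantize(coeffs, step):
    """Linear quantization: divide each coefficient by a fixed step size."""
    return [[round(c / step) for c in row] for row in coeffs]
```

A flat 8×8 block produces a single non-zero (DC) coefficient, which is why such transforms concentrate the energy of a prediction error into few values.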
The unillustrated decoding unit is provided with a frame memory including a storage area which accommodates picture data corresponding to at least two frames. The decoding unit inverse-quantizes and inverse-DCT-converts the coded prediction error. The frame memory holds the latest coded frame, which has already been decoded; the decoded prediction error that has been inverse-quantized and inverse-DCT-converted is added to this frame in order to decode the frame which is being coded.
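The decoder-side reconstruction described above amounts to undoing the quantization and then adding the decoded prediction error, element by element, to the reference frame held in the frame memory. The following sketch is illustrative only; the function names are assumptions:

```python
def inverse_quantize(coeffs, step):
    """Undo linear quantization by multiplying each coefficient by the step size."""
    return [[c * step for c in row] for row in coeffs]

def reconstruct(reference, prediction_error):
    """Add the decoded prediction error to the already-decoded reference frame."""
    return [[r + e for r, e in zip(rrow, erow)]
            for rrow, erow in zip(reference, prediction_error)]
```

The reconstructed frame then becomes the reference for decoding the next inter-coded frame.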
The transmission control unit 7, which is provided with a transmission buffer 27, a re-transmission control unit 28, and a re-transmission buffer 29, controls data transmission depending on the transmission conditions of a transmission path. The first-in first-out transmission buffer 27 receives frames from the multiplexer 21 and transmits them sequentially. The coded frames which were unable to be displayed on the display of the receiver are abandoned sequentially. The re-transmission control unit 28 stores a coded frame under transmission, that is, a frame which is being transmitted, in the re-transmission buffer 29 so as to re-transmit it upon a re-transmission request. The re-transmission control unit 28 detects an occurrence of an error such as a burst error caused by fading in a wireless circuit, and re-transmits the data according to ARQ (Automatic Repeat Request). The display unit of a receiver requests the transmitter to re-transmit data upon detecting an occurrence of a burst error. The picture signals sent from the transmission control unit 7 are transmitted through the transmission path 22.
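The transmission control described above may be sketched, for illustration only, as a FIFO transmission buffer together with a re-transmission buffer that keeps the frame currently under transmission so that it can be re-sent on an ARQ request. The class and method names are assumptions and do not form part of the conventional apparatus:

```python
from collections import deque

class TransmissionControl:
    """Illustrative sketch of the transmission control unit 7."""

    def __init__(self):
        self.tx_buffer = deque()   # first-in first-out transmission buffer 27
        self.retx_buffer = None    # re-transmission buffer 29

    def enqueue(self, frame):
        """Receive a coded frame from the multiplexer."""
        self.tx_buffer.append(frame)

    def transmit_next(self):
        """Take the oldest frame and remember it for possible re-transmission."""
        frame = self.tx_buffer.popleft()
        self.retx_buffer = frame
        return frame

    def retransmit(self):
        """Re-send the stored frame when the receiver reports a burst error."""
        return self.retx_buffer

    def abandon_pending(self):
        """Discard queued frames that can no longer be displayed in time."""
        self.tx_buffer.clear()
```

The `abandon_pending` step corresponds to discarding frames that depend on a reference frame whose display has been judged impossible.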
The operation of the conventional picture data coding apparatus which has the construction shown in FIG. 1 will be described with reference to the time charts (a)-(f) in FIG. 2.
The time chart (a) shows the timing for transmitting each frame from the pre-process unit 15 to the coding unit 6 at a rate of thirty frames per second. In the time chart (b), arrows indicate the timing for sampling frames. The time chart (c) shows the time required for coding each frame; the required time varies depending on the amount of picture data of prediction errors. The time chart (d) shows the time required for transmitting each coded frame. The time chart (e) shows the time required for decoding each coded frame. The time chart (f) shows the timing for displaying each decoded frame on the display unit of the receiver.
The frame which has been received first has a large amount of picture data because a prediction error cannot be obtained when no frame has been transmitted so far. The same holds true when a frame is received after a long interval. In picture data coding transmission at a low bit rate of 64 Kbps, one frame cannot be transmitted within one-frame time, that is, the time period between the input of a frame and the input of the subsequent frame. If the amount of data in one frame is reduced by, for example, enlarging quantization values, the image quality deteriorates, which makes the displayed pictures very unclear.
In order to avoid this deterioration of the image quality, not all received frames are sampled; instead, some frames are extracted for sampling at a certain time interval and coded for transmission. This is called frame skipping. Although this makes the movement of motion pictures displayed on the receiver look a little discontinuous, the amount of picture data that can be transmitted per frame is increased accordingly.
The display timing shown in the time chart (f) follows the ending point of a smoothing time period "T" which immediately follows a sampling operation performed by the coding unit 6. The smoothing time period "T" is provided for compensating the dispersion in the time required for coding or decoding different picture data, or for compensating delay jitter caused by the dispersion in the time required for transmitting coded data. Delay jitter means the dispersion in the time period between the sampling time and the displaying time of each frame on the display unit of the receiver. The jitter results from the fact that the times required for coding, transmitting, and decoding each frame differ depending on the amount of data. If the frames were displayed without any adjustment, something moving at a constant speed in the original motion picture might move sometimes faster and sometimes slower than expected. A frame which has been decoded before the end of the smoothing time period "T" is held until the smoothing time period "T" expires and is then displayed. On the other hand, frames whose decoding operations cannot be completed by the end of the smoothing time period "T" are abandoned. In FIG. 2, the smoothing time period "T" is set to eight-frame time.
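The display rule governed by the smoothing time period can be sketched, for illustration only, as follows. Times are in frame-time units, and the function and variable names are assumptions for illustration:

```python
SMOOTHING_T = 8  # eight-frame time, as assumed in FIG. 2

def display_decision(sample_time, decode_done_time, t=SMOOTHING_T):
    """Return the display time of a frame, or None if it must be abandoned.

    A frame decoded before its sampling time plus T is held until T expires
    and then displayed; a frame decoded later than that is abandoned."""
    deadline = sample_time + t
    if decode_done_time <= deadline:
        return deadline   # held until T expires, then displayed
    return None           # decoding finished too late: frame is abandoned
```

In the scenario of FIG. 2, a frame sampled at time 0 and decoded by time 6 is displayed at time 8, while a frame whose decoding runs past its deadline is dropped.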
Each process for coding, transmitting, and decoding frames is carried out block by block. Among the frames inputted as shown in the time chart (a), a target frame, which has been sampled at the timing shown in the time chart (b), is divided into blocks each having 16×16 pixels. Then, the coding control unit 5 checks the presence of an available storage area in the transmission buffer 27. When there is such an area, a coding operation shown in the time chart (c) is started per block. Coded blocks are then sequentially stored in the transmission buffer 27 and transmitted on a first-in first-out basis as shown in the time chart (d). The coded blocks thus transmitted are decoded by the decoding unit of the receiver. If the decoding operations are finished before the smoothing time period "T" expires, the decoded frames are held until their display timing, pointed by the arrows, and then displayed on the display unit of the receiver. Each circle in the time chart (f) represents a frame which has been displayed at this timing, and each cross indicates a frame which has not been displayed at this timing.
As shown in the time chart (f), the frames 1, 4, and 16 are displayed eight-frame time after a respective sampling operation, and the frame 10 is not displayed because its decoding operation has not been completed within the smoothing time period "T".
As mentioned above, H.261, which allows the transmission of picture data at a low bit rate of 64 Kbps, is suitable for audiovisual services such as video phones using digital telephone lines and video conference. However, transmission errors caused by, for example, fading are inevitable when motion pictures are coded via a portable wireless circuit such as a digital cordless phone. Such burst errors caused by fading in the mobile wireless circuit reach three- to six-frame time when the circuit moves at a speed lower than walking speed.
In FIG. 2, it is assumed that the frame 10 has the same amount of picture data corresponding to a prediction error as the frame 1. In other words, the frame 10 is assumed to need two-frame time for a coding operation, three-frame time for a transmitting operation unless there is a transmission error, and one-frame time for a decoding operation. It is further assumed that a burst error caused by the above-mentioned fading lasts over four-frame time while the frame 10 is under transmission.
As shown in the time chart (d), the frame 10 contains picture data corresponding to three-frame time, but one-third of them has not been transmitted because of the occurrence of a transmission error. If the transmission error disappears during the remaining one-frame time for transmission and another one-frame time for a decoding operation, the picture data can still be displayed. In other words, the frame 10 has a possibility of being displayed before the frame 16 is inputted, so that the transmission control unit 7 continues to transmit the frame 10 until the frame 16 is inputted.
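The time budget of the frame 10 in this scenario can be checked numerically, in frame-time units. The figures come from the description of FIG. 2 above; the variable names are illustrative:

```python
# Frame 10's assumed timing, in frame-time units (from the FIG. 2 scenario).
transmit_time = 3              # three-frame time to transmit in total
decode_time = 1                # one-frame time to decode
transmitted_before_error = 2   # two-thirds were sent before the burst error

# Remaining work if the transmission error disappears:
remaining_transmit = transmit_time - transmitted_before_error
time_to_display = remaining_transmit + decode_time
```

If the error clears, only one-frame time of transmission plus one-frame time of decoding remain, which is why the frame 10 still has a chance of being displayed before the frame 16 is inputted.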
The coded frames stored in the transmission buffer 27 after the transmission of the frame 10 is started are abandoned when the display of the frame 10 has been judged to be impossible, because these frames have been coded by using the frame 10 or an earlier frame as a reference frame. The frame 10 is also abandoned because it has not been completely transmitted. Consequently, when it is judged that the frame 10 cannot be displayed because of the transmission error, coding of the frame 16 can be started. Since the conventional picture data coding apparatus is not provided with a memory for storing a latest-transmitted frame, the frame 16 is intra-coded. The intra-coding operation means to code picture data in a frame without detecting the prediction error between the frame and an immediately preceding frame. The frame 16 is displayed on the display unit of the receiver at the end of eight-frame time.
Thus, according to the conventional picture coding apparatus, due to the transmission error corresponding to four-frame time, as many as 12 frames are skipped before the frame 16 is displayed on the display unit of the receiver after the frame 4 has been displayed. Such excessive frame skipping makes motion pictures on the display unit discontinuous, with the result that a person in the picture may suddenly appear or disappear.
Another problem is that if an inter-frame coding operation is performed after the occurrence of a burst error, the amount of data generated for a frame under transmission becomes large. As a result, the time required for transmitting frames increases, causing further frame skipping after the frame has been transmitted.
If the smoothing time period "T" provided for absorbing delay jitter is expanded, more frames can be in time for display. However, the time difference between actually inputted motion pictures and their display is increased, and consequently, smooth conversation in a bidirectional video communication system such as a video conference or a video phone is spoiled.