1. Field of the Invention
The present invention relates to a coded signal transmission method and apparatus which can be used for encoding a digital signal at a variable bit rate at a transmission side and transmitting the signal to a reception side.
2. Description of the Prior Art
For television broadcasting of the next generation, there is a plan to digitize animation signals in order to transmit them at a high quality. Here, if an animation signal is digitized directly, a great amount of data is generated. In order to transmit the digitized data through a limited transmission line with high efficiency, the data should be encoded (data compression).
In general, however, an animation is a dynamic image which changes over time. Moreover, within a single image, the picture movement may differ greatly between the center and the periphery of the image. Thus, the amount of data generated by an encoder varies depending on the properties of the image. In order to transmit this data at a constant transmission bit rate, a transmission buffer is provided at the last stage of an encoder system. That is, the encoded data, whose generated amount fluctuates, is temporarily stored in the transmission buffer, from which the data is read out at a predetermined transmission bit rate and output to a transmission line.
FIG. 1 is a block diagram of a conventional encoder (hereinafter, referred to as an encoder system) having a constant output bit rate. In this encoder system of FIG. 1, a transmission buffer (hereinafter, referred to as an encoder buffer) 13 is provided between a transmission line and a video encoder 12 which is supplied with a video input through a terminal 11, for controlling the smoothing of the fluctuation of the bit amounts generated in a short period of time from the video encoder 12 so that the encoder buffer 13 can output a bit stream at a constant bit rate.
A rate controller 15 is supplied with an encoded picture generation bit amount S21 from the video encoder 12, a bit rate R from a terminal 16, and a decoder buffer size B from a terminal 10, and calculates a bit amount S22 to be assigned for the following picture to be encoded so as not to overflow or underflow the decoder buffer of the aforementioned size B, for example, according to a VBV (Video Buffering Verifier) model which will be detailed later. Data on the bit amount S22 calculated is transmitted to the video encoder 12 for specification.
The encoder buffer 13 is supplied with a video bit stream from the video encoder 12 and has a code buffer whose size is at least equal to the decoder buffer size B. Normally, the code buffer is included in the transmission buffer.
The bit stream outputted from the encoder buffer 13 is supplied to a multiplexer 14. Although not depicted, the multiplexer 14 is also supplied with an encoded bit stream of an audio signal. In the multiplexer 14, a plurality of input bit streams are system-encoded and multiplexed, and the multiplexed stream is outputted from a terminal 17.
Note that the start of output of the bit stream from the encoder buffer 13 is specified by a start controller 19. This is illustrated in FIG. 1 by a configuration in which the start controller 19 controls a switch 20 provided at the output side of the encoder buffer 13. The start time is calculated, as will be detailed later, according to the aforementioned bit rate R and data, supplied from a terminal 18, on the bit occupation amount b0 of the decoder buffer at the decoding start moment.
FIG. 2 is a block diagram of a conventional decoder (hereinafter, referred to as a decoder system). A demultiplexer 26 is supplied with the multiplexed stream from a terminal 25. The video stream which has been isolated by the demultiplexer 26 is stored in a reception buffer (hereinafter, referred to as a decoder buffer) 27. The decoder buffer 27 serves to absorb the fluctuation of the bit amount which is read out by the video decoder 28 during a short period of time. Because the decoder system is a passive system with respect to the bit stream supplied, in order to enable stable video reproduction by the video decoder 28, the encoder system should carry out encoding while controlling the generated bit amount so as not to cause overflow or underflow of the decoder buffer 27.
As a representative animation encoding method, the MPEG standard is known. MPEG is an abbreviation of the Moving Picture Experts Group, a group for examination of animation encoding for storage, which belongs to ISO/IEC JTC1/SC29 (International Organization for Standardization/International Electrotechnical Commission, Joint Technical Committee 1/Sub Committee 29). The MPEG1 Standard is ISO/IEC 11172 and the MPEG2 Standard is ISO/IEC 13818. Among these international standards, ISO/IEC 11172-1 and ISO/IEC 13818-1 are items for multimedia multiplexing; ISO/IEC 11172-2 and ISO/IEC 13818-2 are items for images; and ISO/IEC 11172-3 and ISO/IEC 13818-3 are items for audio.
This MPEG standard assumes an ideal I/O model of the decoder buffer 27 of the decoder system, and specifies an encoder system which carries out encoding while controlling not to overflow or underflow the decoder buffer 27, assuming the decoder buffer model (an ideal model of the decoder buffer). The I/O model of the decoder buffer 27 of the decoder system is described as VBV (Video Buffering Verifier) in the ISO/IEC 11172-2 Annex C or ISO/IEC 13818-2 Annex C.
The VBV buffer size of the decoder system is specified by the identifier "vbv_buffer_size" in the MPEG bit stream. A typical size is, for example, 1.75M bits at MP@ML (Main Profile at Main Level).
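As a rough illustration of the figure quoted above, the following sketch converts the coded field into a buffer size in bits. It assumes the MPEG-2 video convention that vbv_buffer_size counts units of 16 x 1024 bits; the helper name vbv_bits and the value 112 are chosen for illustration and do not appear in the text.

```python
# Assumption (not stated in the text above): MPEG-2 video expresses
# vbv_buffer_size in units of 16 * 1024 bits.
VBV_UNIT_BITS = 16 * 1024


def vbv_bits(vbv_buffer_size: int) -> int:
    """Return the VBV buffer size B in bits for a coded vbv_buffer_size value."""
    return VBV_UNIT_BITS * vbv_buffer_size


# 112 units is the illustrative value that yields the 1.75M-bit figure,
# taking 1M bits as 2**20 bits.
B_MP_ML = vbv_bits(112)
print(B_MP_ML, B_MP_ML / 2**20)
```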
The VBV of the decoder system is assumed to operate under the following ideal condition.
(1) A bit stream corresponding to a picture is instantaneously outputted from the decoder buffer and each of the pictures is decoded instantaneously.
When a bit stream is transmitted in real time from the encoder system to the decoder system under this condition, the transmission buffer (encoder buffer) of the encoder system should operate under the following ideal condition.
(2) Each of the pictures is encoded instantaneously and a bit stream for the corresponding picture is inputted instantaneously to the encoder buffer.
An explanation will now be given of a VBV model in which an encoder system and a decoder system operate in real time via a transmission line. Here, the encoder system outputs a bit stream at a constant bit rate from the encoder buffer 13, as has been explained with reference to FIG. 1. Consequently, the bit stream is inputted at a constant bit rate to the decoder buffer 27 of FIG. 2.
FIG. 3 shows an example of changes of the bit occupation amount of the buffers in the encoder system and the decoder system according to the VBV model. In FIG. 3, a straight line c-d divides the encoder system and the decoder system. That is, the change of the bit occupation amount in the decoder buffer is shown at the right side of the line c-d, and the change of the bit occupation amount in the encoder buffer is shown at the left side of the line c-d.
The two horizontal axes t represent elapsed time: the upper time axis represents elapsed time in the encoder system and the lower time axis represents elapsed time in the decoder system. For simplification, the line c-d is shared by the encoder system and the decoder system as if there were no time difference between them. However, there exists a certain transmission line delay time D0 between the encoder system and the decoder system. Consequently, the time at point c, which is the origin t=0 on the time axis of the encoder system, becomes t=D0 on the time axis of the decoder system. D0 includes the processing time of the multiplexer 14 in the encoder system, the transmission time, and the processing time of the demultiplexer 26 in the decoder system.
The vertical axis represents, in the encoder system, the accumulated bit amount of the bit stream outputted from the encoder buffer at a particular time, and in the decoder system, the accumulated bit amount of the bit stream inputted to the decoder buffer at a particular time.
The slope of the line c-d (Δd/Δt) can be viewed from the encoder system as a constant output bit rate R from the encoder buffer 13 and from the decoder system as a constant input rate R to the decoder buffer 27.
The vertical distance between the line c-d and the line e-f along the vertical axis represents the size B of the decoder buffer. The vertical distance between the line c-d and the line a-b represents the size B of the encoder buffer. B is a constant value. The size of the encoder buffer is always identical to the size of the decoder buffer.
The A(n) represents an n-th coded picture, and its size represents the bit amount of the coded picture. As shown in FIG. 4, each of the pictures is encoded as an I picture, P picture, or B picture. The I picture is encoded by using its own image signal alone. The P picture is motion-compensative-predicted from an I picture or P picture immediately before and the prediction residue is coded. The B picture is motion-compensative-predicted from an I picture or P picture immediately before and immediately after, and the prediction residue is coded. The bit amount of each coded picture A(n) changes according to the picture type I, P, or B and the picture content.
Referring back to FIG. 3, the ETS(n) represents a time to encode the coded picture A(n). The interval of pictures to be encoded (i.e., ETS(n+1) -ETS(n)) is, for example, 1/29.97 seconds in the NTSC video signal, and 1/25 seconds in the PAL video signal. The DTS(n) represents a time to decode the n-th coded picture A(n). The interval of pictures to be decoded (i.e., DTS(n+1)-DTS(n)) is identical to the interval of pictures to be encoded.
In the encoder system, the region below the stepped zigzag trace in the figure represents the change of the bit occupation amount in the encoder buffer. That is, the vertical distance from the time t on the line c-d to the stepped trace represents the bit occupation amount at time t. The vertical direction movement of the stepped trace represents an instantaneous input of a bit stream from the video encoder 12 to the encoder buffer 13, whereas the horizontal direction movement of the stepped trace indicates that no bit stream is inputted from the video encoder 12 to the encoder buffer 13 (no encoding is carried out), and the encoder buffer 13 outputs a bit stream at the bit rate R.
Description will now be directed to the change of the bit occupation amount in the encoder buffer of the encoder system.
Before time t=ETS(0), the bit occupation amount in the encoder buffer is zero. The data of the 0-th picture A(0) encoded at time t=ETS(0) is instantaneously inputted to the encoder buffer, which instantaneously increases the bit occupation amount of the encoder buffer by the bit amount of the 0-th coded picture A(0). The encoder buffer starts output of a bit stream at t=0. The start is specified by the start controller 19 in the encoder system shown in FIG. 1. The start delay do is calculated from the bit rate R and the bit occupation amount b0 of the decoder buffer at the decoding start as follows.
$$ETS(0) + d_o = 0$$
$$d_o = \frac{B - b_0}{R}$$
For the time duration from t=0 to the encoding time ETS(1) of the next 1-st picture A(1), a bit stream is outputted from the encoder buffer at the bit rate R, which decreases the bit occupation amount of the encoder buffer as the time lapses. At the encoding time ETS(1), the 1-st picture A(1) is encoded and supplied to the encoder buffer, which instantaneously increases the bit occupation amount of the encoder buffer by the bit amount of the 1-st picture A(1). For the time duration from t=ETS(1) to ETS(2), a bit stream is outputted from the encoder buffer at the bit rate R, which decreases the bit occupation amount of the encoder buffer as the time lapses. In the same manner, picture encoding is continued at a predetermined time interval.
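The encoder-side behavior described above can be sketched as follows. This is a minimal model under stated assumptions, not the encoder's actual implementation: the function name encoder_occupancy, the picture sizes, and the values of R, B, and b0 are all illustrative, and the model assumes the buffer always holds enough data to keep outputting at the rate R.

```python
from fractions import Fraction


def encoder_occupancy(sizes, rate, interval, start_delay):
    """Return Oe(n): encoder-buffer occupancy just before picture n is encoded.

    sizes       -- bit amounts A(0), A(1), ... of the coded pictures
    rate        -- constant output bit rate R (bits per second)
    interval    -- ETS(n+1) - ETS(n) in seconds
    start_delay -- do, so that ETS(0) = -do and output starts at t = 0
    """
    occupancy = []
    total_in = 0
    for n, size in enumerate(sizes):
        ets = -start_delay + n * interval
        drained = rate * max(ets, 0)      # nothing leaves the buffer before t = 0
        occupancy.append(total_in - drained)
        total_in += size                  # instantaneous input of A(n)
    return occupancy


# Illustrative values only (hypothetical, chosen for easy arithmetic).
R = 1000                       # bits per second
B, b0 = 600, 400               # buffer size and decoder start occupancy, in bits
do = Fraction(B - b0, R)       # start delay do = (B - b0) / R
Oe = encoder_occupancy([200, 50, 150, 100], R, Fraction(1, 10), do)
print([int(x) for x in Oe])
```

Exact fractions are used instead of floating point so that the occupancy values can be compared without rounding error.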
The bit occupation amount in the decoder buffer is changed according to the bit occupation amount change of the aforementioned encoder buffer. In the decoder system, the region above the stepped trace represents the bit occupation amount change. That is, the vertical distance from the time t on the line c-d to the stepped trace along the vertical axis represents the bit occupation amount of the decoder buffer at time t. The vertical direction movement of the stepped trace represents that the video decoder 28 instantaneously reads out a bit stream from the decoder buffer 27, whereas the horizontal direction movement of the stepped trace represents that the video decoder 28 reads out no bit stream from the decoder buffer 27 (no decoding is carried out) and a bit stream is inputted to this decoder buffer 27 at the bit rate R.
Description will now be directed to the change of the bit occupation amount of the decoder buffer in the decoder system.
At time t=D0, input of a bit stream to the decoder buffer at the bit rate R is started. When the time duration di, i.e.,
$$d_i = \frac{b_0}{R}$$
has elapsed, at time DTS(0), the 0-th picture A(0) is decoded.
This time duration di, or the time DTS(0), is specified by the bit stream received. By decoding the 0-th coded picture A(0) at time DTS(0), the bit occupation amount of the decoder buffer is instantaneously decreased by the bit amount of the 0-th coded picture A(0). Subsequently, during the following time duration up to the time DTS(1), a bit stream is inputted to the decoder buffer at the bit rate R, which increases the bit occupation amount of the decoder buffer as the time lapses. At time DTS(1), the 1-st coded picture A(1) is decoded, which instantaneously decreases the bit occupation amount of the decoder buffer by the bit amount of the 1-st coded picture A(1). In the same manner, picture decoding is continued at a predetermined time interval.
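The decoder-side behavior described above can be sketched in the same spirit. The function names decoder_occupancy and is_stable and all numeric values are hypothetical; the sketch only follows the idealized model in which each picture is removed instantaneously and the buffer is refilled at the rate R between decoding times.

```python
from fractions import Fraction


def decoder_occupancy(sizes, rate, interval, b0):
    """Return Od(n): decoder-buffer occupancy just before picture n is decoded."""
    occupancy = []
    level = b0                        # Od(0) = b0 after filling for di = b0 / R
    for size in sizes:
        occupancy.append(level)
        level -= size                 # instantaneous removal of A(n) at DTS(n)
        level += rate * interval      # refill at rate R until DTS(n+1)
    return occupancy


def is_stable(sizes, occupancy, buffer_size, rate, interval):
    """True if no picture underflows or overflows the decoder buffer."""
    for size, level in zip(sizes, occupancy):
        if size > level:                                   # underflow
            return False
        if level - size + rate * interval > buffer_size:   # overflow
            return False
    return True


# Illustrative values only.
sizes = [200, 50, 150, 100]
Od = decoder_occupancy(sizes, 1000, Fraction(1, 10), 400)
print([int(x) for x in Od], is_stable(sizes, Od, 600, 1000, Fraction(1, 10)))
```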
Here, T(i) represents the time interval (hereinafter, T will be referred to as a delay time) from the time ETS(i) when the i-th coded picture A(i) is encoded to the time DTS(i) when the i-th coded picture A(i) is decoded. That is,
$$T(i) = DTS(i) - ETS(i)$$
In order to carry out a stable image reproduction at the side of the decoder (reception) system, the aforementioned delay time T(i) should be a constant value for encoding/decoding of all the coded pictures. That is,
$$T = T(0) = T(1) = \cdots = T(n)$$
Consequently, as shown in FIG. 3, the trace of the bit occupation amount of the decoder buffer is identical to the trace of the bit occupation amount of the encoder buffer shifted rightward along the horizontal axis by the aforementioned delay time T.
Here, if it is assumed that B is the aforementioned buffer size, Oe(n) is the bit occupation amount of the encoder buffer immediately before encoding the n-th coded picture A(n), Ve(n) is the vacancy amount of the encoder buffer immediately before encoding the n-th coded picture A(n), Od(n) is the bit occupation amount of the decoder buffer immediately before decoding the n-th coded picture A(n), and Vd(n) is the vacancy amount of the decoder buffer immediately before decoding the n-th coded picture A(n), then the following relationships are satisfied.
$$V_e(n) = B - O_e(n)$$
$$V_d(n) = B - O_d(n)$$
$$O_e(n) = V_d(n)$$
$$V_e(n) = O_d(n)$$
$$B = O_e(n) + V_e(n) = O_d(n) + V_d(n) = O_e(n) + O_d(n)$$
That is, as shown in FIG. 5, control is carried out so that the sum of the bit occupation amount of the encoder buffer in the encoder system and the bit occupation amount of the VBV buffer (decoder buffer) of the decoder system be a constant value (corresponding to the buffer size B) via the aforementioned delay time T.
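The complementary relationship between the two buffers can be checked numerically with a small sketch. The closed-form expressions for Oe(n) and Od(n) below are derived from the idealized model described in the text and are assumptions of this sketch, as are the function name occupancies and all numeric values. In this simplified model the check is applied only to pictures encoded at or after t=0, that is, once the encoder buffer has started to output.

```python
from fractions import Fraction

# Illustrative values only.
R, B, b0 = 1000, 600, 400
INTERVAL = Fraction(1, 10)            # picture interval in seconds
DO = Fraction(B - b0, R)              # encoder start delay do = (B - b0) / R
SIZES = [200, 50, 150, 100]           # A(0)..A(3) in bits


def occupancies(n):
    """Return (ETS(n), Oe(n), Od(n)) for the illustrative stream above."""
    sent_in = sum(SIZES[:n])                      # bits of A(0)..A(n-1)
    ets = -DO + n * INTERVAL                      # encoding time on the encoder axis
    oe = sent_in - R * max(ets, 0)                # encoder occupancy before A(n)
    od = b0 + R * n * INTERVAL - sent_in          # decoder occupancy before A(n)
    return ets, oe, od


for n in range(len(SIZES)):
    ets, oe, od = occupancies(n)
    if ets >= 0:
        # Once output has started, the two occupancies add up to B.
        print(n, int(oe), int(od), oe + od == B)
```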
The delay time T is calculated by the following equation.
$$T = \tau_e(n) + \tau_d(n) + D_0 = \frac{O_e(n)}{R} + \frac{O_d(n)}{R} + D_0 = \frac{B}{R} + D_0$$
wherein D0 represents a transmission line delay amount (constant).
If τ is assumed to be the period of time required for bringing the bit occupation amount of the encoder buffer from the aforementioned buffer size B to 0 at the output bit rate R, or the period of time required for bringing the bit occupation amount of the decoder buffer from 0 to B at the input bit rate R, then the following relationships are satisfied.
$$\tau = \frac{B}{R} = \tau_e(n) + \tau_d(n) \quad \text{(constant)}$$
$$T = \tau + D_0 \quad \text{(constant)}$$
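As a worked check of the delay relationship, the sketch below computes T in two ways: directly as B/R + D0, and as the sum of the encoder start delay do, the transmission line delay D0, and the decoder start delay di. The decomposition T = do + D0 + di is inferred here from the definitions of do and di given earlier; the function names and numbers are illustrative.

```python
from fractions import Fraction


def end_to_end_delay(buffer_size, rate, line_delay):
    """T = B / R + D0, in seconds."""
    return Fraction(buffer_size, rate) + line_delay


def delay_from_parts(buffer_size, rate, b0, line_delay):
    """T as encoder start delay do + line delay D0 + decoder start delay di."""
    do = Fraction(buffer_size - b0, rate)   # do = (B - b0) / R
    di = Fraction(b0, rate)                 # di = b0 / R
    return do + line_delay + di


# Illustrative values: B = 600 bits, R = 1000 bits/s, b0 = 400 bits, D0 = 50 ms.
D0 = Fraction(1, 20)
print(end_to_end_delay(600, 1000, D0), delay_from_parts(600, 1000, 400, D0))
```

Both expressions give the same value regardless of b0, which reflects the statement that T is a constant determined only by B, R, and D0.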
Assuming this buffer model, the encoder system should carry out encoding and transmission while taking care not to overflow or underflow the decoder buffer of the decoder system. The decoder system can carry out stable picture decoding as long as the stepped trace of the decoder system stays between the line c-d and the line e-f without exceeding the buffer size B. If the stepped trace goes above the line c-d, the decoder buffer underflows, and if the stepped trace goes below the line e-f, the decoder buffer overflows.
The encoder system encodes a k-th picture A(k), assuming the bit occupation amount of the decoder buffer when the picture A(k) is decoded. Here, the k-th picture A(k) can generate a bit amount which satisfies the following condition.
[Equation 1]
R and B are constant when t > D0.
If k = 0,
$$O_d(0) = b_0 \qquad (1)$$
If k ≥ 1,
##EQU1##
$$O_d(k) + R \times (DTS(k+1) - DTS(k)) - B \le A(k) \le O_d(k) \qquad (3)$$
wherein A(k) ≥ 0.
In the encoder system of FIG. 1, the bit amount of the picture A(i) when i < k corresponds to the bit amount S21 generated from the aforementioned video encoder 12. The rate controller 15, assuming the bit amount S22 to be assigned for the k-th picture A(k), specifies a value of the bit amount (size of the picture A(k)) which satisfies the aforementioned equation (3). The encoder system carries out such a control so as not to overflow or underflow the decoder buffer of the decoder system.
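The constraint of equation (3) can be sketched as a small helper that a rate controller might use. This is not the rate controller 15 itself: the function names are hypothetical, and next_occupancy is a recurrence inferred from the prose description of the decoder buffer, used here as a stand-in because equation (2) is not reproduced in this text.

```python
from fractions import Fraction


def picture_size_bounds(od_k, rate, interval, buffer_size):
    """Return (lower, upper) bounds on A(k) according to inequality (3)."""
    lower = max(od_k + rate * interval - buffer_size, 0)   # also enforces A(k) >= 0
    upper = od_k                                           # more would underflow
    return lower, upper


def next_occupancy(od_k, a_k, rate, interval):
    """Assumed recurrence: Od(k+1) after decoding A(k) and refilling for one interval."""
    return od_k - a_k + rate * interval


# Illustrative values: Od(k) = 550 bits, R = 1000 bits/s, interval = 0.1 s, B = 600 bits.
dt = Fraction(1, 10)
low, high = picture_size_bounds(550, 1000, dt, 600)
print(int(low), int(high), int(next_occupancy(550, low, 1000, dt)))
```

Choosing A(k) at the lower bound leaves the decoder buffer exactly full just before the next picture is decoded, while choosing it at the upper bound empties the buffer at the moment of decoding.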
In the conventional technique as has thus far been described, there is no problem as long as the data transmission rate between the encoder system and the decoder system is a constant rate. However, if the data transmission is carried out at a variable bit rate, there arises a problem that the decoder system cannot obtain a stable picture reproduction. This problem will be detailed below with reference to FIG. 6.
Here, in the same way as the aforementioned conventional technique, the size of the encoder buffer of the encoder system is a constant and identical to the size B of the VBV buffer (decoder buffer) of the decoder system.
For transmission at a variable bit rate, for example, if the coding bit rate is changed from R1 to R2 for the n-th picture A(n) and after, then, in synchronization with this, the encoder system changes the output bit rate R from the encoder buffer from R1 to R2. This change of the output bit rate R from the encoder buffer is shown in FIG. 6 as a change of the slope from line e-f to line f-g at time t=ETS(n). That is, the aforementioned output bit rate R and the coding bit rates R1 and R2 are in the following relationships.
$$R = R_1 : \; 0 \le t < ETS(n)$$
$$R = R_2 : \; ETS(n) \le t$$
In this case, the encoder system, according to the aforementioned equations (1), (2), and (3), assumes that the possible trace of bit occupation amount of the VBV buffer of the decoder system is found between the broken line e-f-p-q and the broken line h-i-r, and the encoder system is assumed to have carried out an encoding so that the trace of the bit occupation amount of the encoder buffer is found between the broken line e-f-g and the broken line a-b-d.
In this case, as is clear from FIG. 6, there arises an underflow of the decoder buffer when decoding the n-th picture A(n).
This is because, in the case of FIG. 6, the time interval between the moment of encoding of a picture and the moment of decoding of the picture is changed when the output bit rate R is changed from the coding bit rate R1 to the bit rate R2. That is, if T1 is assumed to be the time interval while R=R1, and T2 is assumed to be the time interval while R=R2, then
$$B = O_e(1) + O_d(1) = O_e(n) + O_d(n)$$
(Oe(n)=0 in FIG. 9)
When R=R1:
$$T_1 = \frac{O_e(1)}{R_1} + \frac{O_d(1)}{R_1} + D_0 = \frac{B}{R_1} + D_0$$
When R=R2:
$$T_2 = \frac{O_e(n)}{R_2} + \frac{O_d(n)}{R_2} + D_0 = \frac{B}{R_2} + D_0$$
Note that D0 is a transmission line delay time (constant) and R1 > R2. Consequently, T1 < T2.
As can be understood from FIG. 6, in the decoder system, for the pictures A(0) to the (n-1)-th picture A(n-1), the time interval T1 between encoding a picture and decoding of the picture is unchanged, and a stable decoding can be carried out.
A problem arises when decoding the picture A(n). That is, the picture A(n) has not yet completely arrived at the decoder buffer at time t=(ETS(n)+T1), and an underflow is caused. At time t=(ETS(n)+T1) in FIG. 6, the trace X shows that the decoder buffer underflows when decoding the picture A(n). The moment of time when the picture A(n) can be correctly decoded is time t=DTS(n)=ETS(n)+T2. Thus, during the time interval F from t=(ETS(n)+T1) to t=DTS(n), decoding cannot be carried out normally and, accordingly, there arises a problem in image display. That is, for example, during this time interval, the picture A(n-1) which has been decoded immediately before continues to be displayed, i.e., the display is frozen (still).
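The length of the interval F follows from the two delay times as F = T2 - T1. The sketch below evaluates it for illustrative values; the function names and the numbers are assumptions made for the example and are not taken from FIG. 6.

```python
from fractions import Fraction


def delay(buffer_size, rate, line_delay=0):
    """Delay time T = B / R + D0 for a given bit rate."""
    return Fraction(buffer_size, rate) + line_delay


def freeze_interval(buffer_size, r1, r2, line_delay=0):
    """F = T2 - T1, the time during which A(n) cannot yet be decoded."""
    t1 = delay(buffer_size, r1, line_delay)   # delay while R = R1
    t2 = delay(buffer_size, r2, line_delay)   # delay while R = R2
    return t2 - t1


# Illustrative values: B = 600 bits, R1 = 1000 bits/s, R2 = 500 bits/s.
F = freeze_interval(600, 1000, 500)
print(F, float(F))
```

Because D0 appears in both T1 and T2, it cancels out, so in this model the freeze duration depends only on B, R1, and R2.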