1. Field of the Invention
This invention relates to an encoded signal transmission method and apparatus suitable for encoding a digital signal at a variable bit rate on a transmission side and transmitting the encoded signal at the variable bit rate to a receiving side.
2. Description of the Related Art
In order to realize transmission of moving picture signals with high quality as the next-generation television broadcasting, a project of digitizing moving picture signals has been under way. In this case, since directly digitizing moving picture signals generates a huge amount of data, encoding of data (i.e., information compression) is required for efficient transmission of the data on a limited transmission line.
Meanwhile, in general, moving pictures are not stationary and the pattern and movement on a screen vary temporally. In some cases, the pattern and movement greatly differ between the center and peripheral portions of a picture within the screen. Therefore, the amount of generated information in encoding by an encoder varies, depending on such nature of the picture. To send such information at a constant transmission bit rate, a transmission buffer is prepared at the final stage of an encoder system. That is, an encoded output with the varying amount of generated information is temporarily stored in the transmission buffer and is read out and outputted to a transmission line at a predetermined transmission bit rate.
FIG. 1 is a block circuit diagram showing a conventional encoder with a constant output bit rate (hereinafter referred to as encoder system). In this encoder system shown in FIG. 1, a transmission buffer (hereinafter referred to as encoder buffer) 13 is provided between a video encoder 12 supplied with a video input via a terminal 11, and a transmission line. Thus, control is performed so as to smooth fluctuations in the amount of generated bits in a short period of time from the video encoder 12 and to output a bit stream from the encoder buffer 13 at a constant bit rate.
Information including the amount of generated bits S21 of the encoded picture from the video encoder 12, a bit rate R from a terminal 16 and a decoder buffer size B from a terminal 10 is inputted to a rate controller 15. On the basis of a video buffering verifier (VBV) model as later described, the rate controller 15 calculates the amount of allocated bits S22 of a picture to be encoded next, without causing overflow or underflow of a decoder buffer of the size B provided on the side of a decoder system. The rate controller 15 then sends and designates the information of the amount of allocated bits S22 to the video encoder 12.
The encoder buffer 13 supplied with the video bit stream from the video encoder 12 has a code buffer of a size equal to at least the decoder buffer size B. The code buffer is generally included in the transmission buffer.
The bit stream outputted from the encoder buffer 13 is inputted to a multiplexer 14. Although not shown, an encoded bit stream of an audio signal is also inputted to the multiplexer 14. The multiplexer 14 performs system encoding and multiplexing of a plurality of input bit streams, and outputs multiplexed streams from a terminal 17.
The start of output of the bit stream from the encoder buffer is indicated by a start controller 19. In the configuration shown in FIG. 1, a switch 20 provided on the output side of the encoder buffer 13 is controlled by the start controller 19. The start time is calculated from information including the bit rate R and the bit occupancy quantity b0 at the start of decoding of the decoder buffer from a terminal 18, as later described.
FIG. 2 is a block diagram showing a conventional decoder (hereinafter referred to as decoder system). A multiplexed stream from a terminal 25 is inputted to a demultiplexer 26, and the video bit stream split by the demultiplexer 26 is stored in a receiving buffer (hereinafter referred to as decoder buffer) 27. The decoder buffer 27 is adapted for absorbing fluctuations in the amount of bits read out in a short period of time by a video decoder 28. The decoder system is passive to the bit stream transmitted thereto. Therefore, in order to enable the video decoder 28 to perform stable video reproduction, the encoder system must encode data carefully enough to prevent overflow or underflow of the decoder buffer 27.
As a moving picture encoding method, the Moving Picture Experts Group (MPEG) standards are known. MPEG is the name of the group that examines moving picture coding for storage under the International Organization for Standardization/International Electrotechnical Commission, Joint Technical Committee 1/Sub Committee 29 (ISO/IEC JTC1/SC29). The standards include ISO/IEC 11172 as the MPEG1 standard and ISO/IEC 13818 as the MPEG2 standard. Within these international standards, ISO/IEC 11172-1 and ISO/IEC 13818-1 are provided as standards for multimedia multiplexing, ISO/IEC 11172-2 and ISO/IEC 13818-2 as video standards, and ISO/IEC 11172-3 and ISO/IEC 13818-3 as audio standards.
The MPEG standards prescribe an ideal input/output model of the decoder buffer 27 of the decoder system, and prescribe that, on the assumption of this model (i.e., the ideal model of the decoder buffer), the encoder system should encode data carefully enough to prevent overflow or underflow of the decoder buffer 27. The input/output model of the decoder buffer 27 of the decoder system is described in ISO/IEC 11172-2 Annex C and ISO/IEC 13818-2 Annex C as a video buffering verifier (VBV) model. A buffer of the VBV model is referred to as a VBV buffer.
The VBV buffer size of the decoder system is indicated by an identifier "vbv_buffer_size" in the MPEG bit stream. The standard size is 1.75 Mbits, for example, in a main profile at main level (MP@ML).
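The relation between the identifier and the size in bits can be checked with a short calculation. The following is a minimal sketch in Python, assuming the unit of 16 × 1024 bits that ISO/IEC 13818-2 assigns to this field and reading "Mbits" as multiples of 2^20 bits; the function name and the example field value 112 are illustrative and do not appear in the text above.

```python
VBV_UNIT_BITS = 16 * 1024  # assumed unit of the vbv_buffer_size field, in bits


def vbv_buffer_bits(vbv_buffer_size):
    """Convert the coded vbv_buffer_size value into a buffer size B in bits."""
    return VBV_UNIT_BITS * vbv_buffer_size


B = vbv_buffer_bits(112)       # 112 is an illustrative field value
print(B, B / 2**20)            # size in bits and in units of 2**20 bits
```

With this field value the buffer size B corresponds to the 1.75 Mbits quoted for MP@ML.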
The VBV of the decoder system is assumed to operate under the following ideal condition.
(1) The bit stream for each picture is outputted instantaneously from the decoder buffer, and each picture is decoded instantaneously.
Under this condition, when the bit stream is transmitted from the encoder system to the decoder system on a real-time basis, the transmission buffer (encoder buffer) of the encoder system must operate under the following ideal condition.
(2) Each picture is encoded instantaneously, and the bit stream for each picture is inputted instantaneously to the encoder buffer.
The VBV model in the case where the encoder system and the decoder system operate on the real-time basis via a transmission line as in broadcast or communication will now be described. In the encoder system, as shown in FIG. 1, the bit stream is outputted from the encoder buffer 13 at a constant bit rate. Therefore, the bit stream is inputted to the decoder buffer 27 shown in FIG. 2 at a constant bit rate.
FIG. 3 shows an example of changes in the bit occupancy quantity of the buffers of the encoder system and the decoder system in conformity with the VBV model. In FIG. 3, the area on the right side of a line c-d shows changes in the bit occupancy quantity of the decoder buffer, and the area on the left side of the line c-d shows changes in the bit occupancy quantity of the encoder buffer.
The horizontal axis t expresses the lapse of time. In this case, two time bases are drawn, with an upper time base expressing the lapse of time on the side of the encoder system and a lower time base expressing the lapse of time on the side of the decoder system. In FIG. 3, the line c-d is shared by the encoder system and the decoder system for simplification, and there is no time difference between the encoder system and the decoder system. Actually, however, a constant transmission line delay time D0 exists between the encoder system and the decoder system. Therefore, the time of a point c is the origin t=0 on the time base of the encoder system while it is t=D0 on the time base of the decoder system. The time D0 includes the processing time of the multiplexer 14 of the encoder system, the transmission time, and the processing time of the demultiplexer 26 of the decoder system.
The vertical axis expresses the cumulative value of the amount of bits of the bit stream outputted from the encoder buffer up to a certain time point on the side of the encoder system, and the cumulative value of the amount of bits of the bit stream inputted to the decoder buffer up to a certain time point on the side of the decoder system.
The slope (Δd/Δt) of the line c-d expresses the constant output bit rate R from the encoder buffer 13, when viewed from the side of the encoder system, and the constant input bit rate R to the decoder buffer 27, when viewed from the side of the decoder system.
The width between the line c-d and a line e-f in the direction of vertical axis expresses the size B of the decoder buffer, with B being constant. The width between the line c-d and a line a-b in the direction of vertical axis expresses the size B of the encoder buffer, with B being constant. The buffer sizes in the encoder system and the decoder system are constantly equal to each other.
A(n) expresses the n-th encoded picture, and its size expresses the amount of bits of the encoded picture. Each picture is encoded as any one of an I-picture, a P-picture and a B-picture, as shown in FIG. 4. An I-picture is encoded by intra-coding, that is, by using picture signals of itself alone. A P-picture is motion-compensated from the I-picture or P-picture immediately before it, and a prediction residue thereof is encoded. A B-picture is motion-compensated from the I-pictures or P-pictures before and after the B-picture, and a prediction residue thereof is encoded. The amount of bits of each encoded picture A(n) varies depending on the picture type (I, P or B) and the picture pattern.
ETS(n) expresses the time point at which the n-th encoded picture A(n) is encoded. The interval between pictures to be encoded (i.e., ETS(n+1)-ETS(n)) is 1/29.97 seconds for video signals of the NTSC system, and 1/25 seconds for video signals of the PAL system. DTS(n) expresses the time point at which the n-th encoded picture A(n) is decoded. The interval between pictures to be decoded (i.e., DTS(n+1)-DTS(n)) is equal to the interval between pictures to be encoded.
On the side of the encoder system, the area on the lower side of a zigzag step-like locus in FIG. 3 expresses changes in the bit occupancy quantity of the encoder buffer. That is, the distance in the direction of vertical axis from a point at the time point t on the line c-d to the step-like locus expresses the bit occupancy quantity at the time point t. The movement of the step-like locus in the direction of vertical axis indicates that the bit stream is inputted instantaneously from the video encoder 12 to the encoder buffer 13. The movement of the step-like locus in the direction of horizontal axis indicates that the bit stream input from the video encoder 12 to the encoder buffer 13 is stopped (i.e., encoding is stopped) while the bit stream is outputted from the encoder buffer 13 at the bit rate R.
With respect to the encoder system, changes in the bit occupancy quantity of the encoder buffer are hereinafter explained.
The bit occupancy quantity of the encoder buffer is zero before a time point t=ETS(0). Data of the 0th picture A(0) encoded at the time point t=ETS(0) is instantaneously inputted to the encoder buffer, and thus the bit occupancy quantity of the encoder buffer is instantaneously increased by the amount of bits of the 0th encoded picture A(0). The output of the bit stream from the encoder buffer starts at t=0. This start is indicated by the start controller 19 of the encoder system shown in FIG. 1. The start time do is calculated as follows from the bit rate R and the bit occupancy quantity b0 at the start of decoding of the decoder buffer:
ETS(0) + do = 0
do = (B - b0)/R
From t=0 to the encoding time point ETS(1) of the first picture A(1) next to the 0th picture A(0), since the bit stream is outputted from the encoder buffer at the bit rate R, the bit occupancy quantity of the encoder buffer decreases with the lapse of time. At the encoding time ETS(1), since the first picture A(1) is encoded and supplied to the encoder buffer, the bit occupancy quantity of the encoder buffer is instantaneously increased by the amount of bits of the first picture A(1). From t=ETS(1) to ETS(2), since the bit stream is outputted from the encoder buffer at the bit rate R, the bit occupancy quantity of the encoder buffer decreases with the lapse of time. Similarly, encoding of pictures is continued at a predetermined interval.
Changes in the bit occupancy quantity of the decoder buffer depend on the above-described changes in the bit occupancy quantity of the encoder buffer. On the side of the decoder system, the area on the upper side of the step-like locus expresses changes in the bit occupancy quantity of the decoder buffer. That is, the distance in the direction of vertical axis from a point at the time point t on the line c-d to the step-like locus expresses the bit occupancy quantity of the decoder buffer at the time point t. The movement of the step-like locus in the direction of vertical axis indicates that the video decoder 28 instantaneously reads out the bit stream from the decoder buffer 27. The movement of the step-like locus in the direction of horizontal axis indicates that read-out of the bit stream from the decoder buffer 27 by the video decoder 28 is stopped (i.e., decoding is stopped) while the bit stream is inputted to the decoder buffer 27 at the bit rate R.
With respect to the decoder system, changes in the bit occupancy quantity of the decoder buffer are hereinafter explained.
Input of the bit stream to the decoder buffer starts at t=D0 at the bit rate R. The 0th encoded picture A(0) is decoded at a time point DTS(0) after the lapse of the time period di, that is,
di = b0/R
The time period di or the time point DTS(0) is indicated in the received bit stream. The bit occupancy quantity of the decoder buffer is instantaneously decreased by the amount of bits of the 0th encoded picture A(0) by decoding of the 0th picture A(0) at the time point DTS(0). Subsequently, until the next time point DTS(1), since the bit stream is inputted to the decoder buffer at the bit rate R, the bit occupancy quantity of the decoder buffer increases with the lapse of time. At the time point DTS(1), since the first encoded picture A(1) is decoded, the bit occupancy quantity of the decoder buffer is instantaneously decreased by the amount of bits of the first encoded picture A(1). Similarly, decoding of each picture is continued at a predetermined decoding time interval.
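The start-up timing and the decoder buffer walk-through above can be traced numerically. The following is a minimal sketch in Python, assuming illustrative values for B, b0, R and the picture sizes (none of which come from the text) and assuming a 25 pictures-per-second interval so that the bits arriving per picture interval form an integer; the function names are hypothetical.

```python
def start_delays(B, b0, R):
    """Return (do, di) in seconds, with do = (B - b0)/R and di = b0/R."""
    return (B - b0) / R, b0 / R


def decoder_occupancy(picture_bits, bits_per_interval, b0):
    """Decoder buffer occupancy immediately before each picture is decoded."""
    occupancy = []
    level = b0  # occupancy at DTS(0), the start of decoding
    for bits in picture_bits:
        occupancy.append(level)
        level -= bits               # picture is read out instantaneously
        level += bits_per_interval  # buffer refills at rate R until the next DTS
    return occupancy


# Illustrative values only: sizes in bits, rate in bits per second.
B, b0, R = 1_600_000, 1_200_000, 4_000_000
do, di = start_delays(B, b0, R)
levels = decoder_occupancy([400_000, 100_000, 60_000, 100_000], R // 25, b0)
print(do, di, levels)
```

In this sketch do + di equals B/R, and every value in the occupancy list stays between 0 and B, which is the situation in which the decoder buffer neither underflows nor overflows.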
T(i) is a time interval from a time point ETS(i) at which the i-th encoded picture A(i) is encoded to a time point DTS(i) at which the encoded picture A(i) is decoded. That is,
T(i) = DTS(i) - ETS(i)
T is hereinafter referred to as delay time.
In order to perform stable picture reproduction on the side of the decoder system (i.e., receiving side), the delay time T(i) must be constant for encoding/decoding of all the encoded pictures. That is,
T = T(0) = T(1) = ... = T(n)
Thus, the locus of the bit occupancy quantity of the decoder buffer is the locus of the bit occupancy quantity of the encoder buffer shifted later in time (i.e., translated horizontally to the right) by the delay time T, as shown in FIG. 3.
On the assumption that B represents the buffer size, Oe(n) represents the bit occupancy quantity of the encoder buffer immediately before the n-th encoded picture A(n) is encoded, Ve(n) represents the quantity of free space of the encoder buffer immediately before the n-th encoded picture A(n) is encoded, Od(n) represents the bit occupancy quantity of the decoder buffer immediately before the n-th encoded picture A(n) is decoded, and Vd(n) represents the quantity of free space of the decoder buffer immediately before the n-th encoded picture A(n) is decoded, the following relations are obtained.
Ve(n) = B - Oe(n)
Vd(n) = B - Od(n)
Oe(n) = Vd(n)
Ve(n) = Od(n)
B = Oe(n) + Ve(n) = Od(n) + Vd(n) = Oe(n) + Od(n)
Specifically, the sum of the bit occupancy quantity of the encoder buffer of the encoder system and the bit occupancy quantity of the VBV buffer (decoder buffer) of the decoder system is controlled to be a constant value (a value corresponding to the buffer size B) via the delay time T, as shown in FIG. 5.
The delay time T is calculated by the following equation,
T = Oe(n)/R + Od(n)/R + D0 = B/R + D0
where D0 represents the amount of transmission line delay (constant).
On the assumption that τ represents the time required for changing the bit occupancy quantity of the encoder buffer from the buffer size B to 0 when the output bit rate is R, or the time required for changing the bit occupancy quantity of the decoder buffer from 0 to B when the input bit rate is R, the following relations are obtained.
τ = B/R = τe(n) + τd(n) (constant)
T = τ + D0 (constant)
On the assumption of this buffer model, the encoder system must encode and transmit data carefully enough to prevent overflow or underflow of the decoder buffer of the decoder system. If the step-like locus on the side of the decoder system is between the line c-d and the line e-f so as not to exceed the buffer size B, the decoder system can stably decode pictures. On the contrary, if the step-like locus is above the line c-d, underflow is generated in the decoder buffer. If the step-like locus is below the line e-f, overflow is generated in the decoder buffer.
When the k-th picture A(k) is to be encoded, the encoder system encodes the picture A(k), assuming the state of the bit occupancy quantity of the decoder buffer at the time when the picture A(k) is decoded. At this point, the amount of generated bits of the k-th picture A(k) must meet the following conditions.
R and B are constant for t ≥ D0.
When k=0,
Od(0) = b0 (1)
When k ≥ 1,
Od(k) = b0 + R × (DTS(k) - DTS(0)) - Σ_{i=0}^{k-1} A(i) (2)
Od(k) + R × (DTS(k+1) - DTS(k)) - B ≤ A(k) ≤ Od(k) (3)
where A(k) > 0
In the encoder system of FIG. 1, the amount of bits of the picture A(i) for i<k corresponds to the amount of generated bits S21 from the video encoder 12. The rate controller 15 indicates a value of the amount of bits (size of the picture A(k)) satisfying equation (3), as the amount of allocated bits S22 of the k-th picture A(k). By conducting such control, the encoder system encodes data without causing overflow or underflow of the decoder buffer of the decoder system.
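The bound of equation (3) can be illustrated as a clamping step. The following is a minimal sketch in Python, not the rate controller 15 itself: the function names, the notion of a "target" allocation, and the numeric values are assumptions introduced for illustration, and bits_per_interval stands for R × (DTS(k+1) - DTS(k)).

```python
def allocation_bounds(od_k, bits_per_interval, B):
    """Return (lower, upper) limits on A(k) according to equation (3).

    lower keeps the decoder buffer from overflowing before DTS(k+1);
    upper keeps it from underflowing at DTS(k).
    """
    lower = od_k + bits_per_interval - B
    upper = od_k
    # A(k) must additionally be positive, so a negative lower limit is
    # raised to 0 here (the strict inequality is left to the caller).
    return max(lower, 0), upper


def clamp_allocation(target_bits, od_k, bits_per_interval, B):
    """Clamp a hypothetical target allocation into the permitted range."""
    lower, upper = allocation_bounds(od_k, bits_per_interval, B)
    return min(max(target_bits, lower), upper)


# Illustrative values only (bits).
print(allocation_bounds(1_500_000, 160_000, 1_600_000))
print(clamp_allocation(20_000, 1_500_000, 160_000, 1_600_000))
```

When the decoder buffer is nearly full, the lower limit becomes positive: the picture must consume enough bits to make room for the data that arrives before the next decoding time.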
In the above-described conventional technique, there is no problem in the case where the data transfer rate between the encoder system and the decoder system is a constant rate. However, when data is transmitted at a variable bit rate, a problem may arise such that the picture cannot be stably reproduced on the side of the decoder system. An example of such problem is described with reference to FIG. 6.
In this case, similar to the above-described conventional technique, the size of the encoder buffer of the encoder system is constant and equal to the size B of the VBV buffer (decoder buffer) of the decoder system.
In the encoder system in the case where data are transmitted at a variable bit rate, for example, the encoding bit rate is altered from R1 to R2 starting with the encoding of the n-th picture A(n), and in synchronization with this alteration, the output bit rate R from the encoder buffer is altered from R1 to R2. In FIG. 6, the output bit rate R from the encoder buffer is shown with the slope changed at a time point t=ETS(n), which is a junction point between a line e-f and a line f-g. Specifically, the relation between the output bit rate R and the encoding bit rates R1, R2 is as follows.
R = R1: 0 ≤ t < ETS(n)
R = R2: ETS(n) ≤ t
where R1 > R2
In this case, the encoder system assumes that an area where the locus of the bit occupancy quantity of the VBV buffer of the decoder system may possibly pass is between a bent line e-f-p-q and a bent line h-i-r in accordance with the relations of the equations (1), (2) and (3). It is now assumed that the encoder system has encoded data in such a manner that the locus of the bit occupancy quantity of the encoder buffer is between a bent line e-f-g and a bent line a-b-d, as shown in FIG. 6.
In this case, as clear from FIG. 6, the problem of underflow is generated in the decoder buffer when the n-th picture A(n) is decoded.
This is because, in the example of FIG. 6, the time interval from a time point of encoding a picture to a time point of decoding the picture differs between the case where the output bit rate R is equal to the encoding bit rate R1 and the case where the output bit rate R is equal to R2. That is, if the time interval for R=R1 is expressed by T1 while the time interval for R=R2 is expressed by T2, the following relations are obtained.
When R=R1,
T1 = Oe(1)/R1 + Od(1)/R1 + D0 = B/R1 + D0
When R=R2,
T2 = Oe(n)/R2 + Od(n)/R2 + D0 = B/R2 + D0
Since D0 is the transmission line delay time, which is constant, and R1 > R2, the relation T1 < T2 is obtained.
As seen from FIG. 6, on the side of the decoder system, with respect to the 0th picture A(0) to the (n-1)th picture A(n-1), the time interval from encoding of the picture to the decoding of the picture is constantly T1 so that the picture may be stably decoded.
However, a problem arises when the picture A(n) is decoded. Specifically, when the picture A(n) is to be decoded, since the picture A(n) has not completely reached the decoder buffer at a time point t=(ETS(n)+T1), the problem of underflow occurs. From the locus of X in FIG. 6 in the case where the picture A(n) is to be decoded at the time point t=(ETS(n)+T1), it is understood that underflow is generated in the decoder buffer. The picture A(n) can be correctly decoded at a time point t=DTS(n)=ETS(n)+T2. Therefore, the decoder system cannot perform normal decoding during a period from the time point t=(ETS(n)+T1) to the time point t=DTS(n) as indicated by F in FIG. 6, thus causing a problem in picture display. Specifically, since display of the picture A(n-1), the last picture decoded, is continued during this period, the problem of a frozen (stationary) display occurs.
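The length of the period F follows directly from the two delay times, since the period runs from ETS(n)+T1 to ETS(n)+T2. The following is a minimal sketch in Python with illustrative values for B, D0, R1 and R2 that do not come from the text; the function name is hypothetical.

```python
def delay_time(B, R, D0):
    """Encode-to-decode delay for buffer size B, bit rate R and line delay D0."""
    return B / R + D0


# Illustrative values only: bits, bits per second, seconds.
B, D0 = 1_600_000, 0.05
R1, R2 = 4_000_000, 2_000_000   # R1 > R2, as in the example of FIG. 6

T1 = delay_time(B, R1, D0)
T2 = delay_time(B, R2, D0)
freeze = T2 - T1                # length of the period F: B/R2 - B/R1
print(T1, T2, freeze)
```

Because the constant D0 cancels, the length of the period F depends only on the buffer size and the two bit rates.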
In view of the above-described problems, it is an object of the present invention to provide an encoded signal transmission method and apparatus for conducting control so as to prevent overflow or underflow of the decoder buffer of the decoder system in the case where digital signals are encoded on the side of the encoder system (i.e., transmission side) at a variable bit rate and are transmitted to the decoder system (i.e., receiving side) at a variable bit rate.