1. Field of the Invention
This invention relates to a method and apparatus for communicating MPEG images from an encoder to a decoder, and in particular to the management of the encoder buffer to provide efficient data transfer while also precluding an overflow or underflow of the decoding buffer. Of specific interest is the management of the encoder buffer in the region of an MPEG Splice Point, a point in the stream of images at which an alternative encoder or decoder may be switched into the stream without introducing visually disturbing artifacts.
2. Description of Related Art
The MPEG standard defines a data format for the encoding of sequential video images in a compressed format, with sufficient timing information to allow for these images to be decoded and presented for viewing in the same order, and at the same rate, as the original, unencoded visual images. The sequential video images are comprised of frames, each frame typically being encoded at fixed intervals of time, and subsequently decoded and displayed at the same fixed interval of time, but delayed in time relative to the time of encoding.
In the art, the term "field" is also used to refer to each sequentially encoded image, often with regard to images intended to be displayed in an interlaced form. Similarly, the term "picture" is also used to refer to each encoded image. For simplicity, the term "frame" is used generically herein to refer to each encoded image. Similarly, the term MPEG is used generically herein, referring both to the formal data format specification, as well as the existing body of knowledge derived from this specification and its implementation, known to those skilled in the art.
Each MPEG frame may differ in size, each having a degree of compression dependent upon the image content and the content of other frames. To effect a constant encode and decode rate with varying sized frames, data buffers are provided at the encoder and at the decoder. Images are encoded into frames at a fixed frame rate and stored in the encoder buffer; the bits comprising the frames are communicated from the encoder buffer to the decoder buffer at a bit rate which is substantially independent of the frame rate, and usually constant; and the frames are unloaded from the decoder buffer at the fixed frame rate. The number of bits which a buffer can hold at any one time is termed the buffer size; the number of bits which a buffer is actually holding at a given time is termed the buffer occupancy. The use of buffers to allow variable sized frames to be transmitted continually and subsequently decoded at a constant frame rate provides for an optimized information transfer.
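The buffering described above can be illustrated with a minimal sketch of the encoder buffer occupancy, assuming a constant transfer rate and a fixed frame rate. The function name, the parameter names, and the numeric values are hypothetical and chosen only for illustration. The sketch also assumes that the channel simply idles when the buffer empties, which is a simplification.

```python
def encoder_occupancy_trace(frame_bits, rate_bps, frame_rate_hz):
    """Return the encoder buffer occupancy (bits) just after each frame is stored.

    Assumptions (illustrative only): constant transfer rate, fixed frame rate,
    and a channel that idles rather than zero-fills when the buffer is empty.
    """
    drained_per_frame = rate_bps / frame_rate_hz  # bits unloaded per frame period
    occupancy = 0.0
    trace = []
    for bits in frame_bits:
        occupancy += bits            # variable-sized frame enters the buffer
        trace.append(occupancy)      # occupancy just after encoding this frame
        # Bits leave at the transfer rate until the next frame is encoded.
        occupancy = max(occupancy - drained_per_frame, 0.0)
    return trace


# Hypothetical example: four frames of differing size, 4 Mbit/s, 25 frames/s.
trace = encoder_occupancy_trace([300_000, 100_000, 50_000, 250_000], 4_000_000, 25)
```

Frames enter the buffer at the frame rate while bits leave at the transfer rate, so the occupancy rises and falls with the frame sizes even though both rates are fixed.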
If the occupancy of the encoder buffer is controlled to within specified encoder bounds, it can be shown that the occupancy of the decoder buffer will necessarily be within a given set of decoder bounds. To preclude an underflow or overflow at the decoder, the decoder occupancy must remain within the bounds of zero and the decoder buffer size (Bd), respectively. The encoder bounds required to assure these decoder bounds are:
$$EUB(t) = \int_{t}^{t+\Delta} r(\tau)\,d\tau \qquad (1)$$
and
$$ELB(t) = EUB(t) - Bd \qquad (2)$$
where EUB(t) is the upper bound of the encoder buffer occupancy, and ELB(t) is the lower bound of the encoder buffer occupancy, at time t. Delta (Δ) is the time between the encoding of a frame and its subsequent decoding, and, to maintain a constant display rate, is constant for a given encoder-decoder system. The transfer rate, r, may be variable or constant. For a constant transfer rate system having a transfer rate R, EUB(t)=R*Δ, and ELB(t)=R*Δ-Bd. These bounds 201, 202 are shown in FIG. 2. As each frame is encoded, the encoding is either zero-filled or truncated so that the resultant encoder buffer occupancy lies within these bounds.
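For the constant transfer rate case, the bounds and the zero-fill or truncate decision can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names and the numeric values are hypothetical, and the occupancy passed to `fit_frame` is assumed to be the occupancy just before the new frame is stored.

```python
def encoder_bounds(rate_bps, delta_s, decoder_buffer_bits):
    """Return (ELB, EUB) in bits for a constant transfer rate R.

    EUB = R * delta and ELB = R * delta - Bd, as stated in the text.
    """
    upper = rate_bps * delta_s
    lower = upper - decoder_buffer_bits
    return lower, upper


def fit_frame(occupancy_before, frame_bits, lower, upper):
    """Return (adjusted_frame_bits, action) so the occupancy stays in bounds.

    Hypothetical helper: truncates a frame that would exceed the upper bound
    and zero-fills a frame that would leave the occupancy below the lower bound.
    """
    after = occupancy_before + frame_bits
    if after > upper:
        return upper - occupancy_before, "truncate"
    if after < lower:
        return lower - occupancy_before, "zero-fill"
    return frame_bits, "keep"


# Hypothetical values: R = 4 Mbit/s, delta = 0.5 s, Bd = 1,835,008 bits.
lower, upper = encoder_bounds(4_000_000, 0.5, 1_835_008)
too_big = fit_frame(1_900_000, 300_000, lower, upper)    # exceeds the upper bound
too_small = fit_frame(100_000, 20_000, lower, upper)     # falls below the lower bound
```

With these values the upper bound is 2,000,000 bits and the lower bound is 164,992 bits, so the first frame is truncated and the second is zero-filled.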
Note that these bounds are not specific to MPEG. MPEG is used herein to refer to a particular standard utilized to communicate a series of video images. As will be evident to one skilled in the art, however, the principles and techniques discussed herein are applicable to the switching of streams of any format via a system comprising an encoder and decoder, each having a buffer. Similarly, although the examples contained herein refer to frames of video images, the principles and techniques discussed herein are equally applicable to frames of audio passages, data packets, and the like.
The MPEG standard defines Seamless Splice Points, wherein the input to a decoder may be switched from a stream of frames from one encoder to a stream of frames from another encoder without introducing visual artifacts, such as incomplete frames, in the decoded image. The standard also requires that underflow and overflow of the decoder buffer be precluded, independent of whether the switch actually occurs. That is, the encoder buffer bounds must be such that, regardless of whether this encoder's stream continues or another encoder's stream is switched in, the decoder buffer will not overflow or underflow.
FIG. 1 shows a communications system comprising a decoder 150, and multiple encoders 110, 120, 130. Switch 140 selects one of the encoders to be connected to the decoder 150, thereby providing a source encoder of the subsequent stream of frames to the decoder. Each MPEG stream contains a suitable marking of the points in the stream whereat the switch 140 may effect the selection or deselection of the associated encoder as the source encoder. These marked points in the stream are termed splice points. The MPEG standard defines two parameters to form a seamless splice point, a Splice Decode Delay (SDD), and a Maximum Splicepoint Rate (MSR). These parameters correspond to a given minimum decoder buffer size, such that, if the encoders conform to these parameters, the decoder buffer of this minimum buffer size is assured not to overflow. The minimum buffer size is specified to be greater than MSR*SDD. The SDD is the time between the splice time (T_sp) and the time of decoding (T_D) the first frame after the splice point. The MSR is the maximum transfer rate at which an encoder may operate at the splice point, and for a period SDD after the splice point.
To allow for seamless switching among encoders, the end of a frame from a first source encoder must occur at the start of a frame from the newly selected second source encoder. For a given encoder-decoder system, having a constant encode-decode delay of Δ, the required encode-to-splice delay E is thus seen to be equal to Δ-SDD. To assure the appropriate encode-to-splice delay, the occupancy bounds of each encoder must be limited, so that the last bit of the frame just prior to the splice point, and the first bit of the frame just after the splice point, are switched at the appropriate time. As shown in copending U.S. patent application Ser. No. 08/829,124, in a constant bit rate system, with transfer rate R, the encoder occupancy at the time just prior to the encoding of the first frame after a splice point must be equal to the rate R times E. This is shown at 205 in FIG. 2b. Because the bits prior to the encoding of the last frame are unloaded from the buffer at the same rate R, in order to assure that the buffer occupancy is low enough to allow the occupancy to be at R*E at time Te, the frames prior to this splice point must be limited, as shown by the upper bound line segment 211 in FIG. 2b. That is, line segment 211 slopes at the rate -R, where R represents the rate at which the bits are unloaded from the encoder buffer. Because the frame just prior to the splice point must result in the occupancy R*E 205 at the splice point, and it is unloaded from the encoder buffer at a rate of R, it must have an occupancy which is equal to R*(F+E) 206, where F is the frame period, the time between the encoding of each frame.
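The pre-splice upper bound described above can be sketched for the constant bit rate case. This is one reading of the description rather than a definitive formula: the sloped segment is modeled as R*(E + time remaining until Te), capped by the normal bound R*Δ, where Te is taken to be the time just prior to the encoding of the first frame after the splice point. The function names, the parameter names, and the numeric values are hypothetical.

```python
def encode_to_splice_delay(delta_s, sdd_s):
    """E = delta - SDD, the required encode-to-splice delay."""
    return delta_s - sdd_s


def pre_splice_upper_bound(rate_bps, delta_s, sdd_s, time_to_splice_s):
    """Upper bound (bits) on encoder occupancy a given time before Te.

    Assumed model: a line of slope -R that reaches R*E at Te, capped by the
    normal constant-rate upper bound R*delta.
    """
    e = encode_to_splice_delay(delta_s, sdd_s)
    sloped = rate_bps * (e + time_to_splice_s)
    return min(rate_bps * delta_s, sloped)


# Hypothetical values: R = 4 Mbit/s, delta = 0.5 s, SDD = 0.25 s, F = 0.04 s.
R, DELTA, SDD, F = 4_000_000, 0.5, 0.25, 0.04
at_splice = pre_splice_upper_bound(R, DELTA, SDD, 0.0)    # R*E
last_frame = pre_splice_upper_bound(R, DELTA, SDD, F)     # R*(F+E)
far_before = pre_splice_upper_bound(R, DELTA, SDD, 0.5)   # capped at R*delta
```

With these values the bound is R*E at the splice point, R*(F+E) one frame period earlier, and the normal R*Δ limit once the time remaining until Te exceeds SDD.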
The encoder transfer rate is also the rate at which the decoder buffer is loaded. As discussed in the aforementioned copending U.S. patent application Ser. No. 08/829,124, the decoder buffer may contain residual frames from one encoder while receiving frames from another encoder at a different rate. To assure that the decoder buffer does not overflow, each encoder must conform to the aforementioned MPEG specification, and each encoder's buffer bounds must be adjusted to accommodate the fact that the other encoder may be operating at a different rate. If an encoder's rate is equal to the maximum allowed rate, MSR, no adjustment is necessary. If an encoder's rate is greater than MSR, then it must be decreased to conform to the MPEG specification, with the resultant decrease in the buffer bounds, consistent with equations 1 and 2, above, shown as line segments 221, 222, 223, and 224 in FIG. 2c. If an encoder's rate is less than MSR, then it must assume a possible increase to MSR, via the other encoder after the splice point, with the resultant increase in the bounds, consistent with equations 1 and 2, above, shown as line segments 231 and 232 in FIG. 2d. It should be noted that the line segment 211 shown in FIG. 2b is lower than segments 221 and 231 in FIGS. 2c and 2d, and thus line segment 211 forms the actual upper bound for the encoder occupancy.
Conventionally, the MSR is selected in direct proportion to the minimum decoder buffer size, Bd, also specified by MPEG. To prevent buffer overflow, the MSR must be such that MSR*SDD <= Bd. To maximize the allowable transfer rate, MSR is selected such that MSR*SDD=Bd. FIG. 2e shows the resultant encoder buffer bounds for an idealized MPEG splice point, wherein MSR*SDD=Bd, and the encoder rate, R, is equal to this maximized MSR. As can be seen, in the idealized case, the upper bound 211 before the splice point is severely limiting; the upper bound after the splice point returns to its normal R*Δ limit, and the lower bound remains predominantly level.
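The relationship among MSR, SDD, and Bd stated above can be sketched as a small calculation. The function names and the numeric values are hypothetical and serve only to illustrate the overflow condition and the idealized case in which MSR*SDD equals Bd.

```python
def max_splice_rate(decoder_buffer_bits, sdd_s):
    """Largest MSR (bits/s) that satisfies MSR * SDD <= Bd."""
    if sdd_s <= 0:
        raise ValueError("SDD must be positive")
    return decoder_buffer_bits / sdd_s


def splice_fits_decoder(msr_bps, sdd_s, decoder_buffer_bits):
    """True when the bits loaded during SDD at rate MSR fit within Bd."""
    return msr_bps * sdd_s <= decoder_buffer_bits


# Hypothetical values: Bd = 1,835,008 bits, SDD = 0.25 s.
BD, SDD = 1_835_008, 0.25
msr = max_splice_rate(BD, SDD)                    # idealized case, MSR*SDD = Bd
ok = splice_fits_decoder(msr, SDD, BD)            # exactly fills the buffer
too_fast = splice_fits_decoder(8_000_000, SDD, BD)  # would overflow
```

For these values the idealized MSR is 7,340,032 bits per second; any higher rate sustained for the SDD period would load more than Bd bits into the decoder buffer.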
Thus, as can be seen, the MPEG definition of seamless splice points necessitates a more stringent set of encoder occupancy bounds. Each time that the encoding of a frame must be modified, by zero-filling or truncation to conform to the lower or upper bounds respectively, an inefficiency and/or loss of quality will result. Truncating a frame to conform to an upper bound requires the elimination of detail in the encoded frame. Transmitting zero-filled frames reduces the overall information transfer rate, because the zero filling contains no information. It also has the potential of forcing a loss of quality when subsequent frames must be truncated because the available space in the buffer was consumed by these zero filled bits. The more stringent the bounds on occupancy, the higher the probability of having to incur this loss of efficiency and degraded image quality.
The MPEG definition of seamless splice points also forces a loss of efficiency at each splice point, whenever the specified MSR is lower than the encoder's nominal transfer rate R, by forcing the encoder to reduce its rate at the splice point for a period equal to SDD. This inefficiency may also introduce a loss of quality, because, with a lower transfer rate, the likelihood of having to truncate subsequent frames is increased.
Note that the inefficiencies and quality degradations discussed above will be incurred regardless of whether the stream is actually spliced at the identified splice points. For these reasons, it is expected that not all potentially useful splice points will be created as such, and the advantages and flexibilities which could be obtained by having highly spliceable MPEG streams will not be achievable.