1. Field of the Invention
The present invention relates generally to a video signal compressing method and apparatus, and a compressed data multiplexing method and apparatus, and more particularly to a video signal compressing method and apparatus, and a compressed data multiplexing method and apparatus used in a system in which an output or decode time stamp is multiplexed in a data stream to synchronize a plurality of compressed signals for transmission.
2. Description of the Related Art
The MPEG system standardized by the Working Group (WG) of the ISO/IEC JTC1/SC29 (International Organization for Standardization/International Electrotechnical Commission, Joint Technical Committee 1/Sub Committee 29) is oriented to time-division multiplexing and transmission of a video signal, an audio signal and other data. To synchronize the multiplexed signals with each other during time-division multiplexing and transmission, a time stamp is added to each decoding/reproduction unit of data, called an “access unit”, for each of the video signal, the audio signal and the other data. The time stamp acts as time management information and indicates when each access unit is to be decoded by a particular decoder. It includes reproduction/presentation management information called the “presentation time stamp (PTS)” and decoding time management information called the “decode time stamp (DTS)”.
In a multiplexer operating in accordance with the MPEG system during an encoding process, access unit information (AUI) is extracted from each of the video and audio encoders and signal multiplexing is performed based upon the extracted access unit information. The conventional MPEG multiplexer uses the following variables, for example, as AUI:
picture_size
picture_type
repeat_first_field
to calculate the time stamps PTS and DTS following the procedure shown in the flow chart of FIG. 8. These calculated time stamps are then added to a transmission stream.
At Step C1 in the flow chart in FIG. 8, initialization is performed as follows. A DTS initial value (init_DTS) and a next DTS (next_DTS) to be transmitted are set to be equal. “i” is set to “0”. The value of i is a counting variable which counts up by one for each access unit in the order of transmission. The last_IP_repeat_first_field is set to “0”. The last_IP_repeat_first_field is used to convert a 3-2 pull-down image into an inverse 3-2 pull-down image for encoding. This variable is indicative of the value (0 or 1) of repeat_first_field (AUI) of a preceding I picture or P picture. The num_of_field_next_DTS is set to “0”. The num_of_field_next_DTS is the time, counted in units of fields, until a next DTS. The num_of_field_diff_D_P is set to “0”. The num_of_field_diff_D_P is a difference between DTS and PTS values, counted in units of fields, for an I or P picture.
The above variables can be calculated from access unit information AUI for a particular encoded data (picture_size, picture_type, repeat_first_field). The operation then proceeds to Step C2.
At Step C2, access unit information (AUI) for the i-th access unit is acquired. At the next Step C3, a function calc_TimeStamp_infi( ) is performed following a specified procedure that will be discussed below to determine values of num_of_field_next_DTS and num_of_field_diff_D_P. The operation then proceeds to Step C4.
At Step C4, DTS[i] is set to be equal to next_DTS. Also, PTS[i] is set to be equal to DTS[i]+SCFR/2×num_of_field_diff_D_P, where num_of_field_diff_D_P is determined as will be described below. Thus DTS and PTS are determined for the i-th access unit. The operation then proceeds to Step C5. At Step C5, next_DTS is set to be equal to SCFR/2×num_of_field_next_DTS+DTS[i]. The value of the next DTS is calculated in this manner. The operation then proceeds to Step C6.
At Step C6, the value of the variable i is increased by one. The operation then returns to Step C2, where AUI for the next access unit is acquired, and the above processing is repeated to determine the values of DTS and PTS for the next access unit. The function calc_TimeStamp_infi( ) noted at Step C3 in FIG. 8 is performed according to the procedure shown in the flow chart in FIG. 9.
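The loop of Steps C1 to C6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the multiplexer's actual implementation: the function name assign_time_stamps, the parameter field_ticks (standing for SCFR/2) and the callable calc (standing for the function of Step C3) are hypothetical names introduced here.

```python
from typing import Callable, Iterable, List, Tuple


def assign_time_stamps(
    access_units: Iterable[object],
    calc: Callable[[object], Tuple[int, int]],
    init_dts: int,
    field_ticks: int = 1,
) -> List[Tuple[int, int]]:
    """Return a (DTS, PTS) pair for each access unit in transmission order.

    calc stands in for the Step C3 function and must return
    (num_of_field_next_DTS, num_of_field_diff_D_P) for one access unit.
    field_ticks stands in for SCFR/2, the duration of one field.
    """
    next_dts = init_dts                                  # Step C1
    stamps = []
    for aui in access_units:                             # Steps C2 and C6
        next_fields, diff_fields = calc(aui)             # Step C3
        dts = next_dts                                   # Step C4
        pts = dts + field_ticks * diff_fields            # Step C4
        stamps.append((dts, pts))
        next_dts = field_ticks * next_fields + dts       # Step C5
    return stamps
```

Passing the Step C3 function as a callable keeps the loop independent of how the two field counts are derived.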
More particularly, at a first Step D1 in the flow chart in FIG. 9, it is determined whether 3-2_pull_down_flag is set to 1 or not. The 3-2_pull_down_flag is indicative of whether an inverse 3-2 pull-down process has been performed. If 3-2_pull_down_flag is equal to 1, encoding has been performed after an inverse 3-2 pull-down process has been performed, while if 3-2_pull_down_flag is equal to 0, encoding has been performed without an inverse 3-2 pull-down process. When the result of the determination at Step D1 is answered in the affirmative, namely, when 3-2_pull_down_flag is equal to 1, the operation proceeds to Step D2. However, when the result of the determination is answered in the negative, namely, when 3-2_pull_down_flag is equal to 0, the operation proceeds to Step D6.
At Step D2, it is determined whether the value of M is equal to 1. The M value is indicative of a number of pictures until a next I or P picture. When the result of the determination is answered in the affirmative, namely, when the M value is equal to 1, operation proceeds to Step D3. However, when the result of the determination is answered in the negative, namely, if the value of M is not 1, operation proceeds to Step D4.
At Step D3, the value of num_of_field_next_DTS of the access unit is set to 2 and num_of_field_diff_D_P of the access unit is set to 0. The operation then proceeds to the return end.
At Step D4, it is determined whether picture_type is a B picture. When the determination is answered in the affirmative, namely, picture_type represents a B picture, operation proceeds to Step D3. If the determination is answered in the negative, namely if picture_type does not represent a B picture, operation proceeds to Step D5.
At Step D5, the value of num_of_field_next_DTS of the access unit is set to 2 and num_of_field_diff_D_P of the access unit is set to M value×2. The operation then proceeds to the return end.
At Step D6, it is determined whether M value is equal to 1. When the determination is answered in the affirmative, namely, when M value is equal to 1, operation proceeds to Step D7. If the determination is answered in the negative, namely, if M value is not 1, operation proceeds to Step D8.
At Step D7, the value of num_of_field_next_DTS of the access unit is set to 2+repeat_first_field and num_of_field_diff_D_P of the access unit is set to 0. Then the operation proceeds to the return end.
At Step D8, it is determined whether picture_type represents a B picture. When the determination is answered in the affirmative, namely, picture_type represents a B picture, operation proceeds to Step D7. If the determination is answered in the negative, namely, if picture_type does not represent a B picture, operation proceeds to Step D9.
At Step D9, it is determined whether the M value is even or not. If the determination is answered in the affirmative, namely, when the M value is even, operation proceeds to Step D10. When the determination is answered in the negative, namely, if the M value is not even, operation proceeds to Step D11.
At Step D10, the value of num_of_field_next_DTS of the access unit is set to 2+last_IP_repeat_first_field and num_of_field_diff_D_P of the access unit is set to 5×(M value/2). The operation then proceeds to Step D12.
At Step D11, the value of num_of_field_next_DTS of the access unit is set to 2+last_IP_repeat_first_field and num_of_field_diff_D_P of the access unit is set to 5×(M/2). The operation then proceeds to Step D12.
At Step D12, the value of last_IP_repeat_first_field is set equal to repeat_first_field. This value is used as the timing for a next I or P picture.
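The decision tree of Steps D1 to D12 can be transcribed as the following sketch. It is a literal reading of the text above, with several assumptions: the Python names (calc_time_stamp_info, pull_down_flag, m) are hypothetical, picture types are represented as the strings "I", "P" and "B", and Step D3 is read as ending the function. The text gives the same expression 5×(M/2) for both Step D10 and Step D11, so the odd-M case is evaluated here as an exact fraction rather than being rounded; that choice is an assumption, not a confirmed formula.

```python
from fractions import Fraction


def calc_time_stamp_info(picture_type, repeat_first_field, m,
                         pull_down_flag, last_ip_repeat_first_field):
    """Return (num_of_field_next_DTS, num_of_field_diff_D_P,
    updated last_IP_repeat_first_field) for one access unit."""
    # Steps D2/D4 and D6/D8 ask the same two questions in each branch.
    m_is_one_or_b = (m == 1) or (picture_type == "B")

    if pull_down_flag == 1:                              # Step D1: affirmative
        if m_is_one_or_b:
            return 2, 0, last_ip_repeat_first_field      # Step D3
        return 2, m * 2, last_ip_repeat_first_field      # Step D5

    if m_is_one_or_b:                                    # Step D1: negative
        return 2 + repeat_first_field, 0, last_ip_repeat_first_field  # Step D7

    next_fields = 2 + last_ip_repeat_first_field
    if m % 2 == 0:                                       # Step D9
        diff_fields = 5 * (m // 2)                       # Step D10
    else:
        # Step D11 as literally stated; exact fraction is an assumption.
        diff_fields = Fraction(5 * m, 2)
    # Step D12: remember repeat_first_field of this I or P picture.
    return next_fields, diff_fields, repeat_first_field
```

The third return value carries the state of Step D12 explicitly, so a caller would pass it back in for the next access unit.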
Thus a time stamp for a video signal is calculated for each access unit in accordance with the attributes of the access unit and is added as a time stamp to the transmitted data stream.
FIGS. 10A to 10E show how an image is input to and output from the standard MPEG system.
As is shown in FIG. 10A, when supplied with a 3-2 pull-down image, for example, the encoder will take every 2 or 3 fields as one picture (in an alternating fashion). The input image having the following picture sequence:
B0, B1, I2, B3, B4, P5, B6, B7, P8, . . .
is encoded in the order of the following sequence:
I2, B0, B1, P5, B3, B4, P8, B6, B7, . . .
as shown in FIG. 10B. In the above sequences, I represents an intra-coded picture (I-Picture), P represents a predictive coded picture (P-Picture) and B represents a bidirectionally predictive coded picture (B-Picture).
FIG. 10B depicts the level of information stored in the output buffer of the encoder according to the system time clock (STC). At STC=14, a first I2 picture is encoded, and an encoded bit is stored into the buffer. STCs are shown at intervals of 1 field. One-field interval is equivalent to SCFR/2. The one-field interval is simplified to take a value of SCFR/2=1.
As shown in FIG. 10C, nearly upon completion of the encoding of one picture, AUI (picture_size, picture_type, repeat_first_field) is output from the encoder. In the multiplexer, the AUI value is acquired and calculated to determine each of the time stamps DTS and PTS, as noted above in accordance with the flow charts of FIGS. 8 and 9.
FIG. 10D depicts the level of information stored in the buffer in the decoder according to STC. At STC=18, a bit stream starts being acquired into the buffer and a first I2 picture is decoded at STC=33 according to the decode time stamp DTS.
FIG. 10E shows how an image is output. The I2 picture decoded at STC=33 is presented for output at STC=40 according to the presentation time stamp PTS.
If a sequence extension (sequence_extension) exists immediately after a sequence header (sequence_header) being a start code, the bit stream is determined to be an MPEG-2 bit stream generated by encoding the video sequence in accordance with the MPEG-2 standard. On the other hand, if no sequence extension exists immediately after the sequence header, the bit stream is determined to be an MPEG-1 standard bit stream.
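The MPEG-1/MPEG-2 distinction described above can be illustrated with a short sketch. It assumes a raw video elementary stream held in memory and uses the start code values 0x000001B3 (sequence header) and 0x000001B5 (extension) together with extension identifier 1 for the sequence extension; the function name classify_stream is hypothetical.

```python
SEQUENCE_HEADER_CODE = b"\x00\x00\x01\xb3"
EXTENSION_START_CODE = b"\x00\x00\x01\xb5"
START_CODE_PREFIX = b"\x00\x00\x01"
SEQUENCE_EXTENSION_ID = 1


def classify_stream(data: bytes) -> str:
    """Return "MPEG-2" if a sequence extension directly follows the first
    sequence header, otherwise "MPEG-1"."""
    header_at = data.find(SEQUENCE_HEADER_CODE)
    if header_at < 0:
        raise ValueError("no sequence header found")
    # The next start code after the header body tells the two cases apart.
    next_at = data.find(START_CODE_PREFIX, header_at + 4)
    if (
        next_at >= 0
        and next_at + 4 < len(data)
        and data[next_at:next_at + 4] == EXTENSION_START_CODE
        and data[next_at + 4] >> 4 == SEQUENCE_EXTENSION_ID
    ):
        return "MPEG-2"
    return "MPEG-1"
```

The sketch inspects only the first sequence header and does not validate the header contents.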
The sequence extension contains a profile_and_level_indication parameter identifying a profile defining a mechanical configuration of a coding/decoding algorithm and a level defining a usable range in each profile, a progressive_sequence parameter identifying whether a video sequence is a progressive scan or interlace scan, and a chroma_format parameter identifying the thinning type of a chrominance component, 4:2:0, 4:2:2 or 4:4:4, etc. There is also a flag (low_delay) for implementing a low delay. When this flag is set, the bidirectionally predictive coded picture (B-Picture) will not be included in the video sequence, so that pictures can be aligned with no delay in presentation of a decoded image.
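As an illustration of where these parameters sit, the sketch below reads them from a 10-byte sequence extension. The field widths follow the sequence_extension syntax of ISO/IEC 13818-2 as understood here (4-bit identifier, 8-bit profile_and_level_indication, 1-bit progressive_sequence, 2-bit chroma_format, size and bit rate extensions, a marker bit, an 8-bit VBV buffer size extension, then the 1-bit low_delay flag); the function name and the returned dictionary layout are hypothetical.

```python
CHROMA_FORMATS = {1: "4:2:0", 2: "4:2:2", 3: "4:4:4"}


def parse_sequence_extension(data: bytes) -> dict:
    """Extract the parameters named in the text from a sequence extension."""
    if len(data) < 10 or data[:4] != b"\x00\x00\x01\xb5":
        raise ValueError("not a complete extension start code and payload")
    bits = int.from_bytes(data[4:10], "big")   # 48 payload bits
    pos = 48

    def take(width: int) -> int:
        """Read the next width bits, most significant first."""
        nonlocal pos
        pos -= width
        return (bits >> pos) & ((1 << width) - 1)

    if take(4) != 1:
        raise ValueError("extension is not a sequence extension")
    profile_and_level = take(8)
    progressive_sequence = take(1)
    chroma_format = take(2)
    take(2)    # horizontal_size_extension
    take(2)    # vertical_size_extension
    take(12)   # bit_rate_extension
    take(1)    # marker_bit
    take(8)    # vbv_buffer_size_extension
    low_delay = take(1)
    return {
        "profile_and_level_indication": profile_and_level,
        "progressive_sequence": progressive_sequence,
        "chroma_format": CHROMA_FORMATS.get(chroma_format, "reserved"),
        "low_delay": low_delay,
    }
```

A multiplexer or decoder could test the low_delay entry to decide whether B pictures can appear in the sequence.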
FIGS. 11A to 11E depict how an image is input and output when a big_picture is to be developed with the above-mentioned flag (low_delay) set. When supplied with a 3-2 pull-down image, the encoder will take every 2 or 3 fields as one picture in an alternating fashion as shown in FIG. 11A. The input image having the following picture sequence:
I0, P1, P2, P3, P4, P5, P6, P7, P8, . . .
is encoded in the order of the following sequence:
I0, P1, P2, P3, P4, P5, P6, P7, P8, . . .
as shown in FIG. 11B. FIG. 11B shows an information level of the output buffer in the encoder according to the system time clock (STC). At STC=16, a first I0 picture is encoded, and an encoded bit is stored into the buffer. STCs are shown at intervals of 1 field. One-field interval is equivalent to SCFR/2.
As shown in FIG. 11C, nearly upon completion of the encoding of one picture, AUI (picture_size, picture_type, repeat_first_field) is output from the encoder. In the multiplexer, the AUI value is acquired and calculated to determine each of the time stamps DTS and PTS in accordance with the procedure set forth above with respect to FIGS. 8 and 9. As noted above, because the flag (low_delay) for implementing a low delay is set, a bidirectionally predictive coded picture (B-Picture) will not be included in the video sequence, so DTS=PTS.
In this example, a big_picture is encoded in an I0 picture encoded at STC=38, so that the output buffer will overflow. However, this overflow is relative to the buffer size used for decoding by the decoder. The actual encoder has a larger buffer size, so the overflow will not cause any trouble in transmission. Since a big picture has been encoded at STC=38, the next P1 picture to be encoded is not encoded but is instead removed.
By calculating the time stamps PTS and DTS following the conventional procedure shown in FIGS. 8 and 9, the time stamps PTS and DTS of an I0 picture in which a big_picture has been encoded will be:
PTS=55
DTS=55
FIG. 11D shows the information level of the buffer in the decoder according to STC, and FIG. 11E shows how an image is output. As shown in FIGS. 11A-11D, at STC=18, a bit stream starts being acquired into the buffer. The first I0 picture is decoded at STC=33 according to the decode time stamp DTS, and presented at STC=33 according to the presentation time stamp PTS. However, if a picture is decoded at the PTS and DTS calculated following the conventional processing procedure, the decoder buffer will underflow when decoding an I0 picture in which the above-mentioned big picture (big_picture) has developed, as shown in FIG. 11D.
According to C.7 of the MPEG-2 standard (ISO/IEC 13818-2:1995(E), ITU-T Rec.), when a big picture (big_picture) is to be developed while the flag (low_delay) is set, the temporal reference (temporal_reference) of the next picture is checked and no decoding of the next picture is performed until sufficient data are accumulated in the buffer. After the data have been accumulated in the buffer, the data in the buffer are decoded. However, this operation is difficult to implement in hardware, and the underflow will take place.
When the underflow takes place, the lower portion of the image will not be updated and the image will momentarily be disturbed.
Some conventional encoders used in the MPEG system automatically detect a 3-2 pull-down by themselves and automatically get into and out of inverse 3-2 pull-down. In this case, the PTS and DTS values will be unfavorable for correct time management unless 3-2_pull_down_flag in Step D1 of the conventional procedure in the flow chart shown in FIG. 9 is changed.
Further, some conventional encoders freely change the M value in the course of an encoding. In this case, the PTS and DTS values will be unfavorable for correct time management unless the M values at Steps D5 to D7, D10 and D11 in the flow chart in FIG. 9 are changed.
Accordingly, it is an object of the invention to provide a video signal compressing method and apparatus, and a compressed data multiplexing method and apparatus, adapted to avoid a decoder underflow caused by the occurrence of a big picture.
Another object of the invention is to provide a video signal compressing method and apparatus, and a compressed data multiplexing method and apparatus, adapted to give appropriate time stamps PTS and DTS for an encoder which automatically detects a 3-2 pull-down within itself and automatically gets into and out of inverse 3-2 pull-down.
A further object of the invention is to provide a video signal compressing method and apparatus, and a compressed data multiplexing method and apparatus, adapted to calculate appropriate time stamps PTS and DTS for an encoder which freely changes the M value during the course of an encoding process.
Still other objects and advantages of the invention will in part be obvious and will in part be apparent from the specification and drawings.
Generally speaking, in accordance with the invention, a video signal compressing method and a video signal compressing apparatus are provided for a system in which, to synchronize a plurality of compressed signals, a presentation time stamp and/or a decode time stamp is multiplexed in a stream for transmission. A video signal is compressed to generate a compressed signal, and the following are generated as access unit information (AUI) for use in multiplexing the compressed data of the video signal:
picture_size,
picture_type,
repeat_first_field, and
a flag indicating the occurrence of a big picture (big_picture_flag).
Furthermore, a compressed signal multiplexing method and a compressed signal multiplexing apparatus are provided for multiplexing compressed signals in a system in which, to synchronize a plurality of compressed signals, a presentation time stamp and/or a decode time stamp is multiplexed in a stream for transmission. A video signal is compressed to generate a compressed signal, and the following are used as access unit information (AUI) for multiplexing the compressed data of the video signal:
picture_size,
picture_type,
repeat_first_field, and
a flag indicating the occurrence of a big picture (big_picture_flag).
The presentation time stamp and/or the decode time stamp is determined based on the picture_size, picture_type, repeat_first_field and big_picture_flag to multiplex the information in a stream.
The invention accordingly comprises the several steps and the relationship of one or more of such steps with respect to each of the others, and the apparatus embodying features of construction, combination of elements and arrangement of parts which are adapted to effect such steps, all as exemplified in the following detailed disclosure, and the scope of the invention will be indicated in the claims.
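For illustration only, the four items of access unit information could be carried in a record such as the one sketched below. The class name AccessUnitInfo and the field types are assumptions made here (for example, picture_size in bytes and picture_type as a one-letter string); how big_picture_flag alters the time stamp calculation is not stated in this section and is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessUnitInfo:
    """Access unit information (AUI) passed from encoder to multiplexer."""

    picture_size: int         # assumed unit: bytes of the coded picture
    picture_type: str         # assumed encoding: "I", "P" or "B"
    repeat_first_field: int   # 0 or 1
    big_picture_flag: bool    # True when a big picture has occurred


# Example record for an I picture in which a big picture occurred.
example_aui = AccessUnitInfo(
    picture_size=120000,
    picture_type="I",
    repeat_first_field=0,
    big_picture_flag=True,
)
```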