The present invention relates to a data multiplexer, a data multiplexing method, and a recording medium, and particularly to a data multiplexer, a data multiplexing method, and a recording medium that make it possible to reduce the amount of calculation for simulation of data occupancy rate of a virtual data buffer in a T-STD model, and to thereby readily generate a multiplexed transport stream conforming to the ISO/IEC13818-1 requirements.
When a video stream and an audio stream are multiplexed by an MPEG (Moving Picture Coding Experts Group/Moving Picture Experts Group) transport stream method, which is widely used in broadcasting and AV stream distribution, a multiplexer is required to multiplex the streams in transport packet format in units of 188 bytes so that a decoder for separating and decoding a multiplexed stream can separate and decode each of the streams on the basis of a T-STD (Transport Stream System Target Decoder) model, which is a virtual decoder model specified by an MPEG system standard (ISO/IEC13818-1).
FIG. 1 shows a T-STD model. The T-STD model is provided with three kinds of buffers, that is, a three-stage buffer for a video stream formed by a transport stream buffer, a multiplex buffer, and an elementary stream buffer, a two-stage buffer for an audio stream formed by a transport stream buffer and a main buffer, and a buffer for system control. In the T-STD model, a rate of transfer between the buffers, size of each buffer and the like are defined precisely. While FIG. 1 shows only one video stream buffer, one audio stream buffer, and one system control buffer, the number of video stream buffers and audio stream buffers actually provided in the T-STD model coincides with the number of channels of the corresponding elementary streams.
A multiplexed data stream inputted to the T-STD is instantly transferred to one of transport stream buffers TB11 to TBsys3 that corresponds to the data stream, according to whether the data of the multiplexed data stream is video data, audio data, or system control data (the attribute of data described in each packet is described in a PID (Packet Identification), which will be described later with reference to FIG. 2), and then buffered in the corresponding buffer. Size of the transport stream buffers TB11 to TBsys3 is defined as 512 bytes. It is specified that the transport stream buffers TB11 to TBsys3 must not overflow and must be emptied at least once a second.
A video elementary stream is supplied from the transport stream buffer TB11 to a multiplex buffer MB14 to be buffered in the multiplex buffer MB14. The video elementary stream is thereafter supplied to an elementary stream buffer EB15 to be buffered in the elementary stream buffer EB15, and then decoded by a decoder D16. When resulting frames of the video data are not arranged in indicated order, a reordering buffer O17 interchanges the frames into the indicated order, and outputs the frames. When the resulting frames of the video data are arranged in the indicated order, the video data is outputted as it is.
An audio elementary stream is supplied from a transport stream buffer TBn2 to a main buffer Bn8 to be buffered in the main buffer Bn8, and then decoded and outputted by a decoder Dn9. System data is supplied from the transport stream buffer TBsys3 to a main buffer Bsys10 to be buffered in the main buffer Bsys10, and then decoded and outputted by a decoder Dsys11.
A rate of transfer Rx1 of the video elementary stream from the transport stream buffer TB11 to the multiplex buffer MB14 is expressed by the following equation (1):
Rx1 = 1.2 × Rmax[profile, level]  (1)
where Rmax[profile, level] is a parameter defined in ISO/IEC13818-2 and indicating an upper limit value of the transfer rate dependent on the profile and level of each video elementary stream.
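The behavior of the transport stream buffer TB11 under equation (1) can be illustrated with a minimal sketch. It assumes that each 188-byte TS packet enters the buffer instantly and that the buffer drains continuously at Rx1 between arrivals. The function names and the example Rmax of 15 Mbit/s are illustrative assumptions, not values taken from the text, and the rule that the buffer must be emptied at least once a second is not modeled.

```python
TB_SIZE_BYTES = 512      # size of each transport stream buffer
TS_PACKET_BYTES = 188    # fixed TS packet length


def video_tb_leak_rate(rmax_bps):
    """Equation (1): Rx1 = 1.2 x Rmax[profile, level], in bits per second."""
    return 1.2 * rmax_bps


def tb_occupancy(arrival_times, rx_bps):
    """Return the buffer fullness in bytes just after each packet arrives.

    arrival_times is a non-decreasing list of arrival times in seconds
    (hypothetical input). Raises OverflowError if 512 bytes are exceeded.
    """
    fullness = 0.0
    last_t = None
    history = []
    for t in arrival_times:
        if last_t is not None:
            # Drain at Rx1 since the previous arrival, never below empty.
            drained = rx_bps / 8.0 * (t - last_t)
            fullness = max(0.0, fullness - drained)
        fullness += TS_PACKET_BYTES  # the packet is transferred instantly
        if fullness > TB_SIZE_BYTES:
            raise OverflowError("transport stream buffer overflow at t=%r" % t)
        history.append(fullness)
        last_t = t
    return history
```

With packets spaced 100 microseconds apart and Rx1 of 18 Mbit/s, the buffer empties between arrivals, whereas three packets arriving at the same instant would exceed 512 bytes and raise the error.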
Size MBS1 of the multiplex buffer MB14 at low level and main level is expressed by the following equation (2), whereas the size MBS1 of the multiplex buffer MB14 at high-1440 level and high level is represented by the following equation (3):
MBS1 = BSmux + BSoh + VBVmax[profile, level] − vbv_buffer_size  (2)
MBS1 = BSmux + BSoh  (3)
where BSoh is size of a virtual overhead buffer Soh (not shown in the figure) for buffering PES (Packetized Elementary Stream) packet overhead, and is defined by the following equation (4); and BSmux is size of an additional multiplex buffer Smux (not shown in the figure), and is defined by the following equation (5).
BSoh = (1/750) × Rmax[profile, level]  (4)
BSmux = 0.004 × Rmax[profile, level]  (5)
Also, VBVmax[profile, level] is a parameter defined in ISO/IEC13818-2 and indicating a maximum value of size of a virtual VBV (Video Buffering Verifier) buffer (not shown in the figure), and vbv_buffer_size is included for transmission in a sequence header of a video elementary stream.
Size MBSn of a multiplex buffer MBn for a constrained-parameters ISO/IEC11172-2 bit stream is expressed by the following equation (6).
MBSn = BSmux + BSoh + vbv_max − vbv_buffer_size  (6)
BSoh and BSmux in the equation (6) are represented by the following equations (7) and (8).
BSoh = (1/750) × Rmax  (7)
BSmux = 0.004 × Rmax  (8)
Rmax in the equations (7) and (8) and vbv_max in the equation (6) designate a maximum bit rate and maximum vbv_buffer_size defined in ISO/IEC11172-2, respectively.
The size of BSmux included in MBS1 is allocated to perform multiplexing through buffering operation. A buffer size remaining after the size allocation to BSmux can be used for BSoh and can also be used for initial multiplexing.
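The size calculation of equations (2) and (3) can be sketched as follows. This is only an illustration: it assumes all sizes are expressed in bits, takes the 0.004-second coefficient for BSmux as given in equation (8) and in ISO/IEC13818-1, and subtracts vbv_buffer_size from VBVmax as ISO/IEC13818-1 does. The function and parameter names are hypothetical.

```python
def bs_oh(rmax_bps):
    """PES packet overhead buffer size in bits: (1/750) s x Rmax."""
    return rmax_bps / 750.0


def bs_mux(rmax_bps):
    """Additional multiplex buffer size in bits: 0.004 s x Rmax."""
    return 0.004 * rmax_bps


def multiplex_buffer_size(level, rmax_bps, vbv_max_bits=0, vbv_buffer_size_bits=0):
    """Return MBS1 in bits for the given level name.

    Level names ("low", "main", "high-1440", "high") are illustrative labels.
    """
    base = bs_mux(rmax_bps) + bs_oh(rmax_bps)
    if level in ("low", "main"):
        # Equation (2): unused VBV space is added to the multiplex buffer.
        return base + vbv_max_bits - vbv_buffer_size_bits
    if level in ("high-1440", "high"):
        # Equation (3): only the multiplexing and overhead parts remain.
        return base
    raise ValueError("unknown level: %r" % (level,))
```

For example, with an assumed Rmax of 15 Mbit/s and a stream whose vbv_buffer_size equals VBVmax, the main-level result is 60000 + 20000 = 80000 bits; the actual Rmax and VBVmax values should be taken from ISO/IEC13818-2.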
Methods of transferring a video elementary stream from the multiplex buffer MB14 to the elementary stream buffer EB15 include a leak method and a vbv_delay method.
Transfer rate Rbx1 in the leak method is expressed by an equation (9) at low level and main level and represented by an equation (10) at high-1440 level and high level, whereas transfer rate Rbx1 for a constrained-parameters ISO/IEC11172-2 bit stream is expressed by an equation (11):
Rbx1 = Rmax[profile, level]  (9)
Rbx1 = Min{1.05 × Res, Rmax[profile, level]}  (10)
Rbx1 = 1.2 × Rmax  (11)
where Res is transfer bit rate of the elementary stream, and Rmax in the equation (11) is maximum bit rate of the constrained-parameters bit stream defined in ISO/IEC11172-2.
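The selection among equations (9) to (11) can be sketched as a single function. The level labels, the parameter names, and the constrained_11172 switch are assumptions introduced for illustration only.

```python
def leak_rate(level, rmax_bps, res_bps=None, constrained_11172=False):
    """Return the leak-method rate Rbx1 in bits per second."""
    if constrained_11172:
        # Equation (11): ISO/IEC11172-2 stream, Rmax is its maximum bit rate.
        return 1.2 * rmax_bps
    if level in ("low", "main"):
        # Equation (9)
        return rmax_bps
    if level in ("high-1440", "high"):
        # Equation (10): needs the elementary stream bit rate Res.
        if res_bps is None:
            raise ValueError("res_bps is required at high-1440 and high level")
        return min(1.05 * res_bps, rmax_bps)
    raise ValueError("unknown level: %r" % (level,))
```

At high level with an assumed Rmax of 80 Mbit/s, a 20 Mbit/s elementary stream leaks at about 21 Mbit/s, while a stream faster than Rmax/1.05 is capped at Rmax.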
In transferring data from the multiplex buffer MB14 to the elementary stream buffer EB15 by the leak method, when a PES packet payload is present in the multiplex buffer MB14 and the elementary stream buffer EB15 is not full, the PES packet payload is transferred from the multiplex buffer MB14 to the elementary stream buffer EB15 at a transmission rate of Rbx1. When the elementary stream buffer EB15 is full, the data is not removed from the multiplex buffer MB14. When a data byte is transferred from the multiplex buffer MB14 to the elementary stream buffer EB15, every PES packet header in the multiplex buffer MB14 immediately preceding the data byte is immediately removed and discarded. The data is not removed from the multiplex buffer MB14 when PES packet payload data is not present in the multiplex buffer MB14.
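The leak-method rules above can be sketched as one simulation step. This is a simplified model under stated assumptions: the multiplex buffer content is counted as PES payload bytes only (header removal is not modeled), the time step dt is a hypothetical discretization, and the function name is illustrative.

```python
def leak_step(mb_payload_bytes, eb_fullness_bytes, eb_size_bytes, rbx_bps, dt):
    """Advance the MB-to-EB transfer by dt seconds.

    Returns the new (multiplex buffer payload, elementary buffer fullness),
    both in bytes.
    """
    # Nothing moves when there is no payload or when EB is already full.
    if mb_payload_bytes <= 0 or eb_fullness_bytes >= eb_size_bytes:
        return mb_payload_bytes, eb_fullness_bytes
    movable = min(
        rbx_bps / 8.0 * dt,                 # what the rate Rbx1 allows
        mb_payload_bytes,                   # what is available in MB
        eb_size_bytes - eb_fullness_bytes,  # what still fits in EB
    )
    return mb_payload_bytes - movable, eb_fullness_bytes + movable
```

Calling the function repeatedly with a small dt gives an approximate occupancy trace of both buffers. A full simulator would also have to track PES header bytes and the removal of access units from the elementary stream buffer at their decoding times.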
According to the vbv_delay method, on the other hand, vbv_delay coded and included in a video elementary stream precisely defines timing for transmitting coded video data from the multiplex buffer MB14 to the elementary stream buffer EB15. When the vbv_delay method is applied, the last byte of a picture start code of a picture j is transferred from the multiplex buffer MB14 to the elementary stream buffer EB15 at a time tdn(j)−vbv_delay(j), where tdn(j) is a time of decoding the picture j, and vbv_delay(j) is a delay time in seconds indicated in a vbv_delay field of the picture j.
Data bytes between the last bytes of consecutive picture start codes (including the last byte of the second start code) are fragmentarily transferred to the elementary stream buffer EB15 at a fixed transmission rate Rbx(j). The transmission rate Rbx(j) is defined for each picture j. The rate of transfer Rbx(j) of the data bytes to the elementary stream buffer EB15 is given by the following equation (12):
Rbx(j) = NB(j)/(vbv_delay(j) − vbv_delay(j+1) + tdn(j+1) − tdn(j))  (12)
where NB(j) is the number of data bytes between the last bytes of picture start codes of pictures j and j+1 (including the last byte of the second start code and excluding PES packet header bytes).
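The rate for one picture can be sketched as follows. Because the last byte of the start code of picture j is transferred at tdn(j) − vbv_delay(j), the NB(j) bytes are spread over the interval between that time and tdn(j+1) − vbv_delay(j+1). The function name and argument names are illustrative, and times are assumed to be in seconds, so the result is in bytes per second.

```python
def vbv_delay_rate(nb_bytes, vbv_delay_j, vbv_delay_next, tdn_j, tdn_next):
    """Return Rbx(j) in bytes per second for picture j."""
    # Interval between the transfer times of two consecutive start codes:
    # (tdn(j+1) - vbv_delay(j+1)) - (tdn(j) - vbv_delay(j))
    interval = vbv_delay_j - vbv_delay_next + tdn_next - tdn_j
    if interval <= 0:
        raise ValueError("transfer interval must be positive")
    return nb_bytes / interval
```

For instance, 1000 bytes with vbv_delay values of 0.5 s and 0.25 s and decoding times of 1.0 s and 1.25 s are transferred over 0.5 s, that is, at 2000 bytes per second.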
When data is transferred by the leak method, the multiplex buffer MB14 must not overflow, and must be emptied at least once a second.
When data is transferred by the vbv_delay method, the multiplex buffer MB14 must not overflow or underflow, and the elementary stream buffer EB15 must not overflow.
Next, transfer of audio data and system data will be described. The rate of transfer Rxa of an audio data stream from the transport stream buffer TBn2 to the main buffer Bn8 is expressed by the following equation (13), whereas the rate of transfer Rxsys of system data from the transport stream buffer TBsys3 to the main buffer Bsys10 is expressed by the following equation (14).
Rxa = 2 × 10^6 (bps)  (13)
Rxsys = 1 × 10^6 (bps)  (14)
Buffer size BSn of the main buffer Bn8 for buffering the audio data is represented by the following equation (15):
BSn = BSmux + BSdec + BSoh = 3584 (bytes)  (15)
where BSdec is size of a virtual access unit decoding buffer (not shown in the figure), and BSoh is size of a virtual PES packet overhead buffer (not shown in the figure). These are conditioned by the following equation (16).
BSdec + BSoh ≤ 2848 (bytes)  (16)
Buffer size BSsys of the main buffer Bsys10 for buffering the system data is represented by the following equation (17).
BSsys = 1536 (bytes)  (17)
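The fixed audio and system parameters can be collected in a short sketch. Since the total audio buffer size is 3584 bytes and BSdec + BSoh may not exceed 2848 bytes, at least 736 bytes always remain for BSmux. The constant and function names below are illustrative assumptions.

```python
AUDIO_RX_BPS = 2_000_000              # equation (13)
SYSTEM_RX_BPS = 1_000_000             # equation (14)
AUDIO_MAIN_BUFFER_BYTES = 3584        # equation (15)
AUDIO_DEC_PLUS_OH_LIMIT_BYTES = 2848  # equation (16)
SYSTEM_MAIN_BUFFER_BYTES = 1536       # equation (17)


def audio_bs_mux(bs_dec_bytes, bs_oh_bytes):
    """Return the BSmux share of the audio main buffer, in bytes."""
    if bs_dec_bytes + bs_oh_bytes > AUDIO_DEC_PLUS_OH_LIMIT_BYTES:
        raise ValueError("BSdec + BSoh exceeds 2848 bytes")
    # Equation (15) rearranged: BSmux = 3584 - BSdec - BSoh
    return AUDIO_MAIN_BUFFER_BYTES - bs_dec_bytes - bs_oh_bytes
```

The particular split between BSdec and BSoh passed to the function is hypothetical; only their sum is constrained by equation (16).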
An access unit An(j) that has been present in the elementary stream buffer EB15 or the main buffer Bn8 longest of all access units buffered therein (a video access unit corresponds to a picture and an audio access unit corresponds to an audio frame) and every stuffing byte preceding the access unit An(j) at a time tdn(j) (the stuffing byte will be described later with reference to FIG. 2) are instantly removed at the time tdn(j). The time tdn(j) is defined in a DTS (Decoding Time Stamp) or PTS (Presentation Time Stamp) field.
In the case of system data, when even a single byte of data is buffered in the main buffer Bsys10, the data in the main buffer Bsys10 is removed at all times at a transport rate of Rbsys expressed by an equation (18).
Rbsys = max{80000, transport_rate(i) × 8/500}  (18)
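Equation (18) can be evaluated with a short sketch. It assumes, as in ISO/IEC13818-1, that transport_rate(i) is given in bytes per second so that the factor 8 converts it to bits per second. The function names are illustrative.

```python
def system_drain_rate(transport_rate_bytes_per_s):
    """Equation (18): Rbsys in bits per second."""
    return max(80000, transport_rate_bytes_per_s * 8 / 500)


def system_buffer_drain_time(fullness_bytes, transport_rate_bytes_per_s):
    """Seconds needed to remove fullness_bytes from the main buffer Bsys10."""
    return fullness_bytes * 8 / system_drain_rate(transport_rate_bytes_per_s)
```

At a transport rate of 1,000,000 bytes per second the second term is only 16,000 bit/s, so the floor of 80,000 bit/s applies and a full 1536-byte buffer takes about 0.154 seconds to drain.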
A PES (Packetized Elementary Stream) packet is divided for transfer into fixed-length transport packets of 188 bytes (hereinafter referred to as TS packets) so that a decoder supplied with the PES packet may separate and decode the inputted multiplexed data by using the T-STD model described above. FIG. 2 shows structure of a TS packet.
The TS packet comprises a 4-byte header describing information for identifying contents of packet data, and a payload describing video, audio, and other data. Structure of the header will be described in the following.
A synchronizing byte is an 8-bit synchronizing signal, serving as data for the decoder to detect the head of the TS packet. A transport error indicator is a 1-bit flag indicating presence or absence of a bit error in the packet. A payload unit start indicator is a 1-bit flag indicating whether the payload of the TS packet contains the head portion of the PES packet or not.
A transport priority indicates priority among a plurality of TS packets having the same PID (Packet Identification). Specifically, among the packets having the same PID, a TS packet with “1” described in the transport priority has priority over a TS packet with “0” described in the transport priority.
A PID is 13-bit stream identifying information indicating the attribute of data described in the payload. For example, 0x0000 described in the PID indicates that information described in the payload is a program association table. The program association table describes the PID of the TS packet describing a program map table, in which the identification number of a program, a PID list of TS packets describing individual video, audio, and other streams and the like are described.
Transport scrambling control describes information on scrambling, that is, information of either no scramble, an Even key, or an Odd key. Adaptation field control indicates presence or absence of an adaptation field and a payload of the TS packet. A continuity counter is 4-bit count information for detecting whether part of the TS packets having the same PID were discarded along the way or not.
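The 4-byte header described above can be unpacked as in the following sketch. The bit positions and the synchronizing byte value 0x47 follow ISO/IEC13818-1 rather than the text above, and the function name and dictionary keys are illustrative.

```python
def parse_ts_header(packet):
    """Decode the 4-byte header of one 188-byte TS packet into a dict."""
    if len(packet) != 188:
        raise ValueError("a TS packet must be exactly 188 bytes")
    if packet[0] != 0x47:
        raise ValueError("synchronizing byte 0x47 not found")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return {
        "transport_error_indicator": bool(b1 & 0x80),
        "payload_unit_start_indicator": bool(b1 & 0x40),
        "transport_priority": (b1 >> 5) & 0x1,
        "pid": ((b1 & 0x1F) << 8) | b2,          # 13 bits
        "transport_scrambling_control": (b3 >> 6) & 0x3,
        "adaptation_field_control": (b3 >> 4) & 0x3,
        "continuity_counter": b3 & 0x0F,         # 4 bits
    }


# Hypothetical packet: PID 0x0000 (program association table), payload only.
example = bytes([0x47, 0x40, 0x00, 0x1A]) + bytes(184)
header = parse_ts_header(example)
```

A demultiplexer would typically use the pid field to route each packet to the corresponding transport stream buffer and the continuity_counter field to detect lost packets.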
An adaptation field describes additional information on the individual information, and a stuffing byte (invalid data byte) for fixing the length of the TS packet is added to the adaptation field. In order to divide a PES packet into fixed-length TS packets, a stuffing byte needs to be added as required, and therefore the length of the adaptation field differs depending on the stuffing byte.
Adaptation field length is 8-bit information indicating a length of the adaptation field. A discontinuity indicator indicates whether there is continuity between the packet and a next packet having the same PID, or whether the system clock is reset or not. A random access indicator indicates a random access entry point, that is, a sequence header of video data or a starting point of a frame of audio data. An elementary stream priority indicator indicates whether the TS packet has the most important part of the TS packets having the same PID (for example an intra-coded slice in video data).
Five-flag data comprises five flags: a PCR flag indicating whether the adaptation field includes PCR (Program Clock Reference) (when the PCR flag is 1, PCR is present); an OPCR flag indicating whether the adaptation field includes OPCR (Original Program Clock Reference) (an OPCR flag of 1 indicates presence of OPCR); a splicing point flag indicating whether the adaptation field includes a splice countdown region (a splicing point flag of 1 indicates presence of a splice countdown region); a transport private data flag indicating whether the adaptation field includes a private data byte (a transport private data flag of 1 indicates presence of a private data byte); and an adaptation field extension flag indicating presence or absence of an extension field of the adaptation field (an adaptation field extension flag of 1 indicates presence of an extension field).
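The three indicators and the five flags share a single byte that follows the adaptation field length. The sketch below decodes that byte; the bit order is taken from ISO/IEC13818-1 and is not stated in the text above, and the function name and keys are illustrative.

```python
def parse_adaptation_flags(flags_byte):
    """Decode the indicator and flag byte of an adaptation field."""
    return {
        "discontinuity_indicator": bool(flags_byte & 0x80),
        "random_access_indicator": bool(flags_byte & 0x40),
        "elementary_stream_priority_indicator": bool(flags_byte & 0x20),
        "pcr_flag": bool(flags_byte & 0x10),
        "opcr_flag": bool(flags_byte & 0x08),
        "splicing_point_flag": bool(flags_byte & 0x04),
        "transport_private_data_flag": bool(flags_byte & 0x02),
        "adaptation_field_extension_flag": bool(flags_byte & 0x01),
    }


# Hypothetical value: a random access point that also carries a PCR.
flags = parse_adaptation_flags(0x50)
```

The byte is present only when the adaptation field length is greater than zero, so a caller would check the length before decoding it.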
An optional table describes information specified in the five flags. PCR and OPCR each comprise two parts, that is, a base and an extension part, and serve as information for correcting or setting an STC (System Time Clock) serving as time reference in the decoder to a value intended by the encoder. A splice countdown indicates the number of TS packets having the same PID that remain to a splice point (a point indicating a break in data that can be spliced and edited). This enables change of data (for example change between programs and commercials) at compressed stream level. Transport private data length indicates a length of succeeding transport private data. Transport private data is not specified in ISO/IEC standards. Adaptation field extension length indicates a length of an extension field succeeding this area.
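The combination of the base and extension parts of PCR can be sketched as follows. The layout assumed here (a 33-bit base counted at 90 kHz, 6 reserved bits, and a 9-bit extension counted at 27 MHz, packed into 6 bytes) follows ISO/IEC13818-1 and is not given in the text above; the function names are illustrative.

```python
PCR_CLOCK_HZ = 27_000_000


def parse_pcr(field):
    """Return the PCR as a count of 27 MHz ticks from its 6-byte field."""
    if len(field) != 6:
        raise ValueError("the PCR field is 6 bytes long")
    value = int.from_bytes(field, "big")  # 48 bits in total
    base = value >> 15                    # upper 33 bits, 90 kHz units
    extension = value & 0x1FF             # lower 9 bits, 27 MHz units
    return base * 300 + extension


def pcr_seconds(field):
    """Return the PCR value converted to seconds."""
    return parse_pcr(field) / PCR_CLOCK_HZ


# Hypothetical field: base of 90000 (one second), reserved bits set, extension 5.
example_pcr = ((90000 << 15) | (0x3F << 9) | 5).to_bytes(6, "big")
```

A decoder model would compare successive values obtained this way against its own STC in order to correct or set the clock.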
Three-flag data comprises three flags: an ltw flag indicating whether the extension field includes an ltw (legal time window) offset area or not; a piecewise rate flag indicating whether the extension field includes a piecewise rate or not; and a seamless splice flag indicating whether the extension field includes a splice type and DTS_next_au (decoding time stamp next access unit) or not.
The extension field describes information specified in the three flags. An ltw_valid flag is a 1-bit flag indicating whether the value of ltw_offset to be described below is valid or not. The value of ltw_offset is defined when the ltw_valid flag is 1, and indicates a reciprocal number of the offset value. A piecewise rate is a value defined when the piecewise rate flag is 1, and is a virtual bit rate of a TS packet having the same PID and succeeding this TS packet. A splice type is 4-bit data indicating a maximum value of splice rate and a splice decoding delay value. The value of DTS_next_au indicates decoding time of a first access unit after the splicing point.
In order to comply with the ISO/IEC13818-1 requirements mentioned above, a conventional multiplexer simulates data occupancy rate of each buffer in the T-STD model, and then generates a multiplexed stream such that the buffer will not overflow (or underflow). However, since the T-STD model has many defined items as described with reference to FIG. 1, and the buffers are provided in multiple stages, it is not easy to perform simulation for the T-STD model.