1. Field of the Invention
The present invention relates to a data multiplexing method and a recording medium. For example, the invention relates to a data multiplexing method and a recording medium in which when a plurality of multiplexed streams each consisting of a plurality of signals of video, audio, and the like are input to a decoding apparatus and are reproduced therein in a switched manner, a failure in a buffer memory of the decoding apparatus can be prevented.
2. Description of the Related Art
FIG. 1 shows an example of configuration of a transmission/reception system for transmitting and receiving signals of video, audio, and the like. On the transmission side (coding apparatus), a video encoder 1 encodes an input video signal and an audio encoder 2 encodes an input audio signal. A multiplexer 3 multiplexes coded video data and audio data that are supplied from the video encoder 1 and the audio encoder 2, respectively.
A multiplexed signal that is output from the multiplexer 3 is recorded onto a recording medium 4 such as a recordable DVD (digital versatile disc) or transmitted on a transmission line 5.
On the reception side (decoding apparatus), a demultiplexer 6 demultiplexes coded and multiplexed data that is supplied via the recording medium 4 or the transmission line 5 into video data, audio data, and the like, i.e., data of respective types. A video decoder 7 decodes coded video data that is supplied from the demultiplexer 6 and outputs decoded video data. An audio decoder 8 decodes coded audio data that is supplied from the demultiplexer 6 and outputs decoded audio data.
For example, on the transmission side, input signals of video, audio, and the like are respectively coded by the video encoder 1 and the audio encoder 2 and then supplied to the multiplexer 3. The coded video data and audio data are multiplexed (combined into single data), and then supplied to and recorded onto the recording medium 4 or transmitted on the transmission line 5 to the reception side.
On the reception side, coded and multiplexed video and audio data that was recorded on the recording medium 4 or has been transmitted on the transmission line 5 is supplied to the demultiplexer 6, where it is demultiplexed into video data, audio data, and the like, i.e., data of respective types. That is, video data and audio data are reconstructed. The reconstructed video data is supplied to the video decoder 7 and decoded therein. The reconstructed audio data is supplied to the audio decoder 8 and decoded therein. Decoded video data and audio data are output in a synchronized manner.
The MPEG (moving picture experts group) systems (ISO/IEC 13818-1 and ISO/IEC 11172-1) are international standards relating to schemes for multiplexing and demultiplexing coded data of video, audio, and the like in the above-described manner. In the following, coded data of video, audio, or the like is called an "elementary stream." In particular, coded data of a video signal is called a "video stream," and coded data of an audio signal is called an "audio stream." Further, multiplexed data of a plurality of elementary streams is called a "multiplexed stream."
To simplify the description, the following description is directed to only multiplexed streams of an MPEG1 (ISO/IEC 11172-1) system stream and an MPEG2 (ISO/IEC 13818-1) program stream. However, the techniques under consideration can also be applied to an MPEG2 transport stream.
FIG. 2 shows the structure of a multiplexed stream prescribed in the MPEG systems. As shown in FIG. 2, a multiplexed stream consists of a plurality of packs. Each pack is given a pack header, which describes such information as a SCR (system clock reference; described later) and a multiplexing rate (Mux_rate; described later). Each pack consists of a plurality of packets; elementary streams of video, audio, and the like are inserted in an individual pack in a divided manner. Each packet is given a packet header, which describes a time stamp (described later) and other data.
FIG. 3 shows an example of configuration of a decoding apparatus to which a multiplexed stream demultiplexing method that is prescribed in the MPEG systems is applied. This demultiplexing method is a method using imaginary decoders and is called a STD (system target decoder) model. Its operation will be described below.
The STD model has, in its inside, a reference clock STC (system time clock) 11, which increases at every constant cycle. Each pack header of a multiplexed stream that is input to the STD model describes a system time reference value called SCR (system clock reference).
Input of a multiplexed stream to the STD model is controlled by using the STC and the SCR. That is, at an instant when a readout SCR value becomes equal to an STC value, input of the pack whose pack header describes the SCR value is started. The input rate at this time, which is called a multiplexing rate, is described in the pack header.
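The input control described above can be illustrated with a minimal sketch. The class Pack, the function input_schedule, and the units (clock ticks, bytes per tick) are hypothetical and chosen only for illustration; they are not taken from the MPEG systems standards.

```python
from dataclasses import dataclass


@dataclass
class Pack:
    scr: int       # SCR from the pack header, in clock ticks (illustrative unit)
    size: int      # pack size in bytes
    mux_rate: int  # multiplexing rate from the pack header, in bytes per tick (illustrative unit)


def input_schedule(packs):
    """Return (start, end) STC times at which each pack is input to the STD model."""
    schedule = []
    for pack in packs:
        start = pack.scr                         # input starts when the STC equals the SCR
        end = start + pack.size / pack.mux_rate  # input proceeds at the multiplexing rate
        schedule.append((start, end))
    return schedule


packs = [Pack(scr=0, size=2048, mux_rate=512), Pack(scr=10, size=2048, mux_rate=256)]
print(input_schedule(packs))  # [(0, 4.0), (10, 18.0)]
```

In this sketch, the second pack is not input until the STC reaches its SCR value of 10, even though input of the first pack is finished at time 4.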
The demultiplexer 12 sorts out a plurality of packets of each input pack into respective kinds of packets (a video stream, an audio stream, and the like). Video streams are supplied to a buffer (decoding buffer) 13 while audio streams are supplied to a buffer (decoding buffer) 14.
Data of video streams, audio streams, and the like stored in the buffers 13 and 14 are output therefrom on an access unit basis (access unit: the decoding unit of an elementary stream) based on time information (time stamp) that is described in each packet header. Those data are decoded by a video decoder 15 and an audio decoder 16, and then output for reproduction. In the following, the video decoder 15 and the audio decoder 16 are simply called the decoders 15 and 16 when they need not be distinguished from each other.
There are two kinds of time stamps, i.e., DTS (decoding time-stamp) and PTS (presentation time-stamp). The DTS indicates a time when an access unit is output from the buffer 13 or 14 and decoded by the decoder 15 or 16. The PTS indicates a time when a decoded access unit is output for reproduction. The timing of decoding and output is controlled by comparing a DTS or PTS value with a STC value. It is assumed that there occurs no delay in transfer of each access unit from the buffer 13 or 14 as well as decoding in the decoder 15 or 16, that is, such operations are performed instantaneously.
In general, in given elementary streams in which DTS and PTS values are equal to each other, only the PTS value is described in a packet header. Examples of such elementary streams are MPEG audio streams (ISO/IEC 13818-3 and ISO/IEC 11172-3). In the case of MPEG video streams (ISO/IEC 13818-2 and ISO/IEC 11172-2), a decoding delay occurs depending on the kind of access unit. Therefore, in this case, time stamps of both DTS and PTS are described. A rearrangement buffer 17 is a buffer for temporarily storing an I-picture and a P-picture and rearranging data by performing delay control.
An SCR value is a sample value of a reference clock (time base) provided in the coding apparatus (including the multiplexer). The use of the SCR enables input/output control in each buffer (buffer 13 or 14) of the decoding apparatus (STD model). It is required that SCR values be properly set in the coding apparatus so that the decoding in the STD model can be performed without causing an overflow (data supplied to a buffer exceeds its capacity) or an underflow (not all data of an access unit has reached a buffer at a time point when it should be decoded) in any buffer.
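The overflow and underflow conditions can be checked with a simple buffer simulation. The following is a minimal sketch, assuming that data arrives in segments of constant rate and that each access unit is removed instantaneously at its decoding time; the function names and the data layout are hypothetical.

```python
def delivered(segments, t):
    """Bytes supplied to the buffer up to time t.

    segments: list of (start, end, rate), rate in bytes per tick (illustrative unit).
    """
    total = 0.0
    for start, end, rate in segments:
        if t > start:
            total += (min(t, end) - start) * rate
    return total


def check_buffer(segments, access_units, capacity):
    """Return a list of (time, kind) failures for one decoding buffer.

    access_units: list of (dts, size) sorted by dts.
    Occupancy only grows between removals, so it is enough to test it
    just before each removal and once after the last data has arrived.
    An overflow is reported at the check time, not at the exact instant
    at which the capacity is first exceeded.
    """
    failures = []
    removed = 0.0
    for dts, size in access_units:
        arrived = delivered(segments, dts)
        if arrived - removed > capacity:
            failures.append((dts, "overflow"))
        if arrived < removed + size:   # access unit not completely in the buffer
            failures.append((dts, "underflow"))
        removed += size
    end_time = max(end for _, end, _ in segments)
    if delivered(segments, end_time) - removed > capacity:
        failures.append((end_time, "overflow"))
    return failures


segments = [(0, 10, 100)]  # 1000 bytes supplied over 10 ticks
print(check_buffer(segments, [(5, 400), (10, 400)], capacity=600))  # []
print(check_buffer(segments, [(5, 400), (10, 400)], capacity=500))  # [(10, 'overflow')]
print(check_buffer(segments, [(3, 400)], capacity=600))             # [(3, 'underflow')]
```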
An application example using a plurality of multiplexed streams is "multiple-path reproduction." The multiple-path reproduction is a function of performing "language credit" (language-dependent video reproduction), "director's cut" (a cut designated by a movie director such as a parental lock), "multi-angles" (pictures taken by a plurality of cameras), and the like, and realizes reproduction of multiple paths according to a user's selection by means of a single application.
FIG. 4 shows an example of multiple-path reproduction. This example is a case including three reproduction paths. Arrows in FIG. 4 indicate that one of the three reproduction paths (i.e., reproduction path-3) is selected for reproduction. Each reproduction path consists of a plurality of multiplexed streams each of which was generated based on an independent time base. Multiplexed streams MBa and MBc are common to all the reproduction paths, and selection can be made among three multiplexed streams MBb(1) to MBb(3) which are located between MBa and MBc. For example, reproduction path-1 is formed by continuous reproduction of the three multiplexed streams MBa, MBb(1), and MBc. Reproduction path-2 and reproduction path-3 are formed in similar manners.
For example, respective multiplexed streams of the multiple-path reproduction of FIG. 4 are recorded or transmitted according to an arrangement and an order shown in FIG. 5. The decoding apparatus controls input of the respective multiplexed streams in accordance with a selected reproduction path. FIG. 5 shows an example of a reproduction order that is employed when reproduction path-2 is selected in FIG. 4, and arrows indicate that data may be skipped on a multiplexed stream basis.
Where the input is controlled in the above manner, data of multiplexed streams are actually input to the decoding apparatus in the order of MBa, MBb(2), and MBc, as shown in FIG. 6. In the following, a connection point between adjacent multiplexed streams is called a "discontinuous point" of multiplexed streams.
The above-described STD model does not assume continuous input of multiplexed streams that were generated based on time bases independent of each other. Therefore, it is impossible to reproduce such multiplexed streams without occurrence of any failure in the buffers in the vicinity of discontinuous points. In view of this, it is conceivable to use an E-STD model shown in FIG. 7 which is an extended version of the STD model.
In the E-STD model shown in FIG. 7, switches 23, 27, 28, and 31 are added to the STD model of FIG. 3. In the vicinity of a discontinuous point of continuously input multiplexed streams, these switches select the reference clock that is referred to in input control of packs, decoding control of respective elementary streams, and output control. The STC is input, as a reference clock, to terminal a of each of the switches 23, 27, 28, and 31 while STC-α is input to terminal b thereof.
However, it is assumed that in singly reproducing each of multiplexed streams (MBa and MBb) before and after a discontinuous point, connection is made to terminal a in all the switches 23, 27, 28, and 31 and the E-STD model operates in the same manner as the STD model.
When connection is made to terminal b in all the switches 23, 27, 28, and 31, a STC-α value is referred to as a reference clock as shown in FIG. 7. As described below, a proper value is set as a variable α.
For example, the value of the variable α may be set so that a decoding time of the last access unit of a preceding video stream MBa plus its display (i.e., output) cycle coincides with a display (i.e., output) time of the first access unit of an ensuing video stream MBb plus the value of the variable α.
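The condition above can be solved for the variable α. The following is a minimal sketch that follows the wording of the preceding paragraph and reads STC-α as the STC value minus α; the function name, the parameter names, and the numerical values (in 90 kHz time stamp units) are hypothetical.

```python
def compute_alpha(last_time_a, cycle_a, first_pts_b):
    """Return alpha such that last_time_a + cycle_a == first_pts_b + alpha.

    last_time_a: time of the last access unit of the preceding stream MBa (time base of MBa)
    cycle_a:     display (output) cycle of that access unit
    first_pts_b: display (output) time of the first access unit of the ensuing stream MBb
                 (time base of MBb)
    """
    return last_time_a + cycle_a - first_pts_b


# Illustrative values only: a cycle of 3003 ticks corresponds to about 29.97 frames per second.
alpha = compute_alpha(last_time_a=900000, cycle_a=3003, first_pts_b=18000)
print(alpha)  # 885003

# When the STC reaches 903003, the clock STC - alpha equals the first PTS of MBb.
print(903003 - alpha)  # 18000
```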
The switch 23 operates to perform input control of respective packs of a multiplexed stream. At a discontinuous point of multiplexed streams, the connection of the switch 23 is changed to terminal b at a time instant when the last pack of a preceding multiplexed stream MBa is input to the demultiplexer 24. Thereafter, the input of respective packs is controlled by comparing a value of the reference clock (STC-α) that is input to terminal b with a SCR value that is described in each pack header of an ensuing multiplexed stream MBb.
The switch 27 operates to perform decoding control. At a discontinuous point, the connection of the switch 27 is changed to terminal b at a time point obtained by adding the decoding cycle of the last access unit of a video stream of a preceding multiplexed stream MBa to the time when that access unit is decoded by the video decoder 29. Thereafter, the decoding is controlled based on values of the reference clock (STC-α) that is input to terminal b and DTS values of video streams included in an ensuing multiplexed stream MBb.
The switch 31 operates to perform display control on video streams. At a discontinuous point, the connection of the switch 31 is changed to terminal b at a time point obtained by adding the display cycle of the last access unit of a video stream of a preceding multiplexed stream MBa to the time when that access unit is displayed. Thereafter, the display is controlled based on values of the reference clock (STC-α) that is input to terminal b and PTS values of video streams included in an ensuing multiplexed stream MBb.
The switch 28 operates to control output of audio streams. At a discontinuous point, the connection of the switch 28 is changed to terminal b at a time point obtained by adding the output cycle of the last access unit of an audio stream of a preceding multiplexed stream MBa to the time when that access unit is output. Thereafter, the output is controlled based on values of the reference clock (STC-α) that is input to terminal b and PTS values of audio streams included in an ensuing multiplexed stream MBb.
At an instant when the connections of all the switches 23, 27, 28, and 31 are changed to terminals b, the STC value is reset to a value of the reference clock (STC-α) that is input to terminal b and, at the same time, the connections of all the switches 23, 27, 28, and 31 are changed to terminals a. Thereafter, the same control as in the case of the STD model is performed.
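The switching sequence can be summarized in a minimal sketch. It assumes that each switch changes to terminal b at its own switching time and that the STC is reset at the instant the last of the four switches has changed; the function names and the numerical values are hypothetical.

```python
def clock_seen(stc, alpha, switch_time):
    """Reference clock seen through one switch at STC time stc.

    Before switch_time the switch is at terminal a (STC);
    from switch_time on it is at terminal b (STC - alpha).
    """
    return stc if stc < switch_time else stc - alpha


def stc_reset_time(switch_times):
    """All switches are at terminal b once the latest switching time is reached."""
    return max(switch_times.values())


# Illustrative switching times, expressed on the time base of the preceding stream MBa.
switch_times = {"switch 23": 100, "switch 27": 150, "switch 28": 170, "switch 31": 180}
alpha = 50

print(clock_seen(90, alpha, switch_times["switch 23"]))   # 90 (terminal a)
print(clock_seen(120, alpha, switch_times["switch 23"]))  # 70 (terminal b)
print(stc_reset_time(switch_times))                       # 180
```

In this sketch, input control already refers to STC-α at time 120 while decoding and output control still refer to the STC, which is the period in which the two time bases coexist.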
Even in the E-STD model, the buffer (decoding buffer) 25 or the buffer (decoding buffer) 26 may fail at a discontinuous point of multiplexed streams. Such a case will be described below. It is assumed that multiplexed streams before and after a discontinuous point are denoted by MBa and MBb, and that they were generated based on independent time bases TBa and TBb.
In the following description, it is assumed that TBa(i), for instance, is a time based on the time base TBa and represents a time when an ith access unit is output from a subject buffer. A time when data supply to all the buffers (i.e., buffers 25 and 26) is finished in singly reproducing the multiplexed stream MBa is assumed to be time TBa_end which is based on the time base TBa. Further, a data supply start time in singly reproducing the multiplexed stream MBb is assumed to be time TBb_start which is based on the time base TBb.
FIGS. 8A-8C show an example of an overflow in a buffer (buffer 25 or 26) of the E-STD model. The vertical axis represents the buffer occupation amount and the horizontal axis represents the time. FIGS. 8A and 8B relate to the same buffer of the E-STD model, and show variations in the buffer occupation amounts according to the time bases TBa and TBb in singly reproducing the multiplexed streams MBa and MBb, respectively.
FIG. 8A shows only an end portion of the multiplexed stream MBa while FIG. 8B shows only a head portion of the multiplexed stream MBb. It is assumed that as shown in FIGS. 8A and 8B each multiplexed stream was formed by multiplexing so as to assure that data is supplied without causing a failure in a buffer in singly reproducing it.
FIG. 8C shows a variation in the buffer occupation amount according to the time base TBa when the multiplexed streams MBa and MBb are reproduced continuously. As seen from FIG. 8C, at an instant (time TBa_end) when data supply of the multiplexed stream MBa is finished, data supply of the multiplexed stream MBb is started. There is a problem that a buffer overflow may thereafter occur at time TBa_overflow which is based on the time base TBa.
This is caused by the fact that the data supply of the multiplexed stream MBb is scheduled irrespective of (independently of) that of the multiplexed stream MBa; that is, in multiplexing to form the multiplexed stream MBb no consideration is made of how the buffer is occupied as data, particularly of an end portion, of the preceding multiplexed stream MBa is supplied.
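The overflow case can be illustrated numerically. The following is a minimal sketch, assuming that data of the multiplexed stream MBb is supplied at a constant rate from time TBa_end and that no access unit is output from the buffer before a given next removal time; the function name, the parameter names, and the values are hypothetical.

```python
def overflow_time(tba_end, occupancy_at_end, rate_b, next_removal, capacity):
    """Return the time (on time base TBa) at which the buffer overflows, or None.

    occupancy_at_end: bytes of MBa still held in the buffer at time TBa_end
    rate_b:           supply rate of MBb in bytes per tick (must be positive)
    next_removal:     time at which the next access unit is output from the buffer
    """
    peak = occupancy_at_end + rate_b * (next_removal - tba_end)
    if peak <= capacity:
        return None
    # Occupancy grows linearly, so the capacity is reached after this delay.
    return tba_end + (capacity - occupancy_at_end) / rate_b


# The buffer still holds 800 of 1000 bytes when supply of MBb starts at time 50.
print(overflow_time(50, 800, 100, next_removal=55, capacity=1000))  # 52.0
print(overflow_time(50, 800, 100, next_removal=51, capacity=1000))  # None
```

The first call corresponds to the situation of FIG. 8C: each stream is valid by itself, but the data remaining from MBa leaves too little room for the data of MBb.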
FIGS. 9A-9C show an example of an underflow in a buffer (buffer 25 or 26) of the E-STD model. The vertical axis represents the buffer occupation amount and the horizontal axis represents the time. FIGS. 9A and 9B relate to the same buffer of the E-STD model as in the case of FIGS. 8A and 8B, and show variations in the buffer occupation amounts according to the time bases TBa and TBb in singly reproducing the multiplexed streams MBa and MBb, respectively.
FIG. 9A shows only an end portion of the multiplexed stream MBa while FIG. 9B shows only a head portion of the multiplexed stream MBb. It is assumed that as shown in FIGS. 9A and 9B each multiplexed stream MBa or MBb was formed by multiplexing so as to assure that data is supplied without causing a failure in a buffer in singly reproducing it.
FIG. 9C shows a variation in the buffer occupation amount according to the time base TBa when the multiplexed streams MBa and MBb are reproduced continuously. As seen from FIG. 9C, at an instant (time TBa_end) when data supply of the multiplexed stream MBa is finished, data supply of the multiplexed stream MBb is started.
However, when the next access unit (ASb(1): the first access unit of the multiplexed stream MBb) is output at time TBa(n+1), not all data of the access unit ASb(1) has reached the buffer. Therefore, there is a problem that a buffer underflow may occur at time TBa(n+1).
This is because the data supply of the multiplexed stream MBa is finished too late, which delays the start of the data supply of the multiplexed stream MBb, which in turn prevents the data of the first access unit ASb(1) of the multiplexed stream MBb from arriving in time. Time TBa(n+1) is time TBa(n) when the last access unit of the multiplexed stream MBa is output from the buffer plus its decoding cycle.
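The underflow case can be illustrated in the same way. The following is a minimal sketch, assuming that the data supplied first from the multiplexed stream MBb to the subject buffer is the access unit ASb(1) and that it is supplied at a constant rate from time TBa_end; the function name, the parameter names, and the values are hypothetical.

```python
def splice_underflows(tba_end, tba_n, decoding_cycle, first_unit_size, rate_b):
    """Return True if ASb(1) has not fully arrived when it should be output.

    tba_n:           time TBa(n) at which the last access unit of MBa is output
    decoding_cycle:  decoding cycle of that access unit
    first_unit_size: size of ASb(1) in bytes
    rate_b:          supply rate of MBb in bytes per tick (must be positive)
    """
    tba_n_plus_1 = tba_n + decoding_cycle                  # time TBa(n+1)
    arrival_complete = tba_end + first_unit_size / rate_b  # last byte of ASb(1) arrives
    return arrival_complete > tba_n_plus_1


# Supply of MBa ends late (time 95): ASb(1) is complete at 105, after TBa(n+1) = 103.
print(splice_underflows(95, 100, 3, 1000, 100))  # True
# Supply of MBa ends earlier (time 90): ASb(1) is complete at 100, in time.
print(splice_underflows(90, 100, 3, 1000, 100))  # False
```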