1. Field of the Invention
The present invention relates to a method and apparatus for encoding a digital signal, a recording medium used for storing the digital signal and a method and apparatus for transmitting the digital signal, which are suitable for recording a dynamic image signal or a sound signal on a recording medium such as magneto-optical disk or magnetic tape and reproducing these signals from the recording medium so as to display a video image on a monitor, or for transmitting the dynamic image signal or the sound signal through a transmission line from a transmitting side to a receiving side where video image or sound is reproduced, in video conference system, video telephone system, broadcasting equipment or the like.
2. Description of the Related Art
In recent years, picture signals and speech signals have in many cases been encoded according to the so-called MPEG system after being subjected to A/D conversion. This applies both when the signals are stored on a recording medium such as a magneto-optical disk or magnetic tape and reproduced therefrom to display a video image with sound, and when they are transmitted through a given transmission line from a transmitting side to a receiving side where the video image or sound is reproduced, as in a video conference system, video telephone system or the like.
Here, the afore-mentioned "MPEG" is an abbreviation of Moving Picture Experts Group, which is an organization for investigating the encoding of dynamic images to be stored, belonging to ISO/IEC JTC1/SC29 (International Organization for Standardization/International Electrotechnical Commission, Joint Technical Committee 1/Sub-Committee 29). ISO11172 is prescribed as the MPEG1 standard while ISO13818 is prescribed as the MPEG2 standard. In these international standards, "multiplexing of multi-media" is standardized in ISO11172-1 and ISO13818-1 while "picture image" is standardized in ISO11172-2 and ISO13818-2.
Since the picture signal and the speech signal are usually handled at the same time, it is common that a plurality of data including the picture signal, the speech signal and related information data are multiplexed so as to be recorded and transmitted together. When these signals are reproduced, the multiplexed data is separated or demultiplexed into individual kinds of data, which are then decoded and reproduced in a synchronous manner.
In the case where these data are multiplexed, the given number of picture signals and speech signals are individually encoded to produce encoded streams, and thereafter these encoded streams are multiplexed.
The multiplexed stream is prescribed in the MPEG system (ISO/IEC13818-1 or ISO/IEC11172-1). In the following, the structure of the decoder model and the multiplexed stream prescribed in the MPEG system are explained. For simplicity, the explanation is made in the context of the MPEG2 (ISO/IEC13818-1) program stream and the MPEG1 system (ISO/IEC11172-1) stream. However, it will be appreciated that the principle for decoding the MPEG2 program stream is equally applicable to decoding of the MPEG2 system transport stream (ISO/IEC13818-1).
In the MPEG system, a virtual decoder model (STD: system target decoder) is prescribed. The multiplexed stream is defined therein such that the stream can be correctly decoded by the system target decoder (STD), i.e., it can be decoded without causing inappropriate operative condition of buffer such as overflow or underflow of data.
Next, the operation of the system target decoder (STD) is described. FIG. 1 illustrates a schematic arrangement of an example of the system target decoder STD. FIGS. 2A and 2B illustrate the structures of the MPEG2 program stream and MPEG2 transport stream, respectively.
The system target decoder STD includes therein a reference clock called a system time clock (STC) 16, which is advanced in predetermined increments. On the other hand, the MPEG2 system program stream is composed of a plurality of packs. The stream has time information called a system clock reference (SCR), which is encoded in a region called a pack header, as shown in FIGS. 2A and 2B. When the time of the STC 16 is equal to the SCR, the decoder reads out the corresponding pack, i.e., a unit of the program stream, at a predetermined rate, i.e., the rate encoded in the "mux_rate" field of the pack header.
The read-out pack is immediately separated or demultiplexed into respective elementary streams such as video stream and audio stream by means of a demultiplexer 11 depending upon a sort of each packet which is a unit of the pack. The respective demultiplexed elementary streams are input to corresponding decoder buffers, i.e., a video buffer 12 and an audio buffer 14.
The packet header has fields for time information called a decoding time stamp (DTS) and a presentation time stamp (PTS), as shown in FIGS. 2A and 2B. The time information represents the decode time and presentation time of a decoding unit (access unit) of each elementary stream. Specifically, the PTS represents the time at which the access unit is displayed, and the DTS represents the time at which the access unit is decoded. However, in the case of an access unit whose DTS and PTS are equal to each other, only the PTS is encoded. When the value of the STC is equal to the value of the DTS, the access unit held in the video buffer 12 or the audio buffer 14 is read out therefrom and supplied to the corresponding decoder, i.e., a video decoder 13 or an audio decoder 15, so as to be decoded.
Thus, in the system target decoder STD, since the decode time information relative to the common reference clock (STC) 16 is encoded in the packet of each elementary stream, video data, audio data or other data can be reproduced in a synchronous manner.
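The time-stamp rule described above can be illustrated with a minimal sketch. The `AccessUnit` class, the function names and the integer time values are hypothetical and chosen only for illustration; the sketch assumes that an access unit carrying no DTS is decoded at its PTS, and that units are handed to their decoders in order of increasing decode time on the common STC.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AccessUnit:
    stream: str                # e.g. "video" or "audio" (illustrative labels)
    name: str                  # illustrative identifier of the access unit
    pts: int                   # presentation time stamp, arbitrary clock ticks
    dts: Optional[int] = None  # decoding time stamp; omitted when equal to PTS


def effective_dts(au: AccessUnit) -> int:
    """Return the decode time: the DTS if encoded, otherwise the PTS."""
    return au.pts if au.dts is None else au.dts


def decode_order(units: List[AccessUnit]) -> List[AccessUnit]:
    """Order access units by the STC value at which each one is decoded."""
    return sorted(units, key=effective_dts)


units = [
    AccessUnit("video", "I", pts=3, dts=1),  # decoded before it is displayed
    AccessUnit("audio", "A0", pts=1),        # DTS omitted, so DTS == PTS
    AccessUnit("video", "B", pts=2),         # DTS omitted, so DTS == PTS
]
order = [au.name for au in decode_order(units)]
```

Because both elementary streams are ordered against the same clock, the video and audio units interleave correctly without any stream-specific logic.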
In addition, upon multiplexing, it is required that the system clock reference SCR which defines a supply time of the pack to the system target decoder STD should be determined so as not to cause overflow or underflow of data in the buffers for the respective elementary streams in the system target decoder STD, and that the access units are packetized. Incidentally, the overflow means that the data supplied to the decoder exceeds a capacity of the buffer, while the underflow means that the access unit to be decoded is not completely supplied to the buffer at the decode time thereof.
In the foregoing, the MPEG2 program stream shown in FIG. 2A is explained. Incidentally, the MPEG2 transport stream as shown in FIG. 2B has the same structure as that of the MPEG2 program stream. A transport stream header as shown in FIG. 2B is constituted by four bytes from the synchronous byte (sync_byte) to the continuity counter, which is prescribed in the afore-mentioned ISO/IEC13818-1. The clock reference and the decode time have the same meanings as those of the MPEG2 program stream shown in FIG. 2A.
The MPEG video data have a structural unit called Group of Pictures (GOP). The structural unit can be encoded independently, i.e., the encoding of the GOP can be done such that when the GOP is decoded, no picture of the preceding GOP is required. Accordingly, if a plurality of video streams are present, they can be switched by GOP or GOPs as a unit for the switching.
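As an illustration of the four-byte transport stream header, the following minimal sketch unpacks its fields. The field names and bit widths follow the transport packet layout of ISO/IEC13818-1; the `TsHeader` type and the `parse_ts_header` function are hypothetical names introduced only for this example.

```python
from typing import NamedTuple


class TsHeader(NamedTuple):
    sync_byte: int                      # 8 bits, always 0x47
    transport_error_indicator: int      # 1 bit
    payload_unit_start_indicator: int   # 1 bit
    transport_priority: int             # 1 bit
    pid: int                            # 13 bits
    transport_scrambling_control: int   # 2 bits
    adaptation_field_control: int       # 2 bits
    continuity_counter: int             # 4 bits


def parse_ts_header(data: bytes) -> TsHeader:
    """Unpack the first four bytes of a transport stream packet."""
    if len(data) < 4:
        raise ValueError("a transport stream header needs at least 4 bytes")
    b0, b1, b2, b3 = data[:4]
    if b0 != 0x47:
        raise ValueError("sync_byte must be 0x47")
    return TsHeader(
        sync_byte=b0,
        transport_error_indicator=(b1 >> 7) & 0x1,
        payload_unit_start_indicator=(b1 >> 6) & 0x1,
        transport_priority=(b1 >> 5) & 0x1,
        pid=((b1 & 0x1F) << 8) | b2,
        transport_scrambling_control=(b3 >> 6) & 0x3,
        adaptation_field_control=(b3 >> 4) & 0x3,
        continuity_counter=b3 & 0x0F,
    )


header = parse_ts_header(bytes([0x47, 0x40, 0x11, 0x1A]))
```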
In the following, there is considered the case where two different kinds of program streams, which have been encoded under the afore-mentioned conditions, i.e., under such conditions that the video stream is encoded every GOP, are independently multiplexed. At this time, however, there is such a limitation that the boundary of each GOP is not present within the video packet, namely video data of pictures immediately before and after the boundary of GOP does not exist within one video packet.
FIGS. 3A through 3C illustrate an example of the case where two program streams are independently multiplexed under the afore-mentioned conditions, and an example of the case where the two program streams are selectively switched from one to another and outputted. As shown in FIG. 3A, the data of GOP0 in video stream V0 are multiplexed over packs PK0 and PK1 of a program stream PS0, and the data of GOP1 of the video stream V0 are multiplexed over packs PK2 and PK3 of the program stream PS0. On the other hand, as shown in FIG. 3B, the data of GOP0 in video stream V1 are multiplexed over packs PK0, PK1 and PK2 of a program stream PS1, and the data of GOP1 of the video stream V1 are multiplexed over a pack PK3 of the program stream PS1.
The two program streams independently multiplexed as shown in FIGS. 3A and 3B, are stored on a common recording medium. If such a system that the thus-stored two program streams are outputted every pack or packs while selectively switching therebetween, for example, by using the read-out device 10 shown in FIG. 1, is now considered, the afore-mentioned independent GOP (Group of Pictures) arrangement enables the video data to be continuously reproduced in a seamless manner when the program streams outputted are switched at switching points.
For example, as shown in FIG. 3C, when the packs PK0 and PK1 of the program stream PS0 are read out and thereafter the pack PK3 of the program stream PS1 is continuously read out, the GOP0 of the video stream V0 and then the GOP1 of the video stream V1 are inputted into the video buffer 12 shown in FIG. 1, so that the video image can be continuously reproduced even if it is switched between the video streams V0 and V1. In this example, although there is described the case where two different program streams are stored in the recording medium, it will be appreciated that the same effects can be obtained when three or more program streams are used. Hereinafter, packs corresponding to these switching points between the GOPs are referred to as entry points.
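The switching of FIG. 3C can be sketched as follows. This is only an illustrative model: each program stream is represented as a list of (pack label, GOP index) pairs matching FIGS. 3A and 3B, and the helper names `entry_point` and `splice` are hypothetical. The entry point of a GOP is taken to be the first pack that carries data of that GOP.

```python
from typing import List, Tuple

Pack = Tuple[str, int]  # (pack label, index of the GOP carried by the pack)

# Pack layout of FIGS. 3A and 3B.
ps0: List[Pack] = [("PS0/PK0", 0), ("PS0/PK1", 0), ("PS0/PK2", 1), ("PS0/PK3", 1)]
ps1: List[Pack] = [("PS1/PK0", 0), ("PS1/PK1", 0), ("PS1/PK2", 0), ("PS1/PK3", 1)]


def entry_point(stream: List[Pack], gop: int) -> int:
    """Return the index of the first pack carrying the given GOP."""
    for index, (_, pack_gop) in enumerate(stream):
        if pack_gop == gop:
            return index
    raise ValueError(f"GOP{gop} not found in stream")


def splice(first: List[Pack], second: List[Pack], gop: int) -> List[Pack]:
    """Read `first` up to the entry point of `gop`, then continue with `second`."""
    return first[:entry_point(first, gop)] + second[entry_point(second, gop):]


output = splice(ps0, ps1, gop=1)
```

With the layout above, the output consists of PK0 and PK1 of PS0 followed by PK3 of PS1, which is the read-out order shown in FIG. 3C.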
Meanwhile, in the case where a plurality of program streams are stored in a recording medium and a read-out device has a function for reading out the program streams while being switched from one to another at entry points, there occasionally arises an inconvenience that these program streams cannot be correctly decoded by a decoder if such plural program streams to be stored on the recording medium are independently multiplexed as in conventional methods. This is caused by the following two reasons.
Reason I: Inconsistency of system clock reference (SCR):
The system clock reference (SCR) encoded in the pack header represents a read-out start time of the pack data inputted to the decoder. For this reason, the system clock references (SCR) of the adjacent two packs to be read-out and input to the decoder are needed to satisfy the following condition:
(SCR encoded in the latter pack) ≥ (SCR encoded in the former pack) + (transfer time of the former pack),
namely,
(SCR encoded in the latter pack) ≥ (SCR encoded in the former pack) + (data length of the former pack)/(read-out rate)
Accordingly, even though the afore-mentioned condition can be satisfied when the program stream PS0 is read out in the order of PK0, PK1, PK2, PK3 . . . (namely, even though the individual program streams are multiplexed so as to satisfy the afore-mentioned condition), if the program streams are switched from one to another at the entry points, for example, such that the packs PK0 and PK1 of the program stream PS0 are first read out and then the pack PK3 of the program stream PS1 is read out, as shown in FIG. 3C, there occasionally arises such a problem that the afore-mentioned condition is no longer satisfied because the program streams PS0 and PS1 are multiplexed separately from each other. That is, when the program streams are read out in the afore-mentioned order, the system time clock (STC) upon the termination of reading-out of the former pack becomes larger than the value of the system clock reference encoded in the latter pack, so that it is impossible to read out the data of the latter pack.
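The condition on adjacent packs can be checked mechanically, as in the following minimal sketch. The `Pack` class, the 90 kHz tick unit, the byte-per-second rate and all numeric values are assumptions made only for illustration and are not taken from any actual stream; integer arithmetic is used so that the comparison is exact.

```python
from dataclasses import dataclass
from typing import List

CLOCK_HZ = 90_000  # assumed clock resolution used for the SCR values below


@dataclass
class Pack:
    scr: int     # system clock reference, in clock ticks
    length: int  # data length of the pack, in bytes
    rate: int    # read-out rate of the pack, in bytes per second


def scr_is_consistent(former: Pack, latter: Pack) -> bool:
    """SCR(latter) >= SCR(former) + length(former) / rate, without rounding."""
    return (latter.scr - former.scr) * former.rate >= former.length * CLOCK_HZ


def find_scr_violations(packs: List[Pack]) -> List[int]:
    """Return the indices of packs whose SCR is earlier than allowed."""
    return [
        i for i in range(1, len(packs))
        if not scr_is_consistent(packs[i - 1], packs[i])
    ]


# Two streams multiplexed independently; each one is consistent on its own.
ps0 = [Pack(scr=180 * i, length=2048, rate=1_024_000) for i in range(4)]
ps1 = [Pack(scr=90 * i, length=1024, rate=1_024_000) for i in range(4)]

# Read-out order of FIG. 3C: PK0 and PK1 of PS0, then PK3 of PS1.
spliced = [ps0[0], ps0[1], ps1[3]]
violations = find_scr_violations(spliced)
```

In this example the transfer of PK1 of PS0 ends at tick 360, whereas PK3 of PS1 carries an SCR of 270, so the spliced sequence violates the condition at its third pack even though neither stream does so by itself.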
Reason II: Inappropriate operative condition of buffer (Overflow and/or underflow of data in buffer):
When the program streams to be read out are switched from one to another by the read-out device, the inappropriate operative condition of the decoder buffer such as overflow or underflow is likely to occur.
The afore-mentioned Reason II is explained in detail below by referring to FIGS. 4A through 4C. FIGS. 4A through 4C illustrate the transition in memory amount of the video decoder buffer occupied by the data. FIG. 4A shows the condition of the decoder buffer, where, for example as shown in FIG. 3A, the program stream PS0 is regularly read out in the order of the packs PK0, PK1, PK2, PK3 and so on. In FIG. 4A, the region (a) represents the amount of data in the buffer which is occupied by the GOP0 of the video stream V0 and the region (b) represents the amount of data in the buffer which is occupied by the GOP1 of the video stream V0. FIG. 4B shows the condition of the decoder buffer, where, for example as shown in FIG. 3B, the program stream PS1 is regularly read out in the order of the packs PK0, PK1, PK2, PK3 and so on. In FIG. 4B, the region (c) represents the amount of data in the buffer which is occupied by the GOP0 of the video stream V1 and the region (d) represents the amount of data in the buffer which is occupied by the GOP1 of the video stream V1.
Each of the program streams shown in FIGS. 4A and 4B is continuously formed. Therefore, the program streams are multiplexed such that the decoder buffer does not fall into an inappropriate operative condition such as overflow or underflow. However, if the multiplexed program streams are read out while being switched from one to another by the read-out device such that the packs PK0 and PK1 of the program stream PS0 are first read out in this order and then the pack PK3 of the program stream PS1 is read out, as shown in FIG. 3C, the decoder buffer is supplied first with the data of the GOP0 of the video stream V0 and then with the data of the GOP1 of the video stream V1. As a result, the amount of data occupied by these GOPs in the decoder buffer is in the condition as shown in FIG. 4C. In FIG. 4C, the region (e) represents the amount of data occupied by the GOP0 of the video stream V0 and the region (f) represents the amount of data occupied by the GOP1 of the video stream V1.
When the data of the GOP1 of the video stream V1 is decoded, the read-out thereof into the decoder buffer is determined by the system clock reference (SCR) while the pulling-out of the data from the decoder buffer is determined by the decoding time stamp (DTS). Both values were determined when the program stream PS1 was multiplexed by itself, so that the timings of supplying the data to and pulling the data out of the decoder buffer are as shown by the region (f) in FIG. 4C, thereby causing overflow of the decoder buffer.
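The buffer behaviour of FIGS. 4A through 4C can be approximated with a simple occupancy model. The following sketch is a simplification under stated assumptions: each GOP is treated as arriving in the buffer at a single instant and as being removed at a single decode time, and the capacity, times and sizes are arbitrary illustrative numbers rather than values read from the figures. The function name `buffer_problems` is hypothetical.

```python
from typing import List, Tuple

Event = Tuple[int, int]  # (time, number of bytes)


def buffer_problems(
    arrivals: List[Event], removals: List[Event], capacity: int
) -> List[Tuple[str, int]]:
    """Replay arrivals and removals in time order and report buffer faults.

    Returns ("overflow", t) when the occupancy exceeds the capacity and
    ("underflow", t) when more data is removed than the buffer holds.
    """
    # Kind 0 sorts before kind 1, so data arriving exactly at a decode
    # time is counted as available for that decode.
    events = [(t, 0, n) for t, n in arrivals] + [(t, 1, -n) for t, n in removals]
    level = 0
    problems = []
    for time, _, delta in sorted(events):
        level += delta
        if level > capacity:
            problems.append(("overflow", time))
        if level < 0:
            problems.append(("underflow", time))
            level = 0
    return problems


CAPACITY = 100

# (arrival, removal) of GOP0 and GOP1, as scheduled when each stream
# was multiplexed on its own.
v0_gop0 = ((0, 60), (5, 60))
v0_gop1 = ((6, 60), (12, 60))
v1_gop0 = ((0, 30), (8, 30))
v1_gop1 = ((2, 60), (12, 60))


def run(*gops):
    arrivals = [g[0] for g in gops]
    removals = [g[1] for g in gops]
    return buffer_problems(arrivals, removals, CAPACITY)


alone_ps0 = run(v0_gop0, v0_gop1)  # corresponds to FIG. 4A
alone_ps1 = run(v1_gop0, v1_gop1)  # corresponds to FIG. 4B
switched = run(v0_gop0, v1_gop1)   # corresponds to FIG. 4C
```

In this model each stream stays within the capacity when read out alone, but the switched sequence places the GOP1 of V1 into the buffer while the larger GOP0 of V0 is still waiting to be decoded, so the occupancy exceeds the capacity.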
The present invention has been accomplished in view of the afore-mentioned problems encountered in the art.