1. Field of the Invention
The present invention relates to a method and an apparatus for supplying an image material such as a broadcasting program, and more particularly to a method and an apparatus for inserting an image material such as a commercial film into a main image material such as a broadcasting program.
2. Description of the Related Art
In recent years, the so-called MPEG standard has been proposed as a technique for compressing and coding a moving picture signal. MPEG (Moving Picture Experts Group) is the name of a working group of experts that studies standardization of techniques for compressing moving pictures to be stored. This group was established in 1988 under the joint control of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). The technique of compressing moving picture and speech data standardized by this group is referred to as the MPEG system.
The MPEG standards consist of the MPEG1, which is phase 1 of the standardization work, and the MPEG2, which is phase 2 thereof. The difference between them is briefly as follows. The MPEG1 is a standard mainly for storage media such as a CD-ROM, while the MPEG2 is a standard covering a wide range of media, including the applications of the MPEG1.
The MPEG2 output stream is divided into two types, one of which is referred to as a program stream (MPEG2-PS, PS: Program Stream) and the other of which is referred to as a transport stream (MPEG2-TS, TS: Transport Stream). The program stream is intended for storage media, like the MPEG1. The transport stream is intended for transmission media.
This MPEG2 system has a function of multiplexing plural programs into one stream (data train); hence, it can accommodate a TV broadcasting service, for example. Further, it allows free organization of programs and provides expansion functions and additional functions for various applications. To realize those functions, there are provided directory information for facilitating random access and type information for representing the type of each stream.
The MPEG system has the following flow from coding to decoding.
In the flow of coding in an encoder, a video signal and an audio signal are coded while being kept associated with each other. Next, the coded streams are multiplexed by a multiplexer into a format suited to the transmission medium, such as a storage medium or a network, according to the application. Then, the multiplexed data is transmitted or recorded.
In the flow of decoding in a receiver decoder, on the other hand, the received multiplexed stream is separated by a demultiplexer into the respective streams, such as the video signal stream and the audio signal stream, and those separated streams are sent to decoders. Next, each stream is decoded by its decoder and then is output to an output unit (a video monitor or a speaker).
As mentioned above, the MPEG system time-divisionally multiplexes plural coded streams into one stream and, on the receiving side, synchronously separates the multiplexed stream into the individual streams as intended on the transmission side, decodes them, and reproduces them.
The MPEG system employs packet-based multiplexing as its time-divisional multiplexing system. Packet-based multiplexing is a time-divisional transmitting system that divides a video signal and an audio signal into streams of a fixed length called packets, adds additional information such as a header to each packet, and switches between the video packets and the audio packets at proper times when multiplexing the video signal and the audio signal. The packet contains, at its head portion called a header, information identifying an attribute of the signal, which indicates whether the signal is a video signal or an audio signal. In some cases, the packet may contain at its tail a CRC (Cyclic Redundancy Code) for detecting bit errors in transmission.
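The packet-based multiplexing described above can be sketched as follows. This is an illustrative Python sketch, not part of any MPEG specification; the function names (packetize, multiplex), the dictionary layout, and the 4-byte packet length are invented for the example.

```python
def packetize(data, kind, packet_length):
    # Cut a signal into fixed-length packets, each with a header
    # identifying the signal's attribute (video or audio).
    return [{"header": kind, "payload": data[i:i + packet_length]}
            for i in range(0, len(data), packet_length)]

def multiplex(video, audio, packet_length=4):
    # Time-divisionally multiplex by alternating between the two
    # packet streams, then append whichever stream has packets left.
    v = packetize(video, "video", packet_length)
    a = packetize(audio, "audio", packet_length)
    out = []
    for pair in zip(v, a):
        out.extend(pair)
    out.extend(v[len(a):] or a[len(v):])
    return out
```

A demultiplexer on the receiving side would use the header of each packet to route it back to the proper decoder.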
The packet length strongly depends on the transmission medium and the application. The packet length may be short (53 bytes), as in ATM (Asynchronous Transfer Mode), or long (4096 bytes), as in an optical disk system. In the MPEG, the upper limit of the packet length is about 2^16 bytes (64 Kbytes), and the packet length may be fixed or variable so as to provide the packet with flexibility. Further, the MPEG allows a variable transmission speed, so that intermittent transmission of packets is made possible. The fixedly necessary portions, such as the header, do not depend on the packet length. Hence, if the packet is short, the overhead (additional data needed for multiplexing) becomes so large that the transmission efficiency drops. However, a short packet needs only a short switching time for time-divisional multiplexing. Hence, the short packet has the merit of reducing the delay caused by the multiplexing and the amount of buffer memory required.
In the MPEG1 and the MPEG2-PS, the highest layer above the packets of the video signal or the audio signal is called a pack layer. Normally, a pack bundling plural packets is the constitutional unit in which the packets are treated. The pack header contains additional information for referring to a time reference for synchronous reproduction (to be discussed below). The main object of the pack is to provide a capability of decoding and reproducing the stream from a halfway point.
Herein, in the MPEG synchronous system, each decoding and reproducing unit of the video and audio signals, referred to as an access unit (the unit of the video signal is one frame and the unit of the audio signal is one audio frame), contains information called a time stamp for indicating when it is to be decoded and reproduced. The time stamp is given a time reference by the information called the SCR (System Clock Reference).
The time stamp is a tag for managing a time in the decoding and reproducing process. The tag is added to each access unit. The time stamp is divided into two types, one of which is referred to as a PTS (Presentation Time Stamp) and the other of which is referred to as a DTS (Decoding Time Stamp). The PTS is the information for managing the time of reproduction and output. The DTS is the information for managing the time of decoding. If the head of an access unit is contained in a packet, these time stamps are added to the packet header. If no head of an access unit is contained in the packet, no time stamp is added to the packet header. Further, if two or more heads of access units are contained in a packet, only the time stamp corresponding to the first access unit is added to the packet header.
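The stamping rules above can be summarized in a short sketch. This is illustrative only; the function name and the representation of an access-unit head as a (pts, dts) pair are invented for the example, and the rule that only the PTS is sent when the two stamps coincide is taken from the description that follows.

```python
def stamps_for_packet(access_unit_heads):
    # access_unit_heads: list of (pts, dts) pairs, one per access-unit
    # head contained in the packet, in stream order.
    if not access_unit_heads:
        return []                        # no head in the packet: no stamp
    pts, dts = access_unit_heads[0]      # only the first head's stamps count
    if pts == dts:
        return [("PTS", pts)]            # stamps coincide: only the PTS
    return [("PTS", pts), ("DTS", dts)]  # stamps differ: both are added
```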
As to the PTS, when the STC (System Time Clock) located inside a reference decoder of the MPEG system coincides with the PTS, the access unit is reproduced and output. As to the DTS, the MPEG is arranged so that an I picture and a P picture are placed before the B pictures when those pictures are sent out to the coded stream. Hence, the decoding sequence is different from the reproducing and outputting sequence. If the PTS is different from the DTS, both time stamps are added. If they coincide with each other, only the PTS is added.
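The reordering mentioned above, in which an anchor (I or P) picture is transmitted ahead of the B pictures that precede it in display order, can be sketched as follows. This is an illustrative sketch, not an encoder implementation; the picture labels and function name are invented.

```python
def coded_order(display_order):
    # Reorder pictures from display order to coded (transmission) order:
    # B pictures are held back until the next anchor (I or P) picture,
    # which must be decoded first, has been emitted.
    out, pending_b = [], []
    for pic in display_order:
        if pic.startswith("B"):
            pending_b.append(pic)   # hold B pictures until the next anchor
        else:
            out.append(pic)         # emit the anchor (I or P) first
            out.extend(pending_b)   # then the Bs that preceded it
            pending_b = []
    return out + pending_b
```

Because decoding order differs from display order, the DTS of a reordered picture differs from its PTS, which is why both stamps may be needed.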
The SCR (System Clock Reference) and the PCR (Program Clock Reference) are the information for setting and calibrating the STC (basic synchronous signal) value, that is, the time standard, to the value intended on the encoder side in the MPEG system decoder containing the video signal and audio signal decoders. The SCR and PCR values alone, however, are not enough; the timing accuracy (the arrival time at the decoder) of the byte in the stream carrying the SCR or the PCR is also required. In the MPEG2, the SCR or the PCR is composed of six bytes when sent. On the decoder side, at the instant of arrival of the final byte, the STC is required to be set to the value indicated by the SCR or the PCR. The integration of the STC with a PLL (Phase-Locked Loop) makes it possible to provide the decoder with an STC whose frequency completely coincides with the system clock on the encoder side. In the MPEG2-TS (Transport Stream) (to be discussed below), this PLL function has to be given to the decoder.
As mentioned above, the MPEG2 has a multiprogram function that makes it possible to transmit plural programs. This function time-divisionally multiplexes many coded streams in a relatively short transmission unit called a transport packet. Only the MPEG2 provides this multiprogram correspondence.
The stream of the MPEG2 has two kinds of multiplexing and separating systems for corresponding to the multiprogram, one of which is referred to as the PS (Program Stream) and the other of which is referred to as the TS (Transport Stream). The transport packet contains, at its header portion, the information for identifying the content of the packet data. Based on this information, the packets required for reproducing the target program are picked out by a DMUX (demultiplexer) and then decoded.
This transport packet is a relatively short packet with a fixed length of 188 bytes, a result of considering the connectivity with the ATM. An ATM cell carries 47 bytes of real data, since one of the 48 bytes of the payload (user information) of the ATM cell is used for sequence synchronization. One transport packet is thus allowed to be transmitted on four ATM packets (cells). The great difference between the transport stream (TS) and the program stream (PS) is as follows. The program stream (PS) is arranged to group plural packets (called PES (Packetized Elementary Stream) packets in the MPEG2) and compose a pack, while the transport stream (TS) is arranged to re-divide the packet and transmit it on plural transport packets. Hence, the PES packet in the transport stream (TS) serves the role of a pack in the PS (and the MPEG1) and is expanded so that information similar to the pack header may be transmitted in the PES packet.
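The figure of four ATM cells per transport packet follows from simple arithmetic, sketched below; the constant names are illustrative.

```python
ATM_CELL_PAYLOAD = 48    # bytes of payload (user information) per ATM cell
SEQ_SYNC_BYTES   = 1     # one payload byte used for sequence synchronization
TS_PACKET_LENGTH = 188   # fixed transport packet length in bytes

real_data_per_cell = ATM_CELL_PAYLOAD - SEQ_SYNC_BYTES      # 47 bytes
cells_per_ts_packet = TS_PACKET_LENGTH // real_data_per_cell

# 188 bytes divide exactly into four cells of 47 bytes of real data each.
assert real_data_per_cell == 47
assert cells_per_ts_packet == 4
assert cells_per_ts_packet * real_data_per_cell == TS_PACKET_LENGTH
```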
The transport stream needs some kinds of information for corresponding to the multiprogram. These pieces of information indicate which program is selected from the plural programs, which packets are picked up, and how the packets are decoded, for the purpose of transmitting many video signal and audio signal streams. These pieces of program specification information are generally referred to as PSI (Program Specific Information). The PSI is transmitted on a packet having a specific identification code or a packet indicated by the primary PSI. The reference decoder for the transport stream (TS) provides a system buffer memory and a system decoder for the PSI processing. The PSI is described in detail in Program Specific Information, section 2.4.4 of ISO/IEC13818-1.
Next, the data structure of the MPEG2-TS will be described below.
The data structure of the transport packet is analogous to the system of the ATM standardized by the ITU-T (formerly the CCITT), because both treat plural programs. FIG. 1 hierarchically illustrates the data structure of the transport packet; the meaning and the object of each information item will be described below. The transport stream syntax shown in FIG. 1 is specified by ISO13818-1. Hence, the description thereof is kept brief herein.
As shown in FIG. 1A, the transport stream is multiplexed and separated by the transport packet of a fixed length of 188 bytes. This transport packet consists of the header portion and the payload portion.
The header portion of the transport packet is structured as shown in FIG. 1B to FIG. 1D.
As shown in FIG. 1B, the transport packet includes a header composed of a synchronous byte portion, an error indicator portion, a unit start indicator portion, a transport packet priority portion, a PID portion, a scramble control portion, an adaptation field control portion, a cyclic counter portion, and an adaptation field portion.
A synchronous signal of 8 bits is positioned at the synchronous byte portion. The synchronous signal is used by the decoder to detect the head of the transport packet. One bit is positioned at the error indicator portion. This bit is used for indicating the presence or absence of a bit error in this packet. Also, one bit is positioned at the unit start indicator portion. This bit is used for indicating that a new PES packet starts from the payload (effective packet data) of the transport packet. The transport packet priority portion is also composed of one bit, for indicating the significance of this packet. The PID (Packet Identification) portion is composed of stream identification information of 13 bits for indicating an attribute of each stream of the packet. The scramble control portion is composed of two bits for indicating the absence or the presence, and the type, of a scramble of the payload of this packet. The adaptation field control portion is composed of two bits for indicating the absence or the presence of the adaptation field and the payload in this packet. The cyclic counter portion is composed of four bits of information for detecting whether packets with the same PID have been partially discarded along the way; the continuity of the four-bit cyclic counter information is checked. The adaptation field portion may carry additional information about each stream or stuffing bytes (ineffective data bytes) as an option. This makes it possible to transmit a dynamic state change of each stream.
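The bit layout of the 4-byte header described above can be made concrete with a small parser. This is a minimal sketch following the field widths given in the text (and the 0x47 sync-byte value of ISO13818-1); the function name and dictionary keys are invented for the example.

```python
def parse_ts_header(packet: bytes) -> dict:
    # Parse the 4-byte header of a 188-byte transport packet.
    if len(packet) != 188 or packet[0] != 0x47:   # 0x47 is the sync byte
        raise ValueError("not a valid transport packet")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return {
        "error_indicator":       (b1 >> 7) & 0x1,   # 1 bit
        "unit_start_indicator":  (b1 >> 6) & 0x1,   # 1 bit
        "priority":              (b1 >> 5) & 0x1,   # 1 bit
        "pid":                   ((b1 & 0x1F) << 8) | b2,  # 13 bits
        "scramble_control":      (b3 >> 6) & 0x3,   # 2 bits
        "adaptation_field_ctrl": (b3 >> 4) & 0x3,   # 2 bits
        "cyclic_counter":        b3 & 0xF,          # 4 bits
    }
```

A demultiplexer would check the cyclic counter of successive packets with the same PID for continuity, as described above.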
As shown in FIG. 1C, the adaptation field portion is composed of an adaptation field length portion, a discontinuity indicator portion, a random access indicator portion, a stream priority indicator portion, five flags, an optional field portion, and a stuffing byte portion.
The adaptation field length portion carries eight bits for indicating the length of the adaptation field portion. The discontinuity indicator portion carries one bit for indicating that the system clock is reset to a new value in the next packet with the same PID. The random access indicator portion carries one bit for indicating a sequence header of a video signal or the start of a frame of an audio signal. The stream priority indicator portion carries one bit for indicating that a significant portion of the stream is located in the payload of this packet; for example, this corresponds to an intra-coded portion of the video signal. As shown in FIG. 1D, the optional field portion is composed of a PCR (Program Clock Reference) portion of 42 bits, an OPCR (Original PCR) portion of 42 bits, a splice countdown portion of 8 bits, a transport private data length and data portion, and an adaptation field expansion portion. The splice countdown portion carries eight bits for indicating the number of transport packets with the same PID existing up to a splice point (SP). This function makes it possible to insert a CM (that is, to replace part of the stream) at a splice point on the transmission. The stuffing byte portion may carry stuffing bytes of 8×M bits.
As shown in FIG. 1E, the optional field portion is composed of an ltw_valid_flag (legal time window_valid_flag) portion, an ltw_offset (legal time window_offset) portion, a piecewise rate portion, a splice type portion, and a DTS_next_au portion. The splice type portion carries four bits for indicating the specification of MP@ML (Main Profile at Main Level) of the MPEG2. The DTS_next_au portion carries 33 bits for indicating the decoding time of the first access unit succeeding the splice point.
To decode and reproduce the transport stream, it is required to select one of the plural programs and to learn the PIDs (normally, a plurality of PIDs for the video and the audio) of the transport packets of each stream required for decoding and reproducing the selected program. Next, the parameter information and the associating information of each stream are required. Hence, to perform these multi-step operations, it is necessary to obtain several pieces of additional table information (PSI). These pieces of PSI are transmitted on a data structure called a section.
Among the sections, the special information transmitted in the packet with PID=0 is, for example, a program association table (PAT). This table indicates, for each program number (16 bits), the PID of the transport packet which transmits a table (program map table, PMT; a directory table of one program) having the program structure described therein.
The program map table describes an identification number of the program, the PID list of the transport packets on which each stream composing the program, such as a video signal stream or an audio stream, is transmitted, and the accessory information. The reason why the table is divided into the program association table and the program map table is that if everything were described in only one table, the table would be too large, requiring too much memory for storing it and a long time for accessing a program described at the tail of the table.
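The two-step lookup implied above (PAT first, then the PMT it points to) can be sketched as follows. The table contents and PID values here are entirely hypothetical, invented to show the shape of the lookup.

```python
# Hypothetical, simplified PSI tables: the PAT (carried in the PID=0
# packet) maps a program number to the PID carrying that program's PMT,
# and each PMT lists the PIDs of the program's elementary streams.
pat = {1: 0x0100, 2: 0x0200}
pmts = {0x0100: {"video": 0x0101, "audio": 0x0102},
        0x0200: {"video": 0x0201, "audio": 0x0202}}

def pids_for_program(program_number):
    # Step 1: the PAT gives the PID of the program's PMT.
    pmt_pid = pat[program_number]
    # Step 2: the PMT gives the PIDs of the streams to decode.
    return pmts[pmt_pid]
```

Splitting the information this way keeps each table small, which is exactly the memory and access-time motivation stated above.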
The section includes a conditional access table as an option. This table is not necessarily required but is an accessory table for an authorized user to decode and reproduce the scrambled stream for limiting the decoding and the reproduction.
Incidentally, a system for compressing and coding a moving picture, like the foregoing MPEG2, is used for compressing and coding a broadcasting material in a broadcasting station (referred to as a main station) when the main station, which supplies an image material such as a broadcasting program (referred to as a broadcasting material or a program material), transmits the broadcasting material to each station composing the broadcasting network (referred to as a network station). The compressed and coded stream transmitted from the main station to the network stations is made to be a transport stream (TS).
When the network station receives the transport stream of the broadcasting material from the main station, the network station operates to insert its own material, such as a CM image (simply referred to as a CM), into the transport stream of the broadcasting material and then retransmit or broadcast the resulting stream. The material to be inserted into the broadcasting material is referred to as an inserting material.
Herein, assume that a splice of plural inserting materials is inserted into the transport stream of the broadcasting material, the inserting materials having been compressed and coded in advance by a compressing and coding method like the MPEG2. If these inserting materials have bit rates different from one another, the following problem takes place.
That is, in the MPEG system, the coded bit stream has to meet the conditions required by a virtual buffer verifier called the VBV (video buffering verifier). For example, consider that two inserting materials are spliced. If these inserting materials have bit rates different from each other, the buffer occupancy control used in coding one inserting material is pulled by the bit rate of the next inserting material spliced thereto.
In the MPEG system, at first the buffer occupancy of the VBV is empty, and the VBV is filled with data from the bit stream only for the time given by the vbv_delay located in the picture header of the MPEG syntax. The inserting materials have vbv_delay values different from one another. Hence, random combination of the inserting materials is not possible.
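The conflict caused by differing vbv_delay values can be made concrete with a one-line model: the initial VBV fill is simply the number of bits that arrive during vbv_delay at the stream's bit rate. The function name and the example figures below are illustrative, not taken from any standard.

```python
def initial_fill(bit_rate_bps, vbv_delay_s):
    # Bits accumulated in the VBV buffer before the first picture
    # is pulled out: fill time multiplied by the arrival rate.
    return bit_rate_bps * vbv_delay_s

# Two materials coded with different vbv_delay values expect different
# initial fills, so splicing them at random breaks the assumption each
# was coded under.
fill_a = initial_fill(4_000_000, 0.5)   # material A
fill_b = initial_fill(4_000_000, 0.2)   # material B
```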
As a result of splicing the inserting materials, even if the buffer occupancies of the VBV are made continuous, it is not guaranteed that the presentation times are continuous at the splice point. At a splice point where the presentation times are not continuous, it is presumed that the picture is frozen in later decoding.
Further, the amount of bits generated for each picture cannot be exactly grasped until the picture is coded. For some patterns, hence, the estimate used in the buffer control does not match the exact amount. This means that it is difficult for the inserting materials to reach the target buffer occupancy amount.
As mentioned above, in the case that the inserting materials are independently coded and the buffer constraint on the last picture of each inserting material is not completely satisfied, if the inserting materials are randomly switched and combined on the MPEG stream, an overflow or underflow of the VBV buffer is brought about. Thus, the resulting material containing the spliced inserting materials does not meet the regulations of ISO13818-2 and Annex C of ISO11172-2. That is, reproduction of the material is made impossible.
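The overflow and underflow conditions described above can be illustrated with a simplified model of the verifier: bits arrive at a constant rate, and at each frame period the decoder instantaneously removes the coded size of one picture. This is a sketch under those assumptions, not the full Annex C algorithm; the function name and parameters are invented for the example.

```python
def check_vbv(bit_rate, buffer_size, initial_fill, picture_bits, frame_period):
    # Simulate the VBV buffer occupancy over a sequence of pictures.
    # Returns "overflow", "underflow", or "ok".
    occupancy = initial_fill
    for bits in picture_bits:
        occupancy += bit_rate * frame_period   # bits arriving in one period
        if occupancy > buffer_size:
            return "overflow"                  # more bits arrived than fit
        if bits > occupancy:
            return "underflow"                 # picture not fully present yet
        occupancy -= bits                      # decoder pulls the picture out
    return "ok"
```

In this model, splicing a stream whose last picture leaves the wrong occupancy for the next stream's assumptions is exactly what pushes the trajectory over the buffer size or below the next picture's coded size.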
Hence, the inserting unit of each CM cannot be managed independently. Instead, each material such as a CM inside the inserting section has to be coded for each combination of the materials and managed in that form by the material server for saving the inserting materials.
The status that the overflow or the underflow of the VBV buffer takes place will be described with reference to FIGS. 2 to 5.
FIG. 2 shows the case that the limit of the target buffer occupancy is not met when using a receiver decoder that accesses the vbv_delay of the picture header at each time. That is, FIG. 2 shows the relation between the transport stream (TS) arriving at a constant rate and the VBV buffer, and the relation between the input video data (picture sequence) arriving at regular intervals and the transport stream (TS). The inclination of the lines indicating the buffer occupancy shown in FIG. 2A represents a bit rate. The vertically dropping portions of the lines represent the amount of bits pulled out by the video decoder for reproducing each picture. The pulling timing means the presentation time. As will be understood from FIG. 2, the input video data is compressed to a bit amount according to the information amount of each picture and then is made into the transport stream (TS) having a different number of packets per picture. Further, FIG. 2A shows the change of the buffer occupancy of the VBV buffer on the receiver decoder side when three CMs (CM1, CM2, CM3) are spliced as the inserting materials. FIG. 2B shows the input sequence of the pictures on the encoder side for encoding the CM1, the CM2 and the CM3 and the transmitting sequence of the transport packets. In FIG. 2, I denotes an I picture (Intra-coded picture), P denotes a P picture (Predictive-coded picture), and B denotes a B picture (Bidirectionally predictive-coded picture). Further, SP denotes a splice point, tc denotes the target buffer occupancy originally required when the transport streams are connected at a splice point, ig denotes an input gap, and io denotes an input overlap.
As is understood from FIG. 2, the receiver decoder for accessing vbv_delay of the picture header at each time has to wait for pulling of data from the buffer by vbv_delay shown in FIG. 2A. Hence, no breakup of the VBV buffer takes place.
However, at the period peA shown in FIG. 2A, the picture is frozen on the receiver decoder, so that the display synchronization is disturbed. At the period peB shown in FIG. 2A, the display section is made so short that the picture is broken by exceeding the decoder processing speed, or the display synchronization is disturbed.
FIG. 3 shows the case that the limit of the target buffer occupancy is not met in the case of using a receiver decoder that does not access the vbv_delay of the picture header at each time. That is, FIG. 3 shows the relation between the transport stream (TS) arriving at a constant rate and the VBV buffer, and the relation between the input video data (picture sequence) arriving at regular intervals and the transport stream (TS). Like the case of FIG. 2, in FIG. 3, the input video data is compressed into a bit amount according to the information amount of each picture and is made into a transport stream (TS) having a different number of packets per picture. The inclination of the lines representing the buffer occupancy shown in FIG. 3 represents a bit rate. The vertically dropping portions of the lines represent the amount of bits pulled out by the video decoder for reproducing each picture. The receiver decoder shown in FIG. 3 operates to access the vbv_delay of the picture header only when there exists a sequence_start_code specified by the MPEG. FIGS. 3A and 3C show the change of the buffer occupancy of the VBV buffer on the receiver decoder side when two CMs (CM1 and CM2) are spliced as the inserting materials. FIGS. 3B and 3D show the input sequence of the pictures and the transmission sequence of the transport packets on the encoder side for coding each picture of the CM1 or the CM2. FIGS. 3A and 3B show the case that an underflow of the VBV buffer takes place. FIGS. 3C and 3D show the case that an overflow of the VBV buffer takes place. In these figures, I denotes an I picture, P denotes a P picture, B denotes a B picture, SP denotes a splice point, tc denotes the target buffer occupancy originally required when the transport streams are linked at the splice point, ig denotes an input gap, and io denotes an input overlap.
As is understood from FIG. 3, the receiver decoder that does not access vbv_delay of the picture header at each time operates to pull the data from the VBV buffer at vbv_delay only when at the initializing state (only when a sequence start code exists). In FIG. 3A, at a point poA, the underflow takes place, so that the VBV buffer is broken up. In FIG. 3C, at a point poB, the overflow takes place, so that the VBV buffer is broken up.
FIG. 4 shows the case that materials having bit rates different from each other are spliced in the case of using a receiver decoder that accesses the vbv_delay of the picture header at each time. That is, FIG. 4 shows the relation between the transport stream (TS) arriving at the rate corresponding to each material and the VBV buffer, and the relation between the input video data (picture sequence) arriving at the intervals corresponding to each material and the transport stream (TS). FIGS. 4A and 4C show the change of the buffer occupancy of the VBV buffer on the receiver decoder side when the inserting materials having bit rates different from each other are spliced with each other. FIGS. 4B and 4D show the input sequence of the pictures and the transmission sequence of the transport packets on the encoder side for coding each picture of these inserting materials. FIGS. 4A and 4B concern the case that the bit rate of the spliced materials goes down. FIGS. 4C and 4D concern the case that the bit rate of the spliced materials goes up. Like the case of FIG. 2, in FIG. 4, the inclination of the lines representing the buffer occupancy represents a bit rate, and the vertically dropping portions of the lines represent the amount of bits pulled out by the video decoder for reproducing each picture. In FIG. 4, I denotes an I picture, P denotes a P picture, B denotes a B picture, SP denotes a splice point, and st denotes a point at which the packet is stuffed.
In FIGS. 4A and 4B, the breakup of the VBV buffer does not take place. However, the presentation time is made discontinuous. In FIGS. 4C and 4D, the overflow of the VBV buffer takes place.
FIG. 5 shows the case that materials having bit rates different from each other are spliced with each other in the case of using a receiver decoder that does not access the vbv_delay of the picture header at each time. That is, FIG. 5 shows the relation between the transport stream (TS) arriving at the rate corresponding to each material and the VBV buffer, and the relation between the input video data (picture sequence) arriving at the intervals corresponding to each material and the transport stream (TS). The receiver decoder shown in FIG. 5 operates to access the vbv_delay of the picture header only when there exists a sequence start code specified by the MPEG. FIGS. 5A and 5C show the change of the buffer occupancy of the VBV buffer on the receiver decoder side when the inserting materials having bit rates different from each other are spliced with each other. FIGS. 5B and 5D show the input sequence of the pictures and the transmission sequence of the transport packets on the encoder side for coding each picture of the inserting materials. FIGS. 5A and 5B concern the case that the bit rate of the spliced materials goes down. FIGS. 5C and 5D concern the case that the bit rate of the spliced materials goes up. Like the case of FIG. 2, in FIG. 5, the inclination of the lines representing the buffer occupancy represents the bit rate, and the vertically dropping portions of the lines represent the amount of bits pulled out by the video decoder for reproducing each picture. In FIG. 5, I denotes an I picture, P denotes a P picture, B denotes a B picture, SP denotes a splice point, and st denotes a point at which the packet is stuffed.
As is understood from FIG. 5, the receiver decoder that does not access vbv_delay of the picture header at each time operates to pull out data from the VBV buffer at vbv_delay only at the initializing state (only when there exists a sequence start code). In the case shown in FIGS. 5A and 5B, the underflow takes place, so that the VBV buffer is broken up. In the case shown in FIGS. 5C and 5D, the overflow takes place, so that the VBV buffer is broken up.
The present invention is made in consideration of the foregoing problems. It is an object of the present invention to provide a method and an apparatus for supplying an image material which are arranged to prevent the VBV buffer from being broken up, to keep the splice point continuous, to prevent the picture from being frozen in decoding even if different inserting materials are spliced with each other, and to allow plural inserting materials to be randomly combined with each other.
According to an aspect of the invention, a method for supplying an image material, comprising compressing and coding the image material, generating a coded bit stream meeting a condition requested by a virtual buffer verifier, and adding information of a splice point when splicing the coded bit stream, includes the steps of: a first step of compressing and coding the image material; a second step of compressing and coding the same image material as the image material compressed and coded at the first step; and controlling a bit rate of the coded bit stream composed by compressing and coding the image material at the second step, based on the information about an occurrence amount of bits derived as a result of the first compressing and coding step, and controlling generation of the coded bit stream so that the virtual buffer verifier has a target buffer occupancy at the splice point.
According to another aspect of the invention, an apparatus for supplying an image material, for supplying a bit stream of a specific transmission format composed by compressing and coding the image material, includes: means for describing an insertion point for indicating a location where the image material is to be inserted and information about an inserting material for indicating an image material to be inserted in a section indicated by the insertion point on the bit stream of the specific transmission format.
According to another aspect of the invention, a method for supplying an image material, for supplying a bit stream of a specific transmission format composed by compressing and coding the image material, includes the step of: describing an insertion point for indicating a location where the image material is to be inserted and information about an inserting material for indicating an image material to be inserted into a section indicated by the insertion point on the bit stream of the specific transmission format.
According to another aspect of the invention, a method for inserting an image material, for inserting another image material into a bit stream of a specific transmission format transmitted in the state of compressing and coding an image material, includes the steps of: detecting an insertion point and information about an inserting material from the bit stream of the specific transmission format composed by describing at least the insertion point for indicating a location where the image material is to be inserted and the information about the inserting material for indicating an image material to be inserted into a section indicated by the insertion point; storing inserting materials composed of other image materials; and taking out an inserting material corresponding to the information about the inserting material from the stored inserting materials and inserting it into the section indicated by the insertion point of the bit stream of the specific transmission format.
Further objects and advantages of the present invention will be apparent from the following description of the preferred embodiments of the invention as illustrated in the accompanying drawings.