(1) Field of the Invention
The present invention relates to a system stream creating apparatus for creating a system stream from a video stream and an audio stream which have been generated in accordance with the MPEG standard. More particularly, the present invention relates to a system stream creating apparatus for creating a system stream which easily conforms to the MPEG standard and the DVD standard and to a DVD recorder having the system stream creating apparatus.
(2) Description of Related Art
Recently, DVD-RAM, a phase-change type optical disc having a capacity of several gigabytes, has come on the market. It is expected that the DVD-RAM will be used as a recording/reproducing medium not only for computers but also for other commercial products. This expectation has been enhanced as MPEG (MPEG2), a standard for encoding digital audiovisual (hereinafter referred to as AV) data, has become commercially practical.
MPEG
The AV data recorded on the DVD-RAM conforms to an international standard called MPEG (ISO/IEC13818). The capacity of the DVD-RAM, though several gigabytes, is not enough to record uncompressed digital AV data. The AV data is therefore recorded after it is compressed. The MPEG standard is prevalent as a method for compressing AV data. Thanks to the recent progress in the LSI circuit technology, MPEG codec (compression/decompression LSI) has come into practical use. This has made it possible for DVD recorders to compress/decompress digital data in accordance with the MPEG standard.
MPEG has the following two main characteristics for achieving highly efficient data compression.
The first characteristic is that MPEG compresses moving-image data using the time correlation characteristic found between frames as well as the spatial frequency characteristic which has been used conventionally. For data compression in MPEG, each frame (in MPEG, also referred to as Picture) is classified into I-Picture (Intra-Coded Picture), P-Picture (Predictive-Coded Picture, which is coded with reference to a past Picture), and B-Picture (Bidirectionally Predictive-Coded Picture, which is coded with reference to past and future Pictures).
To achieve trick plays such as fast-forward, rewinding, and a reproduction from any desired point when reproducing data stored in a storage medium, MPEG defines GOP (Group Of Pictures). This is because in MPEG, frames are not complete in themselves and, as described above, video data is encoded based on prediction using past and future frames. As a result, frames are divided into groups of Pictures, the groups being called GOPs, each of which includes at least one I-Picture. With such a construction, random access is available.
The second characteristic of MPEG is that the amount of coding is assigned dynamically in units of Pictures in proportion to the complexity of images. In MPEG, the decoder includes an input buffer in which data is stored beforehand. This construction enables complicated images to be assigned a great amount of coding.
The audio data for DVD-RAM is encoded with one of three methods: MPEG audio or Dolby Digital (AC-3), which compress the data, or LPCM, which does not compress the data. Dolby Digital and LPCM use fixed bit rates. MPEG audio uses a variable bit rate in which audio frames are generated at fixed intervals but with different sizes.
The AV data is multiplexed into one stream with the "MPEG system" method. FIG. 1 shows the construction of the MPEG system. In the drawing, "21" represents a pack header, "22" a packet header, and "23" a payload. The MPEG system has a hierarchical structure including packs and packets. Each packet is composed of a packet header 22 and a payload 23. The AV data is divided into portions of an appropriate size from the start of the AV data. The payload 23 stores a piece of divided data. The packet header 22 includes a stream ID, DTS (Decoding Time Stamp), and PTS (Presentation Time Stamp) as information of the AV data stored in the payload 23. The stream ID is used to identify the AV data stored in the payload 23. The DTS indicates a time when the AV data stored in the payload 23 is decoded and is represented with accuracy of 90 kHz. It should be noted here that the packet header 22 does not include DTS when, for example, audio data is decoded and presented at the same time. The pack is a unit including a plurality of packets. In DVD-RAM, one packet is used as one pack. Therefore, each pack is composed of a pack header 21 and a packet (composed of a packet header 22 and a payload 23). In the pack header, SCR (System Clock Reference) is recorded. The SCR indicates a time when the AV data stored in the pack is input into the decoder buffer, with accuracy of 27 MHz. For DVD-RAM, the types of the packets are determined in accordance with the type of the stream the DVD-RAM stores. The packet types are, for example, the video stream packet for storing MPEG video data, the audio stream packet for storing MPEG audio data, the private stream 2 packet for storing Dolby AC-3 audio data, and the padding stream packet for storing dummy data which is discarded by the demultiplexer during decoding.
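The two clock resolutions mentioned above (90 kHz for DTS/PTS, 27 MHz for SCR) can be illustrated with a minimal sketch. The function names are hypothetical, and the frame period of 1001/30000 second used in the usage note is an assumption chosen to match a frame cycle of about 33.3667 msec.

```python
from fractions import Fraction

PTS_HZ = 90_000        # resolution of DTS/PTS stated in the text
SCR_HZ = 27_000_000    # resolution of SCR stated in the text

def seconds_to_pts(seconds):
    """Convert a time in seconds to 90 kHz ticks (rounded down)."""
    return int(Fraction(seconds) * PTS_HZ)

def seconds_to_scr(seconds):
    """Convert a time in seconds to 27 MHz ticks (rounded down)."""
    return int(Fraction(seconds) * SCR_HZ)

def pts_to_scr(pts_ticks):
    """Express a 90 kHz time stamp on the 27 MHz scale (factor of 300)."""
    return pts_ticks * (SCR_HZ // PTS_HZ)
```

For example, under the assumed frame period, seconds_to_pts(Fraction(1001, 30000)) gives 3003 ticks, and pts_to_scr(3003) gives 900900 ticks on the 27 MHz scale.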
The DVD-ROM records such an MPEG system stream so that one pack has one sector (=2,048 bytes).
Now, the decoder for decoding the above MPEG system stream will be described. FIG. 2 shows a decoder model (P-STD) for the MPEG system decoder. The decoder includes STC (System Time Clock) 31 which clocks the standard time in the decoder, a demultiplexer 32 which decodes and demultiplexes the system stream, a video buffer 33 for a video decoder, the video decoder 34, a re-order buffer 35 which temporarily stores the I- and P-Pictures for the purpose of rearranging the I-, P-, and B-Pictures for presentation, a switch 36 which adjusts the output order of the I-, P-, and B-Pictures stored in the reorder buffer, an audio buffer 37 for an audio decoder, and the audio decoder 38.
The MPEG system decoder with the above construction processes the MPEG system stream as follows. When the time of the STC 31 matches the SCR written in the pack header in a pack, the demultiplexer 32 inputs the pack. The demultiplexer 32 decodes the stream ID of the packet header, and transfers the data in the payload to the decode buffer corresponding to the stream. The demultiplexer 32 also extracts the PTS and DTS from the packet header. The video decoder 34 extracts Picture data from the video buffer 33 when the time of the STC 31 matches the DTS, decodes the Picture data, stores the I- and P-Pictures in the re-order buffer 35, and outputs the B-Pictures for presentation. The switch 36 is positioned on the side of the re-order buffer 35 when the video decoder 34 decodes an I- or P-Picture, and on the side of the video decoder 34 when the video decoder 34 decodes a B-Picture. The audio decoder 38 extracts data of one audio frame from the audio buffer 37 when the time of the STC 31 matches the PTS (for audio data, there is no DTS) and decodes the extracted data.
For MPEG, "0x00" (in this document, "0x" indicates that the succeeding numerals represent a hexadecimal value) has special meaning. Every meaningful group of data in MPEG begins with a 4-byte identification code. For example, the pack header begins with a 4-byte code, "0x000001BA", the GOP "0x000001B8", and the Picture "0x00000100". "0x00" is referred to as "next start code" since a sequence of two "next start codes" and one "0x01" indicates the start of a meaningful group of data. In MPEG, there is no limit to the number of successive "next start codes", but when "0x01" is found, the position two next start codes before the "0x01" is recognized as the start of the meaningful group of data. The "next start codes" before these codes are skipped by the decoder during reproduction.
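The recognition rule described above (any run of 0x00 bytes, with the group starting two bytes before the 0x01) can be sketched as follows. This is an illustration only; the function name is hypothetical and the scan ignores everything except the 3-byte prefix and the code byte that follows it.

```python
def find_start_codes(data: bytes):
    """Return (offset, code_byte) for every 0x00 0x00 0x01 prefix found.

    The offset points at the first of the two 0x00 bytes immediately
    preceding 0x01; any earlier 0x00 bytes are simply skipped.
    """
    results = []
    i = 0
    while True:
        i = data.find(b"\x00\x00\x01", i)
        if i < 0 or i + 3 >= len(data):
            break                      # no prefix left, or no code byte after it
        results.append((i, data[i + 3]))
        i += 3                         # continue scanning after the prefix
    return results
```

Applied to a buffer that begins with four 0x00 bytes followed by 0x01 0xBA, the scan reports offset 2, so the two surplus 0x00 bytes are skipped, as described above.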
Now, a method of multiplexing data into the MPEG system stream will be described with reference to FIGS. 3A to 3D. FIG. 3A shows a video frame. FIG. 3B shows a video buffer. FIG. 3C shows an MPEG system stream. FIG. 3D shows audio data. The horizontal axis shows a time axis which is common to the drawings. The vertical axis in FIG. 3B indicates the amount of buffer occupation (amount of data stored in the video buffer). The thick solid line shows the change in the amount of buffer occupation over time. The slope of the thick solid line is proportional to the video bit rate. The line shows that data is input to the buffer at a certain rate. The reduction in the amount of buffer occupation happening at intervals shows that data has been decoded. The points of intersection of the slanted broken lines and the time axis indicate the data transfer start times when video frames start to be transferred to the video buffer.
The following is an explanation using a complicated image A as an example. As shown in FIG. 3B, since the image A requires a great amount of encoding, the data starts to be transferred to the video buffer at time t1 before the decoding time. The period between the data input start time t1 and the decoding time is referred to as "vbv_delay". According to the standard for DVD-RAM, to ensure the normal operation of the decoder during reproduction, the amount of Pictures generated by the video encoder and the timing with which the system encoder multiplexes should be controlled so that the amount of data in the video buffer shown in FIGS. 3A to 3D stays within the range of 0 to 224 KB. The audio data need not be transferred as early as the video data since it does not require such a dynamic control of the amount of encoding. As a result, it is typical that the audio data is multiplexed a little earlier than the decoding time. Accordingly, among the video data and the audio data to be presented at the same time, the video data starts to be multiplexed earlier than the audio data. In MPEG, a time period during which data is stored in the buffer is defined. According to the definition, all data except for still picture data should be output from the buffer to the decoder within one second after the data is input to the buffer (this definition is called "one-second rule"). As a result, the difference between the video data and the audio data at multiplexing is one second at most (strictly speaking, the difference may be larger than this when the difference with the reorder buffer for the video data is added).
The basic idea of causing the system encoder to store the video data into the packs and inserting the SCRs will be described with reference to FIGS. 4A and 4B. As shown in FIG. 4B, the system stream is composed of a plurality of packs 510. SCR is written in each pack header 511. The system stream has a predetermined value called multiplexing rate Mx. This indicates that the pack 510 is input to the demultiplexer 509 of the decoder at the bit rate of Mx. The multiplexing rate Mx corresponds to the transfer speed on the belt conveyor 501 in the example shown in FIG. 4A. Similarly, the packs correspond to the boxes on the belt conveyor 501, and the video data to the load 503 packed in the box 502. The system encoder 504 adjusts the amount of the video data (=load 503) to be packed in the box 502 and also adjusts the timing with which the box 502 is put on the belt conveyor 501, based on the amount of video data generated by the video encoder 505. The timing adjustment corresponds to the decision of the SCR value. This is because the demultiplexer 509 takes out the load 503 (=video data) from the box 502 the instant STC matches SCR after the box reaches the decoder 506. The extracted video data is temporarily stored in the video buffer 507. When the video encoder 505 generates a large number of Pictures, the boxes 502 having the loads 503 are sequentially put on the belt conveyor 501 with little space in between. In contrast, when the video encoder 505 generates a small number of Pictures, a small number of boxes 502 are put on the belt conveyor, or in some cases, a plurality of loads (=frames) are packed in each box. Also, a cushioning material may be packed in the box to fill in the space. The cushioning material corresponds to dummy data. The video data stored in the video buffer 507 is decoded based on DTS written in the packet header 512. As a result, the data stored in the video buffer 507 is reduced by the Picture size.
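The relation between the multiplexing rate Mx and the spacing of SCR values can be sketched numerically. The sketch assumes 2,048-byte packs and a 27 MHz SCR as stated earlier; the function names and the rate values are hypothetical, since the text does not give a value for Mx.

```python
SCR_HZ = 27_000_000   # SCR resolution stated in the text
PACK_BYTES = 2048     # one pack per 2,048-byte sector, as stated in the text

def scr_ticks_per_pack(mux_rate_bps, pack_bytes=PACK_BYTES):
    """Smallest whole number of SCR ticks needed to deliver one pack at Mx."""
    bits_times_clock = pack_bytes * 8 * SCR_HZ
    return -(-bits_times_clock // mux_rate_bps)   # integer ceiling division

def earliest_scrs(first_scr, pack_count, mux_rate_bps):
    """SCR values when packs are sent back to back (boxes with no gaps)."""
    step = scr_ticks_per_pack(mux_rate_bps)
    return [first_scr + i * step for i in range(pack_count)]
```

In terms of the belt conveyor analogy, these are the earliest times at which successive boxes can be placed; the system encoder may choose later SCR values, which leaves gaps between boxes.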
The basics of the video multiplexing performed by the system encoder are to determine the values of SCR (=the timing with which boxes 502 are transmitted), the amount of video data (=load 503 to be packed in the box), and the amount of dummy data (=cushioning material) so that the amount of data stored in the video buffer 507 does not exceed an upper limit, so that the video buffer 507 does not become empty during a time period between an arrival and a consumption of data, and so that the aforesaid one-second rule is not violated.
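The buffer constraints can be checked with a simplified occupancy model such as the sketch below. It assumes that each pack's data arrives instantaneously at its SCR and that each Picture leaves instantaneously at its DTS; the function name is hypothetical, 224 KB is taken as 224 x 1024 bytes (an assumption), and the one-second rule is not checked here.

```python
VIDEO_BUFFER_BYTES = 224 * 1024   # upper limit from the text; 1 KB = 1024 bytes assumed

def check_video_buffer(arrivals, decodes, capacity=VIDEO_BUFFER_BYTES):
    """Return a list of ("overflow" | "underflow", time) violations.

    arrivals: list of (time, nbytes) for data entering the buffer
    decodes:  list of (time, nbytes) for Pictures removed by the decoder
    """
    # At equal times, arrivals (kind 0) are processed before decodes (kind 1).
    events = [(t, 0, n) for t, n in arrivals] + [(t, 1, -n) for t, n in decodes]
    events.sort()
    occupancy = 0
    problems = []
    for t, _kind, delta in events:
        occupancy += delta
        if occupancy > capacity:
            problems.append(("overflow", t))
        elif occupancy < 0:
            problems.append(("underflow", t))
    return problems
```

An empty result means that, under this simplified model, the chosen SCR values and data amounts keep the buffer within its limits.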
Now, the logical construction of DVD-RAM will be described with reference to FIG. 5. The DVD recorder deals with two major files: one management information file; and one or more AV files.
The contents of the management information file will be described with reference to FIG. 6A, using mainly the management information file for video.
The management information file includes two major tables: VOBI (VOB Information) table; and PSGI (PSG Information) table. The VOB is an MPEG program stream. The PSG defines the presentation order of "cells", each of which is a logical presentation unit covering an arbitrary portion or all portions of a VOB. In other words, VOB is a unit of MPEG data, and PSG is a unit used when a player performs presentation.
As shown in FIG. 6A, the VOBI table records the number_of_VOBIs and VOBIs. Each VOBI includes VOB_Type (type of VOB), VOB_Start_PTM (presentation start time), VOB_End_PTM (presentation end time), VOB_REC_PTM (information on the time when the start of the VOB is recorded), and TMAPIs (time map information of VOBUs constituting the VOB).
Accesses to AV files will be described with reference to FIG. 6B. Each AV file is composed of one or more VOBs. The VOBs are consecutively recorded in the AV file. The VOBs in the AV files are managed by the management information files. To access a VOB, the player first accesses the management information file to read the VOB start address and the size. This enables the player to access the VOB. Each VOB is composed of a plurality of VOBUs. The VOBU is, as shown in FIG. 7, a unit of data which is composed of: (1) one or more GOPs of MPEG video data multiplexed as MPEG streams; and (2) a plurality of audio packs interleaved with the GOPs. The presentation time for each VOBU should not go out of a predetermined range. The encoder should generate VOBUs with this in mind. Also, a piece of data belonging to a VOBU should not be included in another VOBU. For example, data of a GOP included in a VOBU should completely be included in the VOBU. FIG. 8A shows an example in which data of a GOP belonging to a VOBU is stored in the last pack of the current VOBU and the first pack of the next VOBU, passing over the boundary between the two VOBUs. A violation of the boundary between VOBUs such as this is not permitted.
Meanwhile, to make the most of the mass-storage optical disc DVD-RAM expected to be a next-generation AV record medium, the following problems should be solved. The present invention provides a DVD recorder which solves the problems and is used to record digital data onto DVD-RAM and reproduce the digital data recorded on DVD-RAM.
The DVD recorder is expected to be a next-generation AV record apparatus and be a commercial recorder that will replace the currently prevalent VTR. However, to replace the VTR, the DVD recorder needs to achieve higher-quality images and higher-level editing functions than the VTR.
With regard to the high-quality images, the variable bit rate technique for MPEG video is useful. In the variable bit rate technique, a greater number of Pictures are assigned to a frame whose image is more complicated and moves more, and a smaller number of Pictures are assigned to a frame whose image is less complicated and moves less. This technique causes successive Pictures to greatly change in the amount of data. Also, the high-quality editing functions are achieved by: (1) the random access function; and (2) the data retrieval function using the management file information, both being characteristics of DVD-RAM.
When these techniques are applied to the data structure of the AV file for the DVD-RAM described earlier, the following problems occur especially when VOBs are generated in real time using a real time encoder.
The variable bit-rate technique for MPEG video assigns a smaller number of Pictures to a frame whose image is less complicated and moves less. When, as a result, the amount of data included in the Pictures is less than the payload 23 in the pack, a plurality of frames are stored in the payload 23. Meanwhile, as shown in FIG. 9B, each frame is decoded every 33.3667 msec during presentation. Here, suppose that five frames of Picture data are stored in the payload 1203 in the pack 1201 as shown in FIG. 9A. Then, the Picture data in "frm5" having reached the decoder waits for at least 100.1001 msec (during which "frm2" to "frm4" are decoded) in the video buffer 33 before it starts to be decoded. Here, if there were no limit to the number of frames included in one pack, and 32 or more frames of Pictures were stored in a pack, the Picture in the 32nd frame would wait in the decoder buffer for more than one second before it is decoded, for the same reason as in the example shown in FIGS. 9A and 9B. This violates the MPEG rule, and may cause the decoder to malfunction during presentation.
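The arithmetic in the example above can be reproduced with a short sketch. It follows the counting used in the text, in which the n-th frame in a pack waits at least while frames 2 to n-1 are decoded; the function names are hypothetical.

```python
FRAME_PERIOD_MS = 33.3667   # decoding interval per frame, from the text

def min_wait_ms(frame_index):
    """Lower bound on the wait of the frame_index-th frame in a pack (1-based).

    Follows the text's counting: frames 2 .. frame_index-1 are decoded
    while the frame waits, that is (frame_index - 2) frame periods.
    """
    return max(frame_index - 2, 0) * FRAME_PERIOD_MS

def first_frame_exceeding(limit_ms=1000.0):
    """Smallest frame index whose minimum wait exceeds limit_ms."""
    n = 2
    while min_wait_ms(n) <= limit_ms:
        n += 1
    return n
```

With these figures the fifth frame waits about 100.1 msec, and the first frame whose minimum wait exceeds one second is the 32nd, which agrees with the text.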
As shown in FIG. 8B, in DVD-RAM, data in a GOP belonging to a VOBU should not be included in another VOBU. That is to say, the GOP must be completely included in the VOBU. In other words, as shown in FIG. 8A, the pack including the last data of a GOP should not include the first data of the next GOP belonging to the next VOBU. However, when the number of Pictures in frames greatly changes depending on the complexity of the images when the variable bit rate technique is used, it is difficult for the video encoder to adjust in real time the number of generated Pictures to match the size of the payload 23 included in the pack.
Also, as shown in FIG. 6B, in DVD-RAM, the VOBU time map is referred to as basic information for accessing AV files. Also, video editing is performed in units of VOBUs. Suppose the audio data "a" and the audio data "b" shown in FIG. 8A are composed of audio frames that have the same presentation time. That is to say, suppose that a1 and b1, a2 and b2, and a3 and b3 have the same presentation time, respectively. Here, since a3 is not adjacent to b3, they happen to be arranged in different but successive VOBUs, sandwiching a video pack and the boundary between the VOBUs. If VOBU#1 were deleted by editing now, only b3 would remain. When only one of two pieces of audio data having the same presentation time is deleted or remains, the two pieces of audio data may cause a difference in the sounds when presented after editing. When a piece of data belonging to a VOBU is included in another VOBU, such an improper operation is caused after editing since the split pieces of data should be presented at the same time as the contents define.
It is therefore an object of the present invention to provide a DVD recorder which easily and surely conforms to the one-second rule and other rules relating to VOBU.
The above object is fulfilled by a system stream creating apparatus for creating a system stream, the system stream being a sequence of fixed-length packs, each pack storing a piece of video stream data, the video stream data being a sequence of picture data, the system stream creating apparatus comprising: a stream data transfer unit operable to extract a piece of picture data having a size of a payload from the video stream data and store the piece of picture data into a fixed-length pack; a header data generating unit operable to write a specified time in a header of the pack storing the piece of picture data, the specified time indicating a time when the piece of picture data of the pack is to be input to a video decoder buffer of a decoding apparatus; a condition judging unit operable to judge, when the header data generating unit writes the specified time, whether a difference between (1) a total number of pieces of picture data to be stored in the video decoder buffer up to the specified time and (2) a total number of pieces of picture data to be decoded by the decoding apparatus up to a unit time before the specified time has reached a predetermined value; a time updating unit operable to update the specified time; and a stop/resume control unit operable to, when the condition judging unit judges that the difference has reached the predetermined value, cause the header data generating unit not to write the specified time and cause the stream data transfer unit to stop storing the piece of video stream data, and when having caused the header data generating unit not to write and having caused the stream data transfer unit to stop storing, cause the time updating unit to update the specified time and cause the condition judging unit to judge whether the difference calculated using the updated specified time has reached the predetermined value, and when the condition judging unit makes the judgement negatively, cause the header data generating unit to write the specified time and cause the stream data transfer unit to resume storing the piece of video stream data.
With the above construction, the number of pictures stored in the video buffer of the decoder during a certain time period can be limited. This makes it possible to easily and surely conform to the one-second rule even if the data is encoded with a variable bit rate or even if data is encoded and recorded in real time.
In the above system stream creating apparatus, the decoding apparatus may decode one piece of picture data every video frame cycle, the unit time may be one second, and the predetermined value may be lower than a result of a division of one second by one video frame cycle.
With the above construction, the number of pictures included in the packs input to the decoding apparatus during one second is limited to a number lower than 30. This makes it possible to easily and surely conform to the one-second rule.
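The stop/resume judgement can be sketched as a loop, following one literal reading of the wording above. The sketch is not a definitive implementation: times are expressed in 90 kHz ticks, a frame cycle of 3003 ticks is assumed, the predetermined value 29 is merely one value lower than 30, the update step of one frame cycle is an assumption, and all names are hypothetical.

```python
from bisect import bisect_right

FRAME_TICKS = 3003    # one video frame cycle in 90 kHz ticks (assumed)
UNIT_TICKS = 90_000   # unit time of one second in 90 kHz ticks

def decoded_up_to(decode_times, t):
    """Number of pictures whose decode time is at or before t (sorted input)."""
    return bisect_right(decode_times, t)

def choose_specified_time(scr, pictures_input, decode_times,
                          limit=29, step=FRAME_TICKS):
    """Delay the specified time until the judged difference is below limit.

    pictures_input: total pictures that would be in the decoder buffer
                    up to the specified time once this pack is written.
    """
    if pictures_input - len(decode_times) >= limit:
        raise ValueError("difference can never fall below the limit")
    # Condition judging unit: compare pictures input up to scr with
    # pictures decoded up to one unit time before scr.
    while pictures_input - decoded_up_to(decode_times, scr - UNIT_TICKS) >= limit:
        scr += step   # time updating unit: writing is withheld, time is updated
    return scr        # judgement negative: write this time and resume storing
```

When the difference is already below the limit, the specified time is returned unchanged; otherwise it is pushed later one step at a time until the judgement becomes negative.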
The above object is also fulfilled by a recorder system comprising: the above system stream creating apparatus; and a recording apparatus which records a system stream generated by the system stream creating apparatus onto a record medium.
The above construction achieves a recorder system for generating a system stream which easily and surely conforms to the one-second rule.
The above recorder system may further comprise: a reading apparatus which reads the system stream from the record medium; and a decoding apparatus which decodes the system stream read by the reading apparatus.
The above construction achieves a recorder system for generating a system stream which easily and surely conforms to the one-second rule and for decoding the generated system stream.
The above object is also fulfilled by a system stream creating apparatus for creating a system stream, the system stream being a sequence of fixed-length packs, each pack storing a piece of video stream data, the video stream data being a sequence of picture data, the system stream creating apparatus comprising: a stream data transfer unit operable to extract a piece of picture data having a size of a payload from the video stream data stored in a video buffer and store the piece of picture data into a fixed-length pack; a header data generating unit operable to write a specified time in a header of the pack storing the piece of picture data, the specified time indicating a time when the piece of picture data of the pack is to be input to a video decoder buffer of a decoding apparatus; a condition judging unit operable to judge, when a piece of picture data having a size of payload is extracted from the video buffer and stored into a fixed-length pack, whether an amount of data stored in the video buffer would be lower than or equal to a predetermined value if the piece of picture data having the size of payload were stored into the video buffer, using a model of change in the amount of data stored in the video buffer, the model being made on an assumption that picture data is input to the video buffer every certain time and a piece of picture data included in each pack is output from the video buffer at a specified time written in a header of each pack; a time updating unit operable to update the specified time; and a stop/resume control unit operable to, when the condition judging unit judges that the amount of data would be lower than or equal to the predetermined value, cause the header data generating unit not to write the specified time and cause the stream data transfer unit to stop storing the piece of picture data, and when having caused the header data generating unit not to write and having caused the stream data transfer unit to stop storing, cause the time updating unit to update the specified time and cause the condition judging unit to judge whether the amount of data stored in the video buffer would be lower than or equal to the predetermined value, and when the condition judging unit judges that the amount of data would exceed the predetermined value, cause the header data generating unit to write the specified time and cause the stream data transfer unit to resume storing the piece of picture data.
With the above construction, even if a pack stores in advance picture data that was expected to be stored a certain time later, the time at which the picture data should be input to the decoding apparatus is delayed. This easily prevents the number of pictures stored in the video buffer of the decoder from breaking the one-second rule.
In the above system stream creating apparatus, the certain time may be a video frame cycle or a slice cycle.
With the above construction, the pictures are generated at the same intervals as the pictures are decoded. This enables the pictures to be stored in the video buffer of the decoder at shorter intervals than the pictures are decoded, easily preventing the one-second rule from being broken.
The above system stream creating apparatus may further comprise: a picture number judging unit operable to judge, when the stream data transfer unit stores a next piece of picture data into the pack, whether a total number of pieces of picture data in the pack has reached a predetermined number; and a transfer control unit operable to, when the picture number judging unit has judged positively, cause the stream data transfer unit to stop storing the next piece of picture data and store dummy data into the pack.
With the above construction, the number of stored pictures is limited in units of packs. This easily prevents the one-second rule from being broken due to over-storage of pictures in the video buffer of the decoder.
In the above system stream creating apparatus, the transfer control unit may cause the stream data transfer unit to store the next piece of picture data into another pack.
With the above construction, a picture not having been stored in a pack due to a limit to the number of stored pictures can be stored in a newly created pack.
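The per-pack picture limit can be sketched as follows. The sketch assumes that dummy data is represented by 0x00 bytes and that a piece of a picture continued from a previous pack counts as one picture in the pack; the function name and parameters are hypothetical.

```python
def pack_pictures(pictures, payload_size, max_pictures_per_pack):
    """Store pictures into fixed-length payloads with a per-pack picture limit."""
    packs = []
    current = bytearray()
    count = 0   # number of pictures (or picture pieces) in the current pack

    def flush():
        nonlocal current, count
        if current:
            # Fill the remaining space with dummy data (0x00 assumed here).
            current.extend(b"\x00" * (payload_size - len(current)))
            packs.append(bytes(current))
        current = bytearray()
        count = 0

    for picture in pictures:
        pos = 0
        while pos < len(picture):
            # Picture number judging: close the pack when the limit is
            # reached (or the payload is full) and continue in another pack.
            if count >= max_pictures_per_pack or len(current) == payload_size:
                flush()
            room = payload_size - len(current)
            chunk = picture[pos:pos + room]
            current.extend(chunk)
            pos += len(chunk)
            count += 1
    flush()
    return packs
```

With a payload of 10 bytes and a limit of two pictures, three 3-byte pictures produce two packs: the third picture is moved to a new pack and the first pack is padded with four dummy bytes.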
The above system stream creating apparatus may further comprise: a video encoding unit operable to generate picture data by compressing a video signal when the picture number judging unit has judged negatively, and generating as many next start codes as correspond to remaining space of the pack as the dummy data when the picture number judging unit has judged positively, wherein the stream data transfer unit stores either the picture data or the next start codes generated by the video encoding unit into the pack.
With the above construction, the length of the packs can be adjusted to a fixed length by using the next start codes which are not treated as meaningful data by the decoder. The next start codes are stored in the video buffer when the packs are decoded. That is to say, the next start codes are generated by the video encoder, not by the system encoder. This enables the video encoder, which mainly generates the picture data, to accurately recognize and manage the occupied amount of the video buffer.
The above object is also fulfilled by a system stream creating apparatus for creating a system stream, the system stream being a sequence of fixed-length packs, the system stream creating apparatus comprising: a video encoding unit operable to generate picture data and when having generated a last piece of picture data of a GOP, generate as many next start codes as correspond to remaining space of a pack which stores the last piece of picture data; and a stream data transfer unit operable to store either the picture data or the next start codes generated by the video encoding unit into a fixed-length pack.
With the above construction, the next start codes are stored in the pack that stores the last picture of a GOP. This easily prevents a picture of a GOP belonging to the next VOBU from being inserted into a pack of the current VOBU by mistake.
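The number of next start codes needed after the last picture of a GOP can be sketched as simple arithmetic. The sketch assumes that the GOP begins at the start of a pack payload and that each next start code occupies one byte (0x00); the function name and the payload size used in the example are hypothetical.

```python
def next_start_codes_after_gop(gop_bytes, payload_size):
    """Number of 0x00 bytes that fill the pack holding the end of the GOP."""
    remainder = gop_bytes % payload_size
    return 0 if remainder == 0 else payload_size - remainder
```

For example, with a hypothetical payload of 2,000 bytes, a 5,000-byte GOP leaves 1,000 bytes of the third pack to be filled, so the next GOP starts in a fresh pack.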
The above object is also fulfilled by a system stream creating apparatus for creating a system stream, the system stream being a sequence of fixed-length packs, each pack storing a piece of either video stream data or audio stream data, the video stream data being a sequence of picture data, the audio stream data being a sequence of audio frames, the system stream creating apparatus comprising: a stream data transfer unit operable to extract either a piece of picture data having a size of a payload from the video stream data or an audio frame from the audio stream data and store the extracted picture data or audio frame into a fixed-length pack; and a transfer control unit operable to control the stream data transfer unit so that a group of audio frames provided through a plurality of channels and having the same presentation time in common are stored in a group of packs which have been generated successively.
The above construction makes it easy to store the audio frames belonging to different channels and having the same PTS in common into the packs that are arranged successively in a system stream.
The above system stream creating apparatus may further comprise: a header data generating unit operable to write a specified time into a header of a pack, the specified time indicating a time when either a piece of picture data or an audio frame included in the pack is to be input to a decoding apparatus, wherein when a difference between a presentation time of the audio frame and the specified time written in the header of the pack is lower than a certain value, the transfer control unit causes the stream data transfer unit to store the audio frame into the pack.
With the above construction, when there are two audio frames belonging to different channels and having the same PTS, one of them is first packed since the difference between PTS and SCR of the audio frame is lower than a predetermined value, and the other audio frame is packed in the next packing since the difference between PTS and SCR of the other audio frame is also lower than a predetermined value without fail. As a result, the two packs storing the two audio frames are arranged successively in the system stream.
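The effect described above can be illustrated with a simplified scheduling loop. It assumes one pack slot per fixed SCR step, audio frames sorted by PTS, and a threshold on the difference between PTS and SCR; all names and numeric values are hypothetical.

```python
def schedule_packs(video_pack_count, audio_frames, scr_step, threshold):
    """Return the order in which packs are generated.

    audio_frames: list of (pts, channel) sorted by pts.
    An audio frame is packed as soon as pts - scr is lower than threshold;
    otherwise a video pack is generated if any video data remains.
    """
    order = []
    pending = list(audio_frames)
    videos_left = video_pack_count
    scr = 0
    while pending or videos_left:
        if pending and pending[0][0] - scr < threshold:
            pts, channel = pending.pop(0)
            order.append(("audio", channel, pts))
        elif videos_left:
            videos_left -= 1
            order.append(("video",))
        # Otherwise nothing is due yet: the slot stays empty.
        scr += scr_step
    return order
```

Because the SCR only increases, once the frame of one channel satisfies the threshold, the frame of the other channel with the same PTS satisfies it in the next slot as well, so the two audio packs are adjacent in the resulting order.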
The above object is also fulfilled by a system stream creating method for creating a system stream, the system stream being a sequence of fixed-length packs, each pack storing a piece of video stream data, the video stream data being a sequence of picture data, the system stream creating method comprising: a stream data transfer step for extracting a piece of picture data having a size of a payload from the video stream data and storing the piece of picture data into a fixed-length pack; a condition judging step for judging, when a specified time, which indicates a time when the piece of picture data stored in a pack is to be input to a video decoder buffer of a decoding apparatus, is written in a header of the pack storing the piece of picture data, whether a difference between (1) a total number of pieces of picture data to be stored in the video decoder buffer up to the specified time and (2) a total number of pieces of picture data to be decoded by the decoding apparatus up to a unit time before the specified time has reached a predetermined value; a specified time writing step for writing the specified time into the pack storing the piece of picture data when it is judged in the condition judging step that the difference has not reached the predetermined value; and a specified time adjusting step for, when it is judged in the condition judging step that the difference has reached the predetermined value, updating the specified time, judging whether a difference between (1) a total number of pieces of picture data to be stored in the video decoder buffer up to the updated specified time and (2) a total number of pieces of picture data to be decoded by the decoding apparatus up to a unit time before the updated specified time has reached the predetermined value, and when the judgement is made negatively, writing the updated specified time into the pack storing the piece of picture data.
With the above construction, the number of pictures stored in the video buffer of the decoder can be limited a certain time before the storage. This makes it possible to easily and surely conform to the one-second rule even if the data is encoded with a variable bit rate or even if data is encoded and recorded in real time.