(1) Field of the Invention
The present invention relates to an encoding/recording device and an encoding/recording method for compressing and encoding video data based on time correlation properties of the video data, multiplexing the video data with audio data, and recording the multiplexed video data and audio data. More specifically, the present invention relates to a technology for enhancing a function to suspend recording by the encoding/recording device.
(2) Description of the Prior Art
In recent years, an increasing amount of information has been digitized. In particular, more and more sound and images are digitized because digital information does not degrade with the passage of time and is relatively easy to process. Hereafter, sound and images in digital format are collectively called “AV (audio-video) data”.
MPEG (Moving Picture Experts Group, including MPEG-2 in this specification) is an international standard used to compress AV data so that it can be recorded efficiently.
MPEG for video data uses a compression method based on time correlation properties between different pictures in addition to a conventionally used compression method based on discrete cosine transform (DCT). The compression method based on the time correlation properties achieves a high compression rate by representing one picture as differential data between the picture and other similar pictures to this picture which are reproduced before and after this picture. However, with this MPEG based on the time correlation properties, presenting order and decoding order for pictures are different, and therefore it is necessary to record, or decode and store a picture that is referred to for encoding another picture, prior to the other picture that refers to this picture. For MPEG, a picture that is referred to for encoding of another picture is called an I picture (intraframe predictively encoded picture) or a P picture (interframe predictively encoded picture). A picture that is encoded referring to another picture (I or P picture) is called a B picture (bi-directionally predictively encoded picture).
Video data is a plurality of sets of still image data per unit time (each set of still image data is hereafter called a video frame), and therefore video data usually contains similar images. As MPEG can provide a higher compression rate for video data containing more images similar to one another, MPEG is suitable for compression of video data.
MPEG can effectively compress data by providing a different compression rate for each image and dynamically assigning encoding bits to the image in accordance with its complexity.
Audio data, on the other hand, has a smaller size than video data, and therefore a different compression method than used for the video data is usually used.
For instance, a DVD recorder that records AV data onto a DVD-RAM according to MPEG allows a user to select whether to compress audio data. When selecting that the audio data should be compressed, the user can further select whether MPEG Audio or Dolby AC3 should be used as a compression method. The DVD recorder then encodes the audio data using the selected compression method. When the user selects that no compression is performed for the audio data, LPCM (linear pulse code modulation) is performed for the audio data. The DVD recorder then encodes and compresses video data according to MPEG, multiplexes the encoded video data and audio data into a piece of MPEG System stream according to MPEG System, and records the piece of MPEG System stream.
With MPEG System, audio data and video data, which have been encoded and compressed, are divided into audio packets and video packets that have predetermined sizes, and time-division multiplexed into MPEG System stream. Hereafter, the terms “audio data” and “video data” are used to represent audio data and video data that have been encoded and compressed. MPEG System stream has a hierarchical structure composed of a pack and a packet, with one pack being composed of one or more packets. For instance, a pack recorded on a DVD-RAM is composed of one packet. For ease of explanation, one pack is assumed to be composed of one packet in this specification, as is the case with a pack recorded on a DVD-RAM.
FIG. 1 shows a construction of a pack and packet generated according to MPEG System.
Each pack is 2 KB in size and contains a pack header 11, a packet header 12, and a payload 13.
The pack header 11 contains an SCR (system clock reference) that shows a time at which the pack should be inputted to a video buffer or an audio buffer in an MPEG decoder.
The packet header 12 contains the following information: a stream ID that identifies the content of the payload 13; a DTS (Decoding Time Stamp) showing a decoding start time; and a PTS (Presentation Time Stamp) showing a presentation time. Note that an audio pack does not contain a DTS since audio data is decoded and presented almost simultaneously.
The payload 13 is audio data or video data.
Audio data is usually divided into audio packets that each contain audio data corresponding to one audio frame, and therefore a large-capacity audio buffer is not required by the MPEG decoder. As for video data, however, video frames have different sizes, and the differences in size between different video frames are very large. For instance, video data corresponding to one video frame may be divided into a plurality of video packets. Accordingly, an MPEG decoder is required to have a video buffer that has at least the same size as a video frame of the largest size. Packs are positioned in MPEG System stream in order of SCR, the earliest SCR first.
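The pack layout and ordering described above can be sketched as follows. This is a minimal illustration only: the names `Pack`, `multiplex`, and `split_frame`, the string stream IDs, and the payload size passed to `split_frame` are assumptions made for the example and are not taken from MPEG System or from this specification.

```python
from dataclasses import dataclass
from typing import List, Optional

VIDEO, AUDIO = "video", "audio"  # hypothetical stand-ins for real stream IDs


@dataclass
class Pack:
    scr: int            # time at which the pack enters the decoder buffer
    stream_id: str      # identifies the payload as video or audio
    pts: int            # presentation time
    dts: Optional[int]  # decoding start time; None for audio packs
    payload: bytes


def split_frame(frame: bytes, payload_size: int) -> List[bytes]:
    """Divide one encoded frame into payloads of at most payload_size bytes."""
    return [frame[i:i + payload_size] for i in range(0, len(frame), payload_size)]


def multiplex(packs: List[Pack]) -> List[Pack]:
    """Position packs in the stream in order of SCR, earliest first."""
    return sorted(packs, key=lambda p: p.scr)
```

A large video frame therefore yields several video packs, while an audio frame fits in one pack that carries no DTS.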
FIG. 2 is a diagram showing a standard decoder for MPEG System stream.
This MPEG decoder comprises the following elements: an STC (system time clock) 21 that generates a system time based on which the MPEG decoder operates; a demultiplexer 22 that separates system stream into audio packets and video packets based on a stream ID of each packet; a video buffer 23 that temporarily buffers video data; a video decoder 24 that decodes video data; a reordering buffer 25 that temporarily stores video data to be referred to by other video data; a switch 26 that is used to adjust output order of video data; an audio buffer 27 that temporarily buffers audio data; and an audio decoder 28 that decodes audio data.
The following describes decoding operations by this MPEG decoder.
A pack is extracted to be inputted to the demultiplexer 22 when a system time generated by the STC 21 agrees with an SCR written in the pack. The demultiplexer 22 then refers to a stream ID of the inputted pack and sends a packet in the pack to either the video buffer 23 or the audio buffer 27 accordingly. The video buffer 23 accumulates payloads of packets sent by the demultiplexer 22 and manages a DTS and a PTS of each packet. The video decoder 24 reads video data that has a DTS equal to a current system time from the video buffer 23. This read video data corresponds to one video frame. The video decoder 24 then decodes the read video data. Following this, video data (i.e., an I picture or a P picture) which is referred to for encoding of other pictures is temporarily buffered by the reordering buffer 25, and selectively outputted in accordance with a PTS via the switch 26. The video decoder 24 decodes video data (i.e., a B picture) that is encoded referring to other pictures, and outputs it immediately. On receiving audio data that is a payload of each audio packet from the demultiplexer 22, the audio buffer 27 buffers it and manages a PTS in the audio packet. The audio decoder 28 reads audio data that has a PTS equal to a current system time from the audio buffer 27. This read audio data corresponds to one audio frame. The audio decoder 28 then decodes the read audio data.
In order to present images without delays, MPEG defines that an MPEG decoder starts decoding video data only after the video buffer 23 has become full. This generates a time lag between a start of accumulation of packets in the video buffer and a start of decoding for video data. This time lag is called “vbv_delay” in MPEG. MPEG also defines a capacity of a video buffer as 224 KB, and data of a size exceeding this capacity is not allowed to be buffered. MPEG further defines that the video buffer cannot buffer the same data for one second or longer.
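The role of the reordering buffer 25 and the switch 26 can be illustrated with a simplified sketch. It assumes, for illustration only, that a held I or P picture is released when the next I or P picture is decoded, instead of comparing each PTS with the system time as the decoder above does. The function name and the picture labels are hypothetical.

```python
from typing import Iterable, List, Optional


def presentation_order(decode_order: Iterable[str]) -> List[str]:
    """Turn a decoding order into a presentation order.

    B pictures are output as soon as they are decoded. An I or P picture is
    held in a one-picture reordering buffer until the next I or P arrives.
    """
    held: Optional[str] = None
    shown: List[str] = []
    for picture in decode_order:
        if picture.startswith("B"):
            shown.append(picture)      # output immediately
        else:
            if held is not None:
                shown.append(held)     # release the previously held I/P picture
            held = picture             # hold the new reference picture
    if held is not None:
        shown.append(held)             # flush the last reference picture
    return shown
```

With the decoding order I0, P3, B1, B2, P6, B4, B5 (digits denote presentation positions), the sketch yields I0, B1, B2, P3, B4, B5, P6.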
To control a video buffer in accordance with MPEG in this way, an MPEG encoder assigns an SCR and a DTS to each pack appropriately when such packs are recorded.
In this way, video data has to be inputted to a video buffer a certain time before the video data is presented, while audio data has to be inputted to an audio buffer only shortly before its presentation. Accordingly, when video data and audio data should be presented simultaneously, the video data is multiplexed prior to the audio data.
When a sequence of images is recorded after recording of another sequence has been completed, the aforementioned time lag “vbv_delay” is generated between the two sequences, which prevents the two sequences from being continuously reproduced. This can happen, for instance, when a program and commercials are broadcast and only the program is recorded without the commercials being recorded, or when different sequences of images, which are not consecutive, are taken by a digital video camera.
When packets corresponding to different sequences, which are not consecutive, are joined together to prevent the vbv_delay from being generated, however, a video buffer in the MPEG decoder can no longer be used in accordance with MPEG standard, and therefore may suffer a breakdown.
FIG. 3A shows transition of a size of video data buffered in a video buffer in an MPEG decoder used when video data for a sequence of consecutive images is reproduced. FIG. 3B shows transition of a size of video data buffered in the video buffer when video packets, which are not consecutive and have been joined together, are reproduced.
In FIG. 3B, the video buffer overflows at time t3 as a result of non-consecutive packets composed of packets corresponding to a period before time t1 in FIG. 3A and packets corresponding to a period after time t2 being joined together by discarding packets corresponding to a period from t1 to t2.
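The overflow of FIG. 3B can be reproduced with a toy occupancy model. The step values below are invented for illustration and are not read from the figures; only the 224 KB capacity comes from the text. Each step gives the amount of data arriving in one period and the amount removed by decoding at the end of that period.

```python
from typing import List, Optional, Tuple

CAPACITY_KB = 224  # video buffer capacity stated in the text

Step = Tuple[int, int]  # (KB arriving in the period, KB removed by decoding)


def first_overflow(steps: List[Step], capacity: int = CAPACITY_KB) -> Optional[int]:
    """Return the index of the first step that overflows the buffer, or None."""
    occupancy = 0
    for index, (arrived, decoded) in enumerate(steps):
        occupancy += arrived
        if occupancy > capacity:
            return index
        occupancy -= decoded
    return None


# Hypothetical consecutive stream: t1 falls after step 2, t2 after step 4.
original = [(150, 0), (50, 20), (30, 20), (10, 120), (10, 60), (100, 20), (100, 30)]

# Packets for the period from t1 to t2 are discarded and the rest are joined.
joined = original[:3] + original[5:]
```

In this sketch `first_overflow(original)` finds no overflow, whereas `first_overflow(joined)` reports one at the first step after the join, because the packets after t2 were scheduled for the low occupancy that existed at t2, not the high occupancy left at t1.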
Joining non-consecutive packets without special considerations being given also causes the following problems.
First, to simultaneously present sound and images, an audio pack is multiplexed into MPEG System stream after a period equal to “vbv_delay” has passed since the corresponding video pack was multiplexed into the MPEG System stream. As a result, if video packets and audio packets corresponding to a certain period in MPEG System stream are discarded, and packets before and after the certain period are joined together, sound corresponding to images immediately before the joined part of the stream is lost while sound corresponding to the discarded images remains.
Secondly, since an audio frame and a video frame have different frame generation cycles, deleting certain audio frames and video frames in units of respective frames results in generating a time lag, the so-called “lip sync (synchronization) lag”, between sound and images for frames that follow the deleted frames. For instance, Dolby Digital AC-3 compresses audio data for a DVD-RAM as audio frames whose frame generation cycle is 32 msec, although a video frame has a frame generation cycle of 33.3667 msec. Accordingly, the lip sync lag will almost certainly occur if certain audio frames and video frames are deleted in units of respective frames.
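The size of this lag can be worked out with a short calculation. The sketch below assumes that the 33.3667 msec figure is the rounded value of 1001/30 msec (a frame rate of 30000/1001 frames per second) and that as many whole audio frames as fit into the deleted video period are deleted. The function name is hypothetical.

```python
from fractions import Fraction

AUDIO_FRAME_MS = Fraction(32)        # AC-3 audio frame duration from the text
VIDEO_FRAME_MS = Fraction(1001, 30)  # assumption: exact value behind 33.3667 msec


def lip_sync_lag_ms(deleted_video_frames: int) -> Fraction:
    """Lag left when whole audio frames are deleted to match deleted video frames."""
    gap = deleted_video_frames * VIDEO_FRAME_MS
    whole_audio_frames = gap // AUDIO_FRAME_MS  # audio can only be cut per frame
    return gap - whole_audio_frames * AUDIO_FRAME_MS
```

Under these assumptions, deleting one video frame leaves a lag of 41/30 msec (about 1.37 msec), and the lag vanishes only when the number of deleted video frames is a multiple of 960.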
Lastly, when two non-consecutive audio frames are joined together after certain audio frames are deleted, the two non-consecutive audio frames often do not have similarities. As a result, noise is generated when these audio frames are inputted to an audio buffer to be reproduced.
Accordingly, it is not appropriate to delete certain packets or frames from MPEG System stream and to join remaining packets or frames together without special considerations being given.
The present invention is made in view of the above problems, and aims to provide an encoding/recording device and an encoding/recording method for consecutively recording different pieces of AV stream, which are not consecutive in terms of time, without causing malfunctions to occur in the above video buffer in the MPEG decoder, and to provide a recording medium storing a program to have a computer perform the encoding/recording method.
The present invention also aims to provide an encoding/recording device, an encoding/recording method, and a recording medium storing a program to have a computer perform the encoding/recording method for recording non-consecutive AV data in a manner that allows the recorded non-consecutive AV data to be reproduced without troubles relating to sound, such as unnecessary noise and a time lag between images and sound like “lip sync” lag, being involved.
In order to achieve the above objects, the present invention relates to an encoding/recording device that receives and encodes a video signal and an audio signal, multiplexes the encoded video signal and audio signal to produce a system stream, and records the system stream on a recording medium. The encoding/recording device includes: a video data generating unit operable to (a) estimate an amount of data, which occupies a buffer in a decoder when the decoder decodes the encoded video signal, (b) store how the amount of data changes over time as buffer information, (c) receive a video signal, (d) encode the received video signal to generate video data in a manner that prevents the decoder from overflowing and underflowing, and (e) update the buffer information whenever video data is generated; an audio data generating unit operable to receive an audio signal, and encode each part of the received audio signal to generate audio data, each part having a predetermined size; a multiplexing/recording unit operable to multiplex the generated video data and audio data, and record the multiplexed video data and audio data onto the recording medium; a pause controlling unit operable to specify a pause timing on receiving a pause instruction that suspends recording by the encoding/recording device from a user during the recording, and have the video data generating unit, the audio data generating unit, and the multiplexing/recording unit suspend operations; and a pause release controlling unit operable to specify a pause release timing on receiving a pause release instruction from the user, and have the video data generating unit, the audio data generating unit, and the multiplexing/recording unit resume the operations. Under control of the pause controlling unit, the video data generating unit suspends reception of the video signal with the specified pause timing without resetting the stored buffer information.
Under control of the pause release controlling unit, the video data generating unit resumes the reception of the video signal with the specified pause release timing and resumes encoding the received video signal based on the stored buffer information. Under control of the pause controlling unit, the audio data generating unit suspends reception of the audio signal with the specified pause timing. Under control of the pause release controlling unit, the audio data generating unit resumes receiving the audio signal with the specified pause release timing, and resumes encoding the received audio signal. The multiplexing/recording unit suspends multiplexing and recording with a predetermined timing that is equal to or later than the specified pause timing under control of the pause controlling unit, and resumes the multiplexing and the recording with a predetermined timing that is equal to or later than the specified pause release timing under control of the pause release controlling unit.
When the user records non-consecutive pieces of AV stream, using the above suspension function, the recorded pieces of AV stream can be consecutively and “seamlessly” reproduced with no intervals being generated between the pieces of AV stream. In addition, the video buffer in the MPEG decoder will not be broken even when AV data, which has been recorded after the recording pause, is reproduced.
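The central idea, that the stored buffer information survives a pause, can be sketched with a toy model. The class name, the per-frame fill amount, and the rule used to bound each frame size are assumptions made for illustration; the specification does not prescribe this rate control. The model assumes that a frame is removed from the decoder buffer first and that a fixed amount of data then arrives.

```python
class VideoEncoderModel:
    """Toy encoder that tracks an estimate of the decoder's video buffer."""

    def __init__(self, capacity: int, fill_per_frame: int, occupancy: int) -> None:
        # Assumes 0 <= occupancy <= capacity and fill_per_frame <= capacity.
        self.capacity = capacity
        self.fill_per_frame = fill_per_frame
        self.occupancy = occupancy  # the "buffer information"
        self.paused = False

    def encode_frame(self, wanted_size: int) -> int:
        """Choose a frame size that keeps the decoder buffer within bounds."""
        if self.paused:
            raise RuntimeError("no video signal is received while paused")
        # Too small a frame would let incoming data overflow the buffer.
        smallest = max(0, self.occupancy + self.fill_per_frame - self.capacity)
        # Too large a frame would not be fully buffered yet (underflow).
        largest = self.occupancy
        size = min(max(wanted_size, smallest), largest)
        self.occupancy += self.fill_per_frame - size  # update buffer information
        return size

    def pause(self) -> None:
        self.paused = True   # occupancy is deliberately left untouched

    def resume(self) -> None:
        self.paused = False  # encoding continues from the stored occupancy
```

Because `pause` and `resume` do not touch `occupancy`, the first frame encoded after the pause is sized against the same buffer state that the decoder will have when it reaches the joined part of the stream.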
Here, the video data generating unit may encode the received video signal according to MPEG (Moving Picture Experts Group).
For this construction, the video data generating unit can estimate an amount of data that occupies a buffer in an MPEG decoder and encode the received video signal according to MPEG in a manner that prevents the buffer from underflowing and overflowing.
Here, the encoding/recording device may further include a timer that generates a reference time at least when the encoding/recording device operates and suspends recording. Every first predetermined period, the video data generating unit receives a video signal that corresponds to one video frame, each first predetermined period being based on a reference time generated by the timer. The pause timing specified by the pause controlling unit is synchronous to a boundary of two first predetermined periods, and the pause release timing specified by the pause release controlling unit is synchronous to a boundary of two first predetermined periods.
For this construction, input of the video signal and the audio signal is suspended in synchronization with a timing with which a boundary of two video frames appears. As a result, recording can be suspended for each video frame, and no time lags are generated between reproduced images and sound which were recorded after the suspension was released.
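Aligning an instruction to a video frame boundary amounts to rounding a time up to the next multiple of the frame period. The sketch below assumes a 90 kHz reference clock in which one video frame of 33.3667 msec lasts 3003 ticks; the clock rate, the constant, and the function name are illustrative assumptions.

```python
FRAME_TICKS = 3003  # assumption: one 33.3667 msec frame counted at 90 kHz


def align_to_frame_boundary(request_tick: int, frame_ticks: int = FRAME_TICKS) -> int:
    """Return the first frame boundary at or after the instruction time."""
    return -(-request_tick // frame_ticks) * frame_ticks  # integer ceiling
```

When both the pause timing and the pause release timing are aligned this way, the suspended interval is always a whole number of video frames.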
Here, under control of the pause controlling unit, the multiplexing/recording unit may suspend the multiplexing and the recording after a second predetermined period has passed since the specified pause timing, and may hold video data and audio data which have not been multiplexed and recorded. Under control of the pause release controlling unit, the multiplexing/recording unit may resume the multiplexing and the recording by multiplexing and recording the held video data and audio data after the second predetermined period has passed since the specified pause release timing.
For this construction, the multiplexing/recording unit suspends the multiplexing and the recording after the second predetermined period has passed since the specified pause timing, and holds video data and audio data that have not been multiplexed and recorded at this point. As a result, the present encoding/recording device can correctly record AV data even after the suspension is released.
Here, after a third predetermined period has passed since the specified pause timing, the video data generating unit may complete encoding for the video signal which has been received before the specified pause timing. The second predetermined period for the multiplexing/recording unit may be a sum of the first predetermined period and the third predetermined period.
For this construction, the multiplexing/recording unit suspends the recording and the multiplexing after a time, which has the same duration as a frame generation cycle for video data, has passed since generation of video data was completed, and resumes the recording and the multiplexing after a time having the same duration as the frame generation cycle has passed since when generation of video data was resumed. As a result, it becomes possible to make a period for which the multiplexing/recording unit suspends the multiplexing and the recording approximately equal to a period for which reception of the audio signal and the video signal is suspended.
Here, the encoding/recording device may further include a first stop controlling unit operable to specify a stop timing on receiving a stop instruction from the user while the encoding/recording device operates, and have the video data generating unit, the audio data generating unit, and the multiplexing/recording unit stop operations. Under control of the first stop controlling unit, the video data generating unit may stop receiving the video signal with the specified stop timing, and encode the video signal received before the stop timing to generate video data. Under control of the first stop controlling unit, the audio data generating unit may stop receiving the audio signal with the specified stop timing, encode each part of the predetermined size contained in the received audio signal to generate audio data, and abandon, if the received audio signal contains a subpart that is smaller than the predetermined size, the subpart. Under control of the first stop controlling unit, the multiplexing/recording unit may multiplex and record all the audio data and video data that have been generated.
This construction allows the present encoding/recording device to completely stop its operations.
Here, the encoding/recording device may further include a second stop controlling unit operable to specify a stop timing on receiving a stop instruction from the user while the encoding/recording device suspends recording in response to a recording pause instruction, and to have the video data generating unit, the audio data generating unit, and the multiplexing/recording unit stop operations. Under control of the second stop controlling unit, the video data generating unit may encode the video signal received before the specified pause timing to generate video data. Under control of the second stop controlling unit, the audio data generating unit may encode each part of the audio signal received before the specified pause timing to generate audio data, and abandon, if the received audio signal contains a subpart, the subpart. Under control of the second stop controlling unit, the multiplexing/recording unit may multiplex and record the held video data and audio data, and the generated audio data and video data.
This construction allows the present encoding/recording device to properly stop after the suspension.
Here, the audio data generating unit may contain: an audio signal sampling unit operable to receive the audio signal, sample the received audio signal to generate sets of sampled data that each correspond to one audio sampling cycle, each set of sampled data being generated every fourth predetermined period that is different from the first predetermined period, and suspend reception of the audio signal with the specified pause timing; an encoding unit operable to receive a set of sampled data from the audio signal sampling unit every fourth predetermined period, and encode the received set of sampled data to generate audio data corresponding to one audio sampling cycle; a fractional data holding unit operable to hold a set of sampled data which corresponds to a time shorter than the fourth predetermined period when the audio signal sampling unit suspends the reception of the audio signal. The audio signal sampling unit may resume receiving the audio signal with the specified pause release timing, sample the received audio signal to generate a set of sampled data, join the generated set of sampled data and the held set of sampled data together to produce a set of sampled data corresponding to one audio sampling cycle, output the produced set of sampled data, and thereafter output a set of sampled data corresponding to one audio sampling cycle every fourth predetermined period.
For this construction, audio data smaller than a predetermined size can be held, so that no time lags occur between images and sound recorded after the suspension is released.
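The fractional data holding unit can be sketched as a framer that keeps the incomplete remainder across a pause and joins it with the samples received after the release. The class and method names are hypothetical, and samples are modeled as plain integers for illustration.

```python
from typing import List


class AudioFramer:
    """Groups samples into frames and holds any fraction across a pause."""

    def __init__(self, frame_size: int) -> None:
        self.frame_size = frame_size  # samples per audio frame
        self.held: List[int] = []     # fractional data shorter than one frame

    def feed(self, samples: List[int]) -> List[List[int]]:
        """Join held and new samples, return complete frames, hold the rest."""
        data = self.held + samples
        cut = len(data) - len(data) % self.frame_size
        self.held = data[cut:]
        return [data[i:i + self.frame_size] for i in range(0, cut, self.frame_size)]

    def stop(self) -> None:
        """On a stop, as opposed to a pause, the subpart is abandoned."""
        self.held = []
```

With a frame size of 4, feeding six samples before the pause produces one frame and holds two samples; feeding three more after the release completes a second frame from the two held samples and the first two new ones.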
Here, under control of the pause controlling unit, the multiplexing/recording unit may suspend the multiplexing and the recording after having multiplexed and recorded all the video data generated by the video data generating unit.
With this construction, the present encoding/recording device suspends recording only after having recorded all the generated video data. Accordingly, video data can be correctly multiplexed and recorded after the suspension is released.
Here, when suspending the multiplexing and the recording, the multiplexing/recording unit may hold audio data that has not been multiplexed and recorded. Under control of the pause release controlling unit, the multiplexing/recording unit may resume the multiplexing and the recording by starting multiplexing and recording (a) the held audio data and (b) video data generated after the specified pause release timing.
For this construction, audio data, to which a later multiplexing order is applied, is held when the recording and the multiplexing are suspended. When the suspension is released, the held audio data is multiplexed after video data, which is generated after the suspension is released, is multiplexed. As a result, when images and sound recorded after the suspension are reproduced, a time lag will not be generated between the images and sound.
Here, the encoding/recording device may further include a first stop controlling unit operable to specify a stop timing on receiving a stop instruction from the user while the encoding/recording device operates, and to have the video data generating unit, the audio data generating unit, and the multiplexing/recording unit stop operations. Under control of the first stop controlling unit, the video data generating unit may stop receiving the video signal with the specified stop timing, and encode the video signal received before the stop timing to generate video data. Under control of the first stop controlling unit, the audio data generating unit may stop receiving the audio signal with the specified stop timing, encode each part of the predetermined size contained in the received audio signal to generate audio data, and abandon, if the received audio signal contains a subpart that is smaller than the predetermined size, the subpart. Under control of the first stop controlling unit, the multiplexing/recording unit may multiplex and record all the audio data and video data that have been generated.
For this construction, the encoding/recording device can stop after a recording operation.
Here, the multiplexing/recording unit may further contain a second stop controlling unit operable to specify a stop timing on receiving a stop instruction from the user while the encoding/recording device suspends recording, and have the video data generating unit, the audio data generating unit, and the multiplexing/recording unit stop operations. The audio data generating unit may abandon a subpart if the received audio signal contains the subpart. Under control of the second stop controlling unit, the multiplexing/recording unit may multiplex and record all the held audio data.
For this construction, the encoding/recording device can correctly stop after the recording pause.
Here, the audio data generating unit may contain: an audio signal sampling unit operable to receive the audio signal, sample the received audio signal to generate sets of sampled data that each correspond to one audio sampling cycle, each set of sampled data being generated every fourth predetermined period that is different from the first predetermined period, and suspend reception of the audio signal with the specified pause timing; an encoding unit operable to receive a set of sampled data from the audio signal sampling unit every fourth predetermined period, and encode the received set of sampled data to generate audio data corresponding to one audio sampling cycle; a fractional data holding unit operable to hold a set of sampled data which corresponds to a time shorter than the fourth predetermined period when the audio signal sampling unit suspends the reception of the audio signal. The audio signal sampling unit may resume receiving the audio signal with the specified pause release timing, sample the received audio signal to generate a set of sampled data, join the generated set of sampled data and the held set of sampled data together to produce a set of sampled data corresponding to one audio sampling cycle, output the produced set of sampled data, and thereafter output a set of sampled data corresponding to one audio sampling cycle every fourth predetermined period.
For this construction, audio data smaller than a predetermined size is held during the recording pause, so that no time lag occurs between images and sound recorded after the suspension is released.
Here, the audio data generating unit may contain a muting unit operable to lower a sound intensity of the audio signal before reception of the audio signal is suspended, and to restore the sound intensity before the reception of the audio signal is resumed.
With this construction, unnecessary noise is not generated when a recorded part corresponding to the recording pause is reproduced.
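One way to lower the sound intensity before reception is suspended is a gain ramp over the last samples. The linear ramp, its length, and the function name below are assumptions chosen for illustration; the specification does not state how the muting unit shapes the intensity, and the timing of the restoration is not modeled here.

```python
from typing import List


def fade_out(samples: List[float], ramp_len: int) -> List[float]:
    """Scale the last ramp_len samples linearly down to silence."""
    ramp_len = min(ramp_len, len(samples))
    start = len(samples) - ramp_len
    faded = list(samples[:start])  # samples before the ramp are unchanged
    for i in range(ramp_len):
        gain = (ramp_len - 1 - i) / ramp_len  # ends at exactly 0.0
        faded.append(samples[start + i] * gain)
    return faded
```

Because the last sample before the pause is silent, the audio frame joined across the pause does not contain an abrupt step at the junction on the pre-pause side.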
Here, the video signal received by the video data generating unit may correspond to a plurality of video frames. The video data generating unit may encode the received video signal using time correlation properties between some of the plurality of video frames, and generate a plurality of GOPs (groups of pictures). Each GOP may be video data corresponding to a plurality of video frames that are encoded with reference only to video frames in the same GOP. When suspending the reception of the video signal, the video data generating unit may complete generation of a GOP.
For this construction, generation of a GOP is completed immediately before the recording is suspended. Accordingly, random access to the recorded MPEG System stream can be performed, such as when a reproduction in fast-forward mode or in fast-rewind mode, or a reproduction from a midpoint of the MPEG System stream is performed.
Here, the multiplexing/recording unit may generate system stream by multiplexing audio data and video data, and the system stream may be composed of a plurality of video object units (VOBUs). Each VOBU may be composed of at least one GOP and audio data related to the at least one GOP, and have a representation time shorter than a predetermined time. When suspending the recording, the multiplexing/recording unit may complete generation of a VOBU using video data, which has been generated by the video data generating unit using the video signal received before the specified pause timing, and audio data corresponding to a decoding order that is earlier than the decoding order corresponding to the used video data. The decoding order may be an order in which the decoder decodes the video data and the audio data. When resuming the recording, the multiplexing/recording unit may make video data, which has been generated by the video data generating unit immediately after the specified pause release timing, video data placed first in a VOBU that follows the VOBU generated when the recording is suspended.
For this construction, generation of a VOBU (video object unit) can be completed immediately before the recording is suspended, which enhances ease of operations of the present encoding/recording device when AV data is reproduced.
Here, the multiplexing/recording unit may generate a video object (VOB) composed of a plurality of VOBUs, and the VOB may contain a recording region into which a recording start time for the VOB should be written. When suspending the recording, the multiplexing/recording unit may complete generation of a VOB. When resuming the recording, the multiplexing/recording unit may newly generate a VOBU and a VOB, place the generated VOBU at a start of the generated VOB, and write a time at which the recording is resumed as a recording start time into a recording region in the generated VOB.
With this construction, generation of a VOB (video object) can be completed immediately before the recording is suspended, so that a suitable recording time can be written into a VOB recorded after the recording pause is released.
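The VOB and VOBU handling at a pause can be sketched with simple containers. The class names, the use of one GOP per VOBU, and the representation of times as plain numbers are simplifications assumed for illustration; they are not the DVD-RAM data layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Vobu:
    gops: List[str]


@dataclass
class Vob:
    start_time: float  # recording start time written into the VOB
    vobus: List[Vobu] = field(default_factory=list)


class Recorder:
    def __init__(self, start_time: float) -> None:
        self.vobs: List[Vob] = [Vob(start_time)]
        self.paused = False

    def add_gop(self, gop: str) -> None:
        """Record one GOP as its own VOBU (simplification)."""
        if self.paused:
            raise RuntimeError("recording is suspended")
        self.vobs[-1].vobus.append(Vobu([gop]))

    def pause(self) -> None:
        self.paused = True  # the current VOBU and VOB are now complete

    def resume(self, time: float) -> None:
        self.vobs.append(Vob(time))  # new VOB stamped with the resume time
        self.paused = False
```

After a pause and a release, the first GOP recorded lands in the first VOBU of a new VOB whose recording start time is the time at which recording was resumed.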
Here, when suspending the reception of the video signal, the video data generating unit may attach a sequence end code to an end of video data which has been generated from the received video signal. The sequence end code may indicate an end of video stream that is video data corresponding to a plurality of video frames.
For this construction, the video data generating unit inserts a sequence end code into an end of video data that is generated immediately before the recording is suspended. As a result, AV data can be recorded in conformity with the DVD-RAM standard.
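Attaching the sequence end code can be sketched as appending a fixed four-byte start code to the video data generated before the pause. The value 0x000001B7 is the sequence_end_code of MPEG video; the function name and the treatment of video data as a byte string are assumptions made for the example.

```python
SEQUENCE_END_CODE = b"\x00\x00\x01\xb7"  # MPEG video sequence_end_code


def close_video_sequence(video_data: bytes) -> bytes:
    """Append a sequence end code unless the data already ends with one."""
    if video_data.endswith(SEQUENCE_END_CODE):
        return video_data
    return video_data + SEQUENCE_END_CODE
```

Calling the function twice on the same data leaves a single end code, so it is safe to apply at every pause.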
Here, when suspending the recording, the multiplexing/recording unit may attach a sequence end code to an end of video data, which has been generated by the video data generating unit using the video signal received before the specified pause timing. The sequence end code may indicate an end of video stream that is video data corresponding to a plurality of video frames.
With this construction, the multiplexing/recording unit inserts a sequence end code into an end of video data that is recorded immediately before the recording is suspended. This allows AV data to be recorded in conformity with the DVD-RAM standard.