The present invention relates to a recording medium to and from which digital data may be written and read, to a recording unit recording digital data on it, and to a playback unit playing back digital data from it. Particularly, the present invention relates to an optical disk on which multimedia data, such as video data, still image data, and audio data, may be recorded and to a recording unit and playback unit.
A phase-change disk, the DVD-RAM (Digital Versatile Disc-RAM), with a capacity of several GB (gigabytes) has been introduced into the field of writable optical disks, which previously had a maximum capacity of about 650 MB (megabytes). As MPEG (MPEG2), the standard for coding digital AV (audio and video) data, has come into practical use, the DVD-RAM is now expected to be used not only with computers but also as a recording and playback medium in the AV field. That is, it is predicted that DVD-RAMs will replace magnetic tapes, which have been used as the standard AV recording media.
(Description of DVD-RAM) Recently, as the recording density of a rewritable optical disk increases, not only computer data or audio data but also image data may be recorded on the optical disk. For example, on the signal-recording surface of an optical disk, the guide grooves in the form of projection and ditch have been provided conventionally.
Formerly, signals were recorded only in the projection or the ditch positions. The introduction of the land-groove recording method has made it possible for signals to be recorded in both the projection and the ditch positions. This method has achieved about twice the recording density of earlier methods (For example, Japanese Patent Laid-Open Application No. JP-A-8-7282).
The CLV (Constant Linear Velocity) method efficiently increases the recording density. A method such as the zone CLV method, which makes the CLV method easier to control and implement, was also devised and put into practical use (For example, Japanese Patent Laid-Open Application No. JP-A-7-93873).
One of the major problems with an optical disk of ever-increasing capacity is how to record AV data, including image data, and how to implement performance and new functions far beyond those of conventional AV equipment.
With the advent of this large-capacity, rewritable optical disk, it is expected that tapes which have been used in most cases for AV data recording and playback will be replaced by optical disks. A shift in recording media from tapes to disks will have various influences on the function and performance of AV equipment.
One of the most prominent advantages of the shift to disks is a great improvement in random access performance. An attempt to make a random access to data on a tape involves rewinding one volume of tape, which usually takes on the order of minutes. This is much longer than the seek time of optical disk media (several tens of milliseconds or less). Thus, the tape cannot be used practically as a random access device.
This random access performance of an optical disk makes possible the distributed recording of AV data which would be impossible on conventional tapes.
FIG. 1 is a block diagram showing the drive of a DVD recorder. In the figure, reference numeral 11 is an optical pickup which reads data from the disk, 12 is an ECC (error correcting code) processor, 13 is a track buffer, 14 is a switch switching input/output of the track buffer, 15 is an encoder, 16 is a decoder, and 17 is the enlarged view of a recording area on the disk.
As shown in 17, the minimum unit of data recorded on the DVD-RAM disk is 1 sector=2 KB. The ECC processor 12 performs error correction processing on 16 sectors=1 ECC block.
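As a rough illustration of the sizes just stated, the following sketch maps a sector number to the ECC block that contains it. The function names are hypothetical, and the sketch assumes that ECC blocks are aligned on 16-sector boundaries counted from sector 0, which the text does not state. Only the 2 KB sector (taken as 2048 bytes) and the 16-sector ECC block come from the description.

```python
SECTOR_SIZE = 2048           # bytes per sector (1 sector = 2 KB)
SECTORS_PER_ECC_BLOCK = 16   # error correction is performed over 16 sectors

# Size of one ECC block in bytes: 16 x 2048 = 32768
ECC_BLOCK_SIZE = SECTOR_SIZE * SECTORS_PER_ECC_BLOCK


def ecc_block_of(sector: int) -> int:
    """Return the index of the ECC block containing the given sector.

    Assumes blocks are aligned on 16-sector boundaries from sector 0.
    """
    return sector // SECTORS_PER_ECC_BLOCK


def ecc_block_sector_range(sector: int) -> range:
    """Return all sectors that share an ECC block with the given sector."""
    first = ecc_block_of(sector) * SECTORS_PER_ECC_BLOCK
    return range(first, first + SECTORS_PER_ECC_BLOCK)
```

Under these assumptions, touching a single 2 KB sector involves the whole 32 KB ECC block around it, since error correction is performed per block.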
The track buffer 13 is used to record AV data at variable bit rates so that the AV data is recorded efficiently on the DVD-RAM disk. It absorbs the difference between the DVD-RAM read/write rate (Va in the figure), which is constant, and the AV data bit rate (Vb in the figure), which varies according to the complexity of the contents (such as the image data of video).
More efficient use of this track buffer 13 allows AV data to be distributed on the disk. This is described below using FIGS. 2A and 2B.
FIG. 2A is a diagram showing the address space of the disk. As shown in FIG. 2A, when AV data is recorded in separate contiguous areas [a1, a2] and [a3, a4], supplying data, stored in the track buffer, to the decoder during the seek operation from a2 to a3 allows AV data to be played back continuously. FIG. 2B shows how data is accumulated into, and supplied from, the track buffer.
AV data, which is read starting from a1, is input into, and output from, the track buffer beginning at time t1. The amount of data corresponding to the difference in rate (Va − Vb) between the track buffer input rate (Va) and the track buffer output rate (Vb) is accumulated in the track buffer. This condition continues until data at a2 is read (time t2). The amount of data B(t2), accumulated up to this time, is used as data that is supplied to the decoder until time t3, at which reading starts at a3.
In other words, if the amount of data accumulated while reading [a1, a2] before the seek operation is large enough to supply the decoder for the duration of the seek, AV data may be supplied continuously even if the seek operation happens.
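The condition described above can be sketched numerically. The following is a minimal model, not the drive's actual control logic: it assumes Va and Vb are both constant over the interval (the text notes that Vb actually varies), ignores the finite capacity of the track buffer, and uses hypothetical function names. With these assumptions, B(t2) = (Va − Vb) × (t2 − t1), and continuous playback requires B(t2) ≥ Vb × (t3 − t2).

```python
def buffered_amount(va: float, vb: float, read_time: float) -> float:
    """Data accumulated in the track buffer after reading for read_time.

    va: constant disk read rate (track buffer input rate)
    vb: decoder consumption rate (track buffer output rate), assumed constant
    read_time: t2 - t1, the time spent reading the contiguous area
    """
    return (va - vb) * read_time


def can_play_through_seek(va: float, vb: float,
                          read_time: float, seek_time: float) -> bool:
    """True if the buffered data covers decoder demand during the seek."""
    return buffered_amount(va, vb, read_time) >= vb * seek_time


def min_contiguous_amount(va: float, vb: float, seek_time: float) -> float:
    """Smallest contiguously recorded amount that covers one seek.

    Solves (va - vb) * read_time = vb * seek_time for read_time and
    converts that read time into an amount of data read at rate va.
    """
    if va <= vb:
        raise ValueError("read rate must exceed the AV bit rate")
    min_read_time = vb * seek_time / (va - vb)
    return va * min_read_time
```

For example, with the illustrative values va = 10 and vb = 5 (in any consistent unit per second) and a 1-second seek, `min_contiguous_amount(10, 5, 1)` gives 10.0 units, that is, one second of reading.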
In the above example, data is read, or played back, from a DVD-RAM. The example also applies when data is written, or recorded, onto the DVD-RAM.
As described above, if at least a sufficient amount of data is recorded contiguously on the DVD-RAM, continuous playback/recording is possible even if AV data is distributed on the disk.
(Description of MPEG)
Next, AV data is described.
As described above, AV data recorded on a DVD-RAM uses the international standard called MPEG (ISO/IEC13818).
Even a DVD-RAM, with its large capacity of several GB, is not large enough to store uncompressed digital AV data. This means that AV data must be compressed before being recorded. One of the popular methods for compressing AV data is MPEG (ISO/IEC13818). Recent advances in LSI technology have made it possible to implement an MPEG codec (compression/decompression LSI chip), allowing the DVD recorder to compress and decompress data with MPEG.
For highly efficient data compression, MPEG has the following two major characteristics:
The first characteristic is that, in addition to the conventional compression method using the spatial frequency characteristics, MPEG uses a compression method using inter-frame time correlation characteristics for compressing video data. To compress data, MPEG classifies frames (also called pictures in MPEG) into three: I picture (intra-frame coded picture), P picture (picture using intra-frame coding and a reference to the preceding picture), and B picture (picture using intra-frame coding and a reference to the preceding and following pictures).
FIG. 3 shows the relation among I, P, and B pictures. As shown in FIG. 3, the P picture refers to the immediately preceding I or P picture, while the B picture refers to the immediately preceding and following I or P pictures. Also, because the B picture refers to the following I or P picture, the display order of pictures does not always match the order of the compressed data (coding order), as shown in FIG. 3.
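The difference between display order and coding order can be illustrated with a small sketch. It assumes a simple rule that follows from the references described above: a B picture cannot be coded until the following I or P picture it refers to has been coded. The picture labels and the function name are hypothetical, the sequence is not taken from FIG. 3, and trailing B pictures with no following reference are simply appended, which is an assumption.

```python
from typing import List


def coding_order(display: List[str]) -> List[str]:
    """Rearrange pictures from display order into coding order.

    Each label starts with its picture type ("I", "P", or "B").
    B pictures are held back until the next I or P picture, which they
    reference, has been emitted.
    """
    out: List[str] = []
    pending_b: List[str] = []
    for pic in display:
        if pic.startswith("B"):
            pending_b.append(pic)      # wait for the following reference
        else:
            out.append(pic)            # emit the I or P picture first
            out.extend(pending_b)      # then the B pictures that needed it
            pending_b.clear()
    out.extend(pending_b)              # assumption: flush leftover B pictures
    return out


display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
print(coding_order(display))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

The digit in each label is the display position, so the output shows P3 being coded ahead of B1 and B2 even though it is displayed after them.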
The second characteristic is that MPEG allocates an amount of coding dynamically to each picture depending upon the complexity of the image. The MPEG decoder has an input buffer and accumulates data in this decoder buffer, making it possible to allocate a large amount of code to a complex image which is difficult to compress.
Audio data used on a DVD-RAM may be selected from the following three: MPEG audio data and Dolby digital data (AC-3), which are compressed, and LPCM data, which is not compressed. The bit rate of Dolby digital data and LPCM data is fixed. The size of MPEG audio data may be selected from several sizes in units of audio frames, which are not as large as those of video streams.
This AV data is multiplexed into one stream using a method called the MPEG system. FIG. 4 is a diagram showing the configuration of the MPEG system. The reference numeral 41 is a pack header, 42 is a packet header, and 43 is a payload. The MPEG system has a hierarchical structure consisting of packs and packets. A packet is composed of the packet header 42 and the payload 43. AV data, divided into pieces of an appropriate size, is stored in the payload 43 beginning at its head. The packet header 42 contains information on the AV data stored in the payload 43; it contains the ID (stream ID) identifying the stored data as well as the decoding time DTS (Decoding Time Stamp) and the display time PTS (Presentation Time Stamp) of the data included in the payload, each expressed with 90 kHz precision (for data such as audio data, which is decoded and displayed almost at the same time, the DTS is omitted). A pack is a unit composed of a plurality of packets. On the DVD-RAM, however, each pack contains only one packet, so a pack is composed of the pack header 41 and a packet (packet header 42 and payload 43). The pack header records the SCR (System Clock Reference), which is the time, with 27 MHz precision, at which data in the pack is input into the decoder buffer.
An MPEG system stream like this is recorded on the DVD-RAM, one pack per sector (=2048 bytes).
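The pack and packet hierarchy described above can be modeled roughly as follows. This is a logical sketch only: it does not reproduce the byte layout of the MPEG system stream, the class and function names are hypothetical, and the stream ID values are placeholders chosen for illustration. The clock rates (90 kHz for PTS/DTS, 27 MHz for SCR) and the one-packet-per-pack, one-pack-per-sector arrangement come from the description.

```python
from dataclasses import dataclass
from typing import Optional

PTS_DTS_HZ = 90_000      # precision of PTS and DTS
SCR_HZ = 27_000_000      # precision of SCR
SECTOR_SIZE = 2048       # one pack is recorded per 2048-byte sector


def seconds_to_ticks(seconds: float, clock_hz: int) -> int:
    """Express a time in seconds as a tick count of the given clock."""
    return round(seconds * clock_hz)


@dataclass
class Packet:
    stream_id: int        # identifies the stream the payload belongs to
    pts: int              # presentation time, in 90 kHz ticks
    dts: Optional[int]    # decoding time; None when omitted (e.g. audio)
    payload: bytes        # a piece of the AV data


@dataclass
class Pack:
    scr: int              # input time to the decoder buffer, 27 MHz ticks
    packet: Packet        # on DVD-RAM, one packet per pack


# Illustrative values only: a video packet decoded at 1.0 s, shown at 1.1 s.
video = Pack(
    scr=seconds_to_ticks(0.5, SCR_HZ),
    packet=Packet(stream_id=0xE0,            # placeholder stream ID
                  pts=seconds_to_ticks(1.1, PTS_DTS_HZ),
                  dts=seconds_to_ticks(1.0, PTS_DTS_HZ),
                  payload=b""),
)
```

Keeping the DTS optional mirrors the statement that it is omitted for data that is decoded and presented at almost the same time.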
Next, the decoder decoding the above-described MPEG system stream is described. FIG. 5 shows the decoder model (P-STD) of the MPEG system decoder. The reference numeral 51 is an STC (System Time Clock) measuring the standard time used in the decoder, 52 is a de-multiplexer which decodes, or de-multiplexes, a system stream, 53 is an input buffer of the video decoder, 54 is a video decoder, 55 is a re-order buffer in which I and P pictures are stored temporarily to adjust the difference between the data order and the display order of I pictures and P pictures described above, 56 is a switch adjusting the output order of the I pictures and P pictures stored in the re-order buffer, 57 is an input buffer of the audio decoder, and 58 is an audio decoder.
The system decoder having this configuration processes the above-described MPEG system stream as described below. When the time of the STC 51 matches the SCR described in the pack header, the de-multiplexer 52 receives the pack. The de-multiplexer 52 interprets the stream ID contained in the packet header and transfers the streams of data in the payload to the decoder buffer 53 or 57 for each stream. The de-multiplexer 52 also gets the PTS and DTS from the packet header. When the time of the STC 51 matches the DTS, the video decoder 54 gets picture data from the video buffer 53, decodes it, stores the I and P pictures in the re-order buffer 55, and displays the B pictures. When the picture the video decoder 54 decodes is an I or P picture, the switch 56 is switched to the output terminal of the re-order buffer 55 to output the preceding I or P picture from the re-order buffer 55; when the picture the video decoder 54 decodes is a B picture, the switch 56 is switched to the output terminal of the video decoder 54. Like the video decoder 54, when the time of the STC 51 matches the PTS (there is no DTS for audio data), the audio decoder 58 gets one frame of audio data from the input buffer 57 and decodes it.
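The behavior of the re-order buffer 55 and the switch 56 can be sketched as a small simulation. This is a simplified model under stated assumptions, not the P-STD itself: timing (STC, DTS, PTS) is ignored, the re-order buffer is treated as holding a single I or P picture, the final held picture is flushed at the end of the stream (an assumption not stated in the text), and the names are hypothetical.

```python
from typing import List, Optional


def display_order(coded: List[str]) -> List[str]:
    """Simulate the re-order buffer and output switch of the decoder model.

    Each label starts with its picture type ("I", "P", or "B").
    - B picture: the switch selects the video decoder output, so the
      picture is displayed immediately.
    - I or P picture: the switch selects the re-order buffer, the
      previously held I or P picture is output, and the newly decoded
      picture takes its place in the buffer.
    """
    out: List[str] = []
    held: Optional[str] = None     # content of the re-order buffer
    for pic in coded:
        if pic.startswith("B"):
            out.append(pic)
        else:
            if held is not None:
                out.append(held)
            held = pic
    if held is not None:           # assumption: flush at end of stream
        out.append(held)
    return out


coded = ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]
print(display_order(coded))
# ['I0', 'B1', 'B2', 'P3', 'B4', 'B5', 'P6']
```

The digit in each label is the intended display position, so the output being in ascending order indicates that this simple hold-one-picture rule restores display order for the example sequence.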
Next, the multiplexing method of an MPEG stream is described with reference to FIG. 6. FIG. 6(a) shows video frames, FIG. 6(b) shows the video buffer, FIG. 6(c) shows an MPEG system stream, and FIG. 6(d) shows audio data. The horizontal axis, common to all figures, is the time axis. Data in each figure is drawn based on this time axis. In the figure showing the video buffer status, the vertical axis indicates the buffer occupancy (amount of data accumulated in the video buffer) with the bold line indicating the chronological change in the buffer occupancy. The slope of the bold line corresponds to the bit rate, indicating that data is input into the buffer at a constant rate. A reduction in the buffer occupancy at a regular interval indicates that data is decoded at that time. The intersection of the dotted diagonal line and the time axis indicates the time at which the transfer of video frames to the video buffer is started.
The following describes the operation with complex video data image A as an example. As shown in FIG. 6(b), the data of image A must be transferred to the video buffer at time t1, which is earlier than the decode time, because image A requires a large amount of code (the time from the data input time t1 to the decode time is called vbv_delay). As a result, the AV data is multiplexed in the position of the video pack indicated by the shaded area in FIG. 6(c). On the other hand, audio data, which unlike video data does not require dynamic control of the coding amount, need not be transferred much earlier than the decode time; in most cases, audio data is multiplexed only slightly earlier than the decode time. Therefore, for video data and audio data that are played back at the same time, the video data is multiplexed before the audio data. It should be noted that, for MPEG, all data except still-image data must be output from the buffer to the decoder within one second of being input into the buffer. This means that the maximum difference in multiplexing time between video data and audio data is one second (strictly speaking, the time needed for re-ordering video data may be added to this maximum).
In this example, video data is multiplexed before audio data, but in theory audio data may also be multiplexed before video data. Such data can be created when highly compressed, easy-to-process video data is prepared and the audio data is transferred much earlier. However, because of the limitation of MPEG described above, audio data may be transferred no more than one second early.
(Description of Digital Still Camera)
Next, a digital still camera is described.
Recently, digital still cameras using JPEG (ISO/IEC 10918-1) have become popular. The popularity of digital still cameras is due to the fact that personal computers have rapidly come into wide use recently. Images taken by digital still cameras may be easily captured into personal computers via semiconductor memory, floppy disks, infrared light communication, and so forth. The still images captured into personal computers may be used in presentation software products, word processors, and internet contents.
In addition, digital still cameras capable of capturing sounds have come into use. The capability of recording sounds has given digital still cameras another advantage over conventional film cameras.
FIG. 7 shows the relation between JPEG data recorded by a digital still camera and the directories and files on a PC (personal computer).
As shown in FIG. 7, JPEG data is recorded in one file (with the extension code of "JPG"). When the number of files exceeds a predetermined number and it becomes difficult for the user to manage the files, they are usually organized into a directory structure, each directory including about 100 files as shown in FIG. 7.
However, the number of still images that can be recorded by a digital still camera is limited by the recording capacity of the recording media such as flash memory or floppy disks. A large number of still images cannot be recorded. For example, when still images, each 50 KB in size, are recorded in the 100 MB flash memory, the maximum number of still images that may be recorded at a time is as small as about 2,000 still pictures.
(Description of Digital VCR)
Next, a digital VCR, in particular, a DVC which has rapidly become popular recently, is described.
The introduction of the DVC has implemented new functions not provided on the conventional VCR. One of them is a recording in which video and still images are mixed.
FIG. 8 is a diagram showing how the DVC records video and still images.
As shown in FIG. 8, the DVC allows video and still images to be mixed in a sequential order on tape, allows video and still images to be alternately recorded, or allows still images to be recorded continuously just as they would be in an album.
However, the DVC, which is a tape medium, lacks random accessibility. In addition, it has no management information similar to that used on the computer, making it difficult for the user to play back a particular still image the user wishes.
The introduction of the DVD-RAM opens the possibility of new AV equipment which solves both the problem of the limited number of still images on digital cameras and the problem of random accessibility of the DVC, and which enables the user to process tens of thousands of still images freely.
As described above, the DVD-RAM is expected to become one of the next-generation AV recording media. The present invention solves the following problems which prevent the performance of the DVD-RAM from being maximized. The present invention also enables a DVD recorder to be implemented. The DVD recorder is thought of as the intended and most important application of the rewritable large-capacity optical disk DVD-RAM.
The most serious problem of processing a large amount of still image data on the DVD recorder is that the amount of management information is extremely large.
The still image data management information is described with reference to FIG. 9.
Access to still image data recorded on the disk requires information such as the address and the size of data the user is going to access.
In addition, attaching sound data, as on a digital still camera, requires information not only on the address and the size but also on the playback time of the sound data. Post-recorded audio, which is recorded separately after still image data is recorded, also requires its own audio data management information.
Accessing the 4.7 GB data area one sector at a time (1 sector=2048 bytes) requires 4 bytes for an address, 1 byte for the size of still image data, and 2 bytes for the size of sound data; in addition, sound data requires another 2 bytes for the playback time. The post-recording of sounds doubles the management information required for sound, bringing the total management information to 21 bytes per still image.
If 65000 still images are recorded and 21 bytes of management information is used for each still image, the size of the management information is calculated as:
65000 × 21 bytes = 1365000 bytes
A total of about 1.4 MB of management information is therefore required.
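The arithmetic above can be reproduced with a short sketch. The per-field sizes are those given in the description; the way they are grouped here (an address and size for the still image, and an address, size, and playback time for each of the original and post-recorded sound entries) is an assumed reading that happens to reproduce the stated 21 bytes, not a layout confirmed by the text. All names are hypothetical.

```python
ADDRESS_BYTES = 4        # sector address within the 4.7 GB data area
STILL_SIZE_BYTES = 1     # size of the still image data
SOUND_SIZE_BYTES = 2     # size of the sound data
SOUND_TIME_BYTES = 2     # playback time of the sound data

# Assumed grouping of the fields (not confirmed by the description):
still_entry = ADDRESS_BYTES + STILL_SIZE_BYTES                       # 5 bytes
sound_entry = ADDRESS_BYTES + SOUND_SIZE_BYTES + SOUND_TIME_BYTES    # 8 bytes

# Original sound plus post-recorded sound doubles the sound entry.
per_image = still_entry + 2 * sound_entry                            # 21 bytes

NUM_STILL_IMAGES = 65000
total_bytes = NUM_STILL_IMAGES * per_image

print(per_image)      # 21
print(total_bytes)    # 1365000
```

At roughly 1.4 MB, this table would have to be held in the memory of the system controller, which is the difficulty the following paragraph describes.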
Although 1.4 MB of data is small as compared with the DVD recording capacity, the system controller (corresponds to the CPU of a PC) must always have this data in memory for use in random access. Despite a significant drop in the price of memory, it is unusual for AV equipment to have memory larger than one MB. And, it is impractical for AV equipment to have a battery backup for backing up the memory, larger than one MB, against an emergency.
The present invention provides a recording medium which minimizes the storage area for data management information to allow the recording area to be used efficiently, a recording unit which records data on the recording medium, and a playback unit which plays back data from the recording medium. The recording medium according to the present invention comprises a still image data area (102) in which a plurality of pieces of still image data (VOB) may be recorded and an area (102) in which still image set management information (VOBSI) is recorded, the still image set management information managing the still image data (VOB) in a part or the whole of the still image data area collectively as a still image set (VOBS). The still image set (VOBS) has corresponding still image set management information (VOBSI).