The present invention relates to a disk media, such as a video disk or optical disk, for recording a digital image signal, such as an image signal that has been coded, e.g., converted to high-efficiency-coded data. It also relates to a method of and device for recording a digital image signal as high-efficiency-coded data on such a disk media, and to a method of and device for playing back the image by restoring the high-efficiency-coded data from the disk media.
The present invention also relates to a method of performing fast playback or retrieval from a disk media.
FIG. 40 shows a conventional optical disk recording/playback device described in Japanese Patent Kokai Publication No. 114369/1992. As illustrated, it comprises an A/D converter 12 for converting a video signal, audio signal, or the like into digital information, an information compressing means 13, a frame-sector conversion means 14 for converting compressed information into sector information whose length is equal to an integer multiple of a frame, an encoder 15, and a modulator 16 for conversion into modulated codes so as to reduce inter-code interference on a recording media. A laser drive circuit 17 and a laser output switch 18 serve to modulate a laser beam according to the modulated codes.
An optical head 19 emits the laser light. An actuator 20 serves for tracking of the light beam emitted from the optical head 19. A traverse motor 21 moves the optical head 19. A disk motor 22 rotates a disk 23. Reference numeral 24 denotes a motor drive circuit, 25 denotes a first control circuit, and 26 denotes a second control circuit. A playback amplifier 27 amplifies a playback signal sent from the optical head 19. A demodulator 28 acquires data from a modulated signal that has been recorded. Reference numeral 29 denotes a decoder, and 30 denotes a frame-sector inverse conversion circuit. An information expanding means 31 expands compressed information. A D/A converter 32 converts expanded information into, for example, an analog video signal or audio signal.
FIG. 41 is a simplified illustration of a data structure (layered structure) according to the moving picture experts group (hereinafter MPEG) system that is becoming the standard for transmission and storage of digital moving picture information in compressed form.
In FIG. 41, reference numeral 51 denotes a sequence formed of sequence headers and a plurality of image information blocks, also called GOPs (groups of pictures) 52. Each GOP 52 is formed of a plurality of pictures (screens), i.e., image data for a plurality of frames 53. Each picture (screen) is divided into slices 54, and each of the slices 54 is formed of a plurality of macroblocks 55. Each macroblock 55 is formed of four adjoining blocks 56y of luminance signal (Y), one block 56b of a color difference signal (Cb), and one block 56r of another color difference signal (Cr). The positions of the blocks 56b and 56r of the color difference signals are associated with the positions of the four blocks 56y of the luminance signal.
One block 56y of luminance signal is formed of eight pixels by eight pixels, and forms a minimum coding unit.
Each of the blocks 56y, 56b and 56r is regarded as a unit for information compression based on discrete cosine transform (hereinafter DCT). The macroblock 55 is a minimum unit for motion-compensated prediction. Detection of a motion vector used for the motion-compensated prediction is carried out in macroblock units, i.e., for each macroblock.
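The block and macroblock sizes described above can be made concrete with a short calculation. The following is a minimal sketch only. It assumes that the four adjoining luminance blocks are arranged two by two, so that one macroblock covers 16 pixels by 16 pixels of the picture, and the picture size of 352 by 240 pixels is a hypothetical value chosen for illustration.

```python
BLOCK_SIZE = 8  # one block is eight pixels by eight pixels

# Blocks per macroblock: four luminance blocks and one block for each
# of the two color difference signals.
BLOCKS_PER_MACROBLOCK = {"Y": 4, "Cb": 1, "Cr": 1}


def samples_per_block() -> int:
    """Number of samples handled by one DCT operation."""
    return BLOCK_SIZE * BLOCK_SIZE


def samples_per_macroblock() -> int:
    """Total samples (luminance plus color difference) in one macroblock."""
    return sum(BLOCKS_PER_MACROBLOCK.values()) * samples_per_block()


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return (a + b - 1) // b


def macroblocks_per_picture(width: int, height: int) -> int:
    """Macroblocks needed to cover a picture.

    Assumption: the four luminance blocks form a 2 x 2 arrangement,
    so one macroblock spans 16 x 16 luminance pixels.
    """
    side = 2 * BLOCK_SIZE
    return ceil_div(width, side) * ceil_div(height, side)


if __name__ == "__main__":
    print(samples_per_block())
    print(samples_per_macroblock())
    print(macroblocks_per_picture(352, 240))  # hypothetical picture size
```

Under these assumptions, motion vectors are detected once for each 16 by 16 region, while DCT is applied separately to each of the six blocks inside that region.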
The coded data is output as a bit stream (continuous serial data) having a structure described above.
The sequence 51 has a structure shown in FIG. 42. In the figure, 65a, 65b, 65c and 65d denote GOPs, and 66a, 66b and 66d denote sequence headers (SHs). The sequence headers are provided to designate the image format such as the number of pixels, the number of lines of the image, and may be appended to the head of all or only some of the GOPs. In the figure, GOP1, GOP2 and GOP4 are provided with a sequence header appended to the head thereof, while GOP3 is not provided with a sequence header. Provided at the start of the GOP is data (hereinafter referred to as “time code”) indicating the time from the start of the sequence (title, program).
FIG. 43 shows a coding scheme for a case where one GOP 52 is composed of ten pictures (screens, frames). In FIG. 43, reference numeral 67 denotes an I picture that is image information subjected to information compression based on intra-frame DCT. 68 denotes a P picture that is image information subjected to the information compression based on intra-frame DCT as well as to motion compensation using the temporally preceding I picture 67 as a reference screen. 69 denotes a B picture subjected to the information compression based on intra-frame DCT and to motion compensation using the temporally preceding and succeeding I and/or P pictures 67, 68 as reference screens.
Next, the operations of the conventional optical disk recording/playback device will be described. With the advancement of technology for compressing digital image information, it has become possible to realize an image filing system in which compressed moving picture information is recorded on optical disks, which offers better retrievability than tape media represented by a conventional VTR, and which is easy to use. Since this kind of disk filing system handles digital information, degradations due to copying are not observed. Moreover, since optical recording/playback is employed, a non-contacting and therefore reliable system can be constructed.
Conventionally, recording of compressed moving picture information on an optical disk is achieved by recording digital compressed moving picture information, which conforms to the MPEG system shown in FIG. 41, in an optical disk device shown in the block diagram of FIG. 40. Image information digitized by the A/D converter 12 is transformed by the information compressing means 13 according to the MPEG or any other standard compressed moving picture system. The compressed information is encoded by the encoder 15, and modulated by the modulator 16 in order to reduce the influence of inter-code interference on the optical disk 23. The resultant information is recorded on the optical disk 23. At this time, data is allocated such that, for example, the amount of data per GOP is substantially identical (in other words, at a fixed rate), and data is allocated to sectors whose length is equal to an integer multiple of a frame. This facilitates GOP-by-GOP editing or the like.
For playback, image information read from the optical disk 23 is amplified by the playback amplifier 27. Digital data is then restored by the demodulator 28 and decoder 29. Thereafter, the original image data, devoid of address and parity bits, is restored by the frame-sector inverse conversion circuit 30. The information expanding means 31 performs MPEG decoding so as to restore the original digital video signal. The D/A converter 32 provides an analog video signal that can be displayed on a monitor or the like.
Assuming that the aforesaid MPEG system is used for digital moving picture compression, data having a coding scheme such as the one shown in FIG. 43 is recorded on the optical disk 23 as it is. Herein, the coding scheme is constructed by combining the I picture 67, which has been subjected to information compression based on intra-frame DCT, with several P pictures 68, which have been subjected to information compression by intra-frame DCT and motion compensation using the temporally preceding I picture 67 or another P picture 68 as a reference screen, and several B pictures 69, which have been subjected to information compression by intra-frame DCT and motion compensation using the I and/or P pictures 67, 68 as reference screens.
The P and B pictures are coded by reference to other pictures. The arrow-headed lines in FIG. 43 schematically illustrate the relationship, within one GOP, between the reference pictures and the pictures (predicted pictures) coded using the reference pictures. When, as in this arrangement, the P and B pictures are coded by reference only to other pictures within the same GOP, the image signal within one GOP can be decoded independently.
An I picture 67 results from intra-frame DCT, so that an image can be reproduced using the I picture 67 alone. However, with regard to a P picture 68 that results from forward motion compensation, an image cannot be reproduced until the I picture 67 is reproduced. As for a B picture 69 resulting from forward and backward motion compensation, an image cannot be reproduced until the preceding and succeeding I and/or P pictures 67, 68 are reproduced. The B picture 69 resulting from forward and backward motion compensation therefore contains the least amount of data and is coded most efficiently. By contrast, the I picture 67 resulting from compression based solely on intra-frame DCT contains the largest amount of data and is coded least efficiently.
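The dependency between picture types described above determines the order in which the pictures of one GOP can be reproduced. The following is a minimal sketch only. It assumes a GOP written in display order as a string of picture types, and the ten-picture pattern "IBBPBBPBBP" is a hypothetical arrangement, since the exact pattern of FIG. 43 is not given in the text. The function names are illustrative.

```python
def references(gop: str, i: int) -> list[int]:
    """Indices of the pictures needed before picture i can be reproduced.

    I: none. P: the nearest preceding I or P picture.
    B: the nearest preceding and succeeding I or P pictures.
    """
    kind = gop[i]
    if kind == "I":
        return []
    anchors = [j for j, k in enumerate(gop) if k in "IP"]
    preceding = [j for j in anchors if j < i]
    succeeding = [j for j in anchors if j > i]
    if kind == "P":
        return preceding[-1:]
    return preceding[-1:] + succeeding[:1]


def decode_order(gop: str) -> list[int]:
    """Order in which pictures become reproducible within one GOP."""
    if not gop or gop[0] != "I":
        raise ValueError("this sketch assumes the GOP starts with an I picture")
    done: list[int] = []
    pending = list(range(len(gop)))
    while pending:
        # Take the earliest picture whose references are all reproduced.
        ready = next(i for i in pending
                     if all(r in done for r in references(gop, i)))
        done.append(ready)
        pending.remove(ready)
    return done


if __name__ == "__main__":
    gop = "IBBPBBPBBP"  # hypothetical ten-picture GOP in display order
    print(decode_order(gop))
```

In this sketch each B picture has to wait for a succeeding I or P picture, which is why the reference pictures must be held in a buffer memory until the B pictures between them have been reproduced.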
Coding efficiency can be improved by increasing the number of B pictures 69. Increasing the number of B pictures requires an increase in the storage capacity of a buffer memory for storing the I and P pictures 67 and 68 necessary for reproducing the B pictures 69. Moreover, the delay time from the input of data to image reproduction becomes longer. However, for storage media such as an optical disk, the greater demand is for a higher compression efficiency to achieve longer recording time, and the delay time for image reproduction does not pose a critical problem. The coding scheme shown in FIG. 43 is therefore suitable.
When data having the above coding scheme is recorded on an optical disk, fast retrieval and playback of an image are accomplished as described below.
That is, when data has the coding scheme shown in FIG. 43, fast playback is enabled by consecutively reproducing only the data representing I pictures 67. After data representing an I picture 67 belonging to a certain GOP is reproduced, a track jump is made to another preceding or succeeding GOP, or at an arbitrary GOP distance, to consecutively reproduce the data of I pictures 67, and to thereby realize fast retrieval or playback at a speed of (number of frames constituting a GOP)×(track jump distance in terms of the number of GOPs) times the normal speed.
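The speed relation stated above can be written as a small calculation. The following is a minimal sketch only. It assumes that each reproduced I picture is displayed for one normal frame period, and the function and parameter names are illustrative.

```python
def fast_playback_speed(frames_per_gop: int, gop_jump: int) -> int:
    """Speed multiple relative to normal playback.

    frames_per_gop: number of frames constituting one GOP.
    gop_jump: track jump distance expressed as a number of GOPs
              (1 means the I picture of every GOP is reproduced).
    """
    if frames_per_gop < 1 or gop_jump < 1:
        raise ValueError("both arguments must be positive")
    return frames_per_gop * gop_jump


def visited_gops(first_gop: int, gop_jump: int, count: int) -> list[int]:
    """GOP numbers whose I pictures are reproduced, in playback order.

    A negative gop_jump corresponds to fast playback in reverse.
    """
    return [first_gop + n * gop_jump for n in range(count)]


if __name__ == "__main__":
    print(fast_playback_speed(10, 1))
    print(fast_playback_speed(10, 3))
    print(visited_gops(0, 3, 4))
```

For the ten-picture GOP of FIG. 43, reproducing the I picture of every GOP corresponds to ten times the normal speed, and jumping three GOPs at a time corresponds to thirty times the normal speed.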
Recording of a digital video signal on recording media such as an optical disk using the data compression coding method according to the MPEG system can be achieved either by a method of recording the image signal data of each GOP as a variable amount of data, i.e., recording each GOP with a variable data rate, in order to maintain the picture quality constant between GOPs, as shown in FIG. 44A, or by a method of recording each GOP with a fixed amount of data, i.e., recording each GOP with a fixed data rate, in order to maintain the recording time of each GOP constant, as shown in FIG. 44B.
The former method is advantageous in increasing the recording density on the disk, while the latter method is advantageous in that it is easy to predict the recording position of the image data in retrieving an image signal at a known time from the start of the image of one sequence (title, program).
In the former method, the amount of data per GOP varies with time depending on the nature of the pictures forming the GOP, as shown in FIG. 45A, in which (α) represents the maximum data rate and (β) represents the average data rate. For instance, the picture quality per GOP and the amount of data for each of the three types of images V1, V2 and V3 are as shown in FIG. 45B. It is seen that in the former method, the picture quality is maintained constant by varying the amount of data per GOP of the image.
An example of a disk playback device using the image signal coding method according to the MPEG system is a video CD (compact disc) player. FIG. 46 schematically illustrates a track configuration in a video CD and a data configuration within the user region of one sector in a track. A margin of a predefined number of sectors is provided at the head and tail of each track, and the other sectors in combination form one unit of transmission (pack) of MPEG data. Time stamp data indicating the time from the start of the recorded sequence (title, program) is recorded at the head of the one pack of image data.
The method of recording each GOP with a fixed amount of data, i.e., with a fixed data rate, is used for the image signal coded by the MPEG system.
In such a video CD, the image signal and audio signal of one entire sequence (title, program) that are recorded are treated as one data file. The GOPs forming the data are successively recorded in consecutive sectors on the disk as consecutive data as shown in FIG. 42. File management data such as file identification data and a start sector address, not shown, are recorded in the track at the head of the disk, and access to the file consisting of the image signal and the audio signal of the desired title can be made on the basis of the file management data.
The image signal of the desired sequence (title, program) can be reproduced by referring to the file management data and successively accessing the sectors, starting from the sector at the head of the region where the file is recorded, in accordance with the start sector address of the data file corresponding to the sequence.
Generally, data on the recording media such as disks are physically recorded in sectors forming units of recording, and recording (writing) and reproduction (reading) of data are performed taking each sector as a unit of access.
When a GOP in the middle of a sequence (title, program) is to be reproduced, a sector in the middle of the succession of the sectors where the data file is recorded is accessed first. However, since the GOP data in the pack of each sector is recorded as consecutive data, as shown in FIG. 46, the GOP data read first starts from the middle of a GOP, and the other part of that GOP, recorded in the immediately preceding sector, is missing from the reproduced data.
Accordingly, the reference picture data used for coding the P and B pictures in the GOP read first is missing, and an image obtained by decoding them would be unnatural, so that they are not used for playback of the image.
When a GOP in the middle of a sequence (title, program), at a desired time from the start of the sequence, is to be reproduced, the sector address where the desired GOP is recorded is first predicted on the basis of the fixed data rate of the recorded image signal. Then, access is made to the predicted sector address, the signal recorded in the sector is read, and the time stamp data in it is detected. By comparing the contents of the detected time stamp data with the desired time instant, the sector where the desired GOP is recorded is identified. Then, the GOPs are successively read, starting from the first GOP recorded in the sector, and the time code at the head of each GOP is detected. When the time code matches the desired time instant, the GOP is found to be the desired GOP, and the GOP is decoded to produce the playback image.
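The retrieval procedure described above, prediction of a sector address from the fixed data rate followed by correction using the time stamp data, can be sketched as follows. This is a minimal sketch only. The disk is modeled as a plain list of time stamps, one per sector of the file, and the function names, the sector payload size and the data rate are hypothetical values chosen for illustration, not values of any actual disk format.

```python
def predict_sector(start_sector: int, seconds: float,
                   bytes_per_second: int, payload_bytes: int) -> int:
    """Predict the sector address for a desired time at a fixed data rate."""
    offset_bytes = int(seconds * bytes_per_second)
    return start_sector + offset_bytes // payload_bytes


def refine_sector(time_stamps: list[float], predicted: int,
                  target: float) -> int:
    """Correct a predicted position using the time stamp of each sector.

    time_stamps[i] is the time stamp at the head of the i-th sector of
    the file. Returns the index of the last sector whose time stamp
    does not exceed the desired time.
    """
    i = min(max(predicted, 0), len(time_stamps) - 1)
    while i > 0 and time_stamps[i] > target:
        i -= 1  # predicted too far: step back
    while i + 1 < len(time_stamps) and time_stamps[i + 1] <= target:
        i += 1  # predicted too early: step forward
    return i


def find_gop(gop_time_codes: list[float], target: float) -> int:
    """Scan the GOP time codes read from a sector for the desired time."""
    for n, code in enumerate(gop_time_codes):
        if code == target:
            return n
    raise LookupError("desired GOP not found in this sector")


if __name__ == "__main__":
    # Hypothetical values: 100000 bytes per second, 2000-byte payload.
    print(predict_sector(10, 3.0, 100_000, 2_000))
    stamps = [0.0, 0.5, 1.0, 1.5, 2.0]
    print(refine_sector(stamps, 4, 1.2))
```

Each step of the two loops in `refine_sector` stands for a further access to the disk, so any error in the predicted address adds directly to the time before the desired GOP can be read.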
When the coding scheme recorded on a video disk is configured as described above, only the I pictures can be decoded by themselves. An image that can be retrieved has therefore been limited to the I pictures.
Moreover, with the conventional video disk recording/playback device or playback device, it is not possible to identify the video disk being recorded or played back, so that images to be retrieved, images from which the playback should start, and the like are not known.
When a GOP in the middle of a sequence (title, program) is reproduced, the GOP which is read first from the sector accessed first has its part missing, so that it would produce unnatural image if it were decoded and output. Accordingly, it is not used, and as a result, there is a time delay before the image signal is reproduced and displayed.
Moreover, when the conventional recording method is applied to a write-once media and editing, such as overwriting or tag recording, is conducted, a problem arises because the image signal is recorded consecutively in the recording region within each sector. When the GOP to be overwritten starts in the middle of a sector and the entire sector is overwritten, the tail part of the GOP preceding the GOP to be edited is lost. When the edit point is reproduced, the GOP preceding the edited GOP cannot be reproduced and the image will be missing. Even if the reproduction is forcibly made, the resultant image would be unnatural.
Furthermore, when the conventional recording method is applied to a write-once media, the entire sequence is treated as one file, and recorded in consecutive sectors on the disk. During playback, the position on the disk where the file is recorded can be identified only by the start sector address which is the file management data. It is not possible to utilize, for recording and playback, vacant sectors which result by repeating erasure and recording. Thus, the recording regions on the disk are not effectively utilized.
In addition, when a GOP in the middle of a sequence (title, program), at a desired time from the start of the sequence, is to be reproduced, it is necessary to follow a complicated process. First, the sector address of the sector where the desired GOP is recorded is predicted on the basis of the data rate of the image signal. Next, the sector where the desired GOP is recorded is identified by comparing the time stamp data indicating the recording time of the sector with the desired time instant. Finally, the desired GOP is identified by comparing the time code at the head of each of the GOPs successively read from the sector with the desired time instant. It is therefore not possible to promptly identify the sector where the desired GOP is recorded, and there is a certain time delay before the image signal is reproduced. Moreover, where the image signal of each GOP is recorded with a variable data rate, the time from the start of the sequence and the recording position are not proportional, so that prediction of the sector address of the sector where the GOP of the desired time instant is recorded is difficult, and there is a further delay before the image signal is reproduced.
Furthermore, in the conventional optical disk recording/playback device having the aforesaid configuration, only the I pictures of GOPs are reproduced consecutively. When it is taken into account that human eyes are sensitive to what is called a “scene change,” such as a change in strength of a luminance signal, the fast playback or retrieval is not always satisfactory to viewers.
In addition, as for fast playback or retrieval achieved by consecutively reproducing I pictures alone, the positions of the I pictures in GOPs have no correlation with the positions of the I pictures on recording tracks on an optical disk. When the image compression ratio for recording is varied, the length of each GOP itself is not fixed, and the correlation becomes even weaker. When a track jump is made, it is therefore difficult to specify the start position of the I picture of each GOP. Every time a jump is made to another track, a random rotation wait time arises, and consecutive reproduction of I pictures cannot be made smoothly.
Furthermore, the speed of fast playback or retrieval cannot be raised in harmony with human visual characteristics.