In order to efficiently store or transmit digital image information, i.e., image data in the form of a digital signal, the digital image information must be compressively coded. Available methods for compressively coding digital image information include waveform coding methods such as sub-band, wavelet, and fractal coding, as well as the DCT (Discrete Cosine Transform), which is typical of image processing techniques according to JPEG (Joint Photographic Experts Group) or MPEG (Moving Picture Experts Group).
Meanwhile, one method for eliminating redundant image information between adjacent frames is inter-frame prediction using motion compensation: the pixel values of a current frame are represented by their differences from the pixel values of a previous (past) frame, and a difference signal corresponding to these differences is coded.
Hereinafter, an image coding method and an image decoding method according to the MPEG standard, which performs a DCT process together with motion compensation, will be briefly described.
In this image coding method, an input image signal is divided into plural image signals respectively corresponding to plural blocks (macroblocks) in one frame, and then the image signals are coded for each macroblock. One macroblock corresponds to an image display region composed of (16×16) pixels. When the input image signal corresponds to an object image, the image signal is divided into plural blocks (macroblocks) composing a display region (object region) corresponding to the object image in one frame.
The image signal corresponding to each macroblock is divided into image signals respectively corresponding to subblocks, each of which corresponds to an image display region composed of (8×8) pixels, and then the image signals are subjected to the DCT process for each subblock to generate DCT coefficients. Then the DCT coefficients are quantized to generate quantization coefficients for each subblock. This method of coding the image signal corresponding to the subblock by the DCT process and the quantization process is termed an “intra-frame coding scheme”.
At a receiving end, the quantization coefficients are inversely quantized and are then subjected to an inverse DCT process for each subblock to reproduce an image signal corresponding to the macroblock. Coded data corresponding to a frame (I picture) in which the image signal has been coded by the intra-frame coding method, can be reproduced independently. That is, it can be decoded without referring to image data of another frame.
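The intra-frame path described above (DCT and quantization at the coding end, inverse quantization and inverse DCT at the receiving end) can be sketched for a single (8×8) subblock as follows. This is a minimal illustration only: the function names, the orthonormal DCT scaling, and the single uniform quantization step `step` are assumptions made for the sketch and are not taken from the MPEG standard, which uses quantization matrices and entropy coding not shown here.

```python
import math

N = 8  # subblock size from the text: (8×8) pixels


def _scale(k):
    """Normalisation factor that makes the transform orthonormal (assumed scaling)."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _basis(i, k):
    """Cosine basis value for sample position i and frequency index k."""
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))


def dct2(block):
    """Forward 2-D DCT of an N×N block of pixel values."""
    return [[_scale(u) * _scale(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                          for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse 2-D DCT, returning an N×N block of (real-valued) pixel values."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def quantize(coeffs, step):
    """Map DCT coefficients to integer quantization coefficients."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


def dequantize(levels, step):
    """Inverse quantization: scale the integer levels back up."""
    return [[q * step for q in row] for row in levels]


def encode_subblock(pixels, step=16):
    """Coding end: DCT followed by quantization."""
    return quantize(dct2(pixels), step)


def decode_subblock(levels, step=16):
    """Receiving end: inverse quantization followed by inverse DCT."""
    return [[int(round(v)) for v in row] for row in idct2(dequantize(levels, step))]
```

Because neither function refers to any other frame, a subblock coded this way can be decoded on its own, which is the property that makes an I picture independently reproducible.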
On the other hand, there is a coding method termed an “inter-frame coding scheme”. In this coding method, a motion detection technique such as “block matching” is first used to find, in the image signal of an already coded frame which is temporally adjacent to the frame to be coded, the region composed of (16×16) pixels whose pixel values differ least from the pixel values of a target macroblock to be coded. This region is used as a prediction macroblock.
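The block matching step can be illustrated with a small exhaustive search. The sketch below assumes frames are lists of rows of integer pixel values, uses the sum of absolute differences (SAD) as the error measure, and limits the search to a window of ±`search` pixels; the text does not specify the error measure or the search range, so these are illustrative choices, and the function names are hypothetical.

```python
def sad(frame_a, ax, ay, frame_b, bx, by, size):
    """Sum of absolute differences between two size×size regions."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def block_match(target, reference, tx, ty, size=16, search=4):
    """Find the region of `reference` that best predicts the target macroblock.

    (tx, ty) is the top-left corner of the macroblock in `target`.
    Returns ((mvx, mvy), error), where the motion vector points from the
    macroblock position to the prediction macroblock in `reference`.
    """
    height, width = len(reference), len(reference[0])
    best = None
    for mvy in range(-search, search + 1):
        for mvx in range(-search, search + 1):
            rx, ry = tx + mvx, ty + mvy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(target, tx, ty, reference, rx, ry, size)
            if best is None or cost < best[1]:
                best = ((mvx, mvy), cost)
    return best
```

An exhaustive search is the simplest form of block matching; practical encoders usually restrict or order the candidates to reduce computation.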
Subsequently, the image signal of the prediction macroblock is subtracted from the image signal of the target macroblock to produce a difference signal of the target macroblock, which is divided into difference signals respectively corresponding to subblocks each composed of (8×8) pixels. Then the difference signals are subjected to the DCT process to generate the DCT coefficients for each subblock, which are quantized for each subblock to generate quantization coefficients.
The image signal corresponding to the object image is inter-frame coded in a similar manner.
At the receiving end, the quantization coefficients (quantized DCT coefficients) are inversely quantized and are then subjected to the inverse DCT process for each subblock to restore the difference signal of the macroblock. Then, from an image signal of a decoded frame, a prediction signal of an image signal corresponding to a target macroblock to-be-decoded is produced by motion compensation. Then, the prediction signal and the restored difference signal are added to reproduce the image signal of the target macroblock. Coded data corresponding to the frame (P picture or B picture) in which the image signal has been coded by the inter-frame coding method cannot be reproduced independently. That is, it cannot be decoded without referring to the image signal of another frame in the reproduction process.
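The relation between the difference signal formed at the coding end and the addition performed at the receiving end can be sketched as follows. The DCT and quantization of the difference signal are omitted, so the restored difference is taken to equal the original difference; the function names, the integer-pixel motion vector, and the clipping to the range 0 to 255 are assumptions of this sketch.

```python
def predict_block(reference, x, y, motion_vector, size=16):
    """Motion compensation: copy the prediction block out of a decoded frame."""
    mvx, mvy = motion_vector
    rows = reference[y + mvy:y + mvy + size]
    return [row[x + mvx:x + mvx + size] for row in rows]


def difference_block(target, prediction):
    """Coding end: difference signal = target block minus prediction block."""
    return [[t - p for t, p in zip(trow, prow)]
            for trow, prow in zip(target, prediction)]


def reconstruct_block(prediction, restored_difference):
    """Receiving end: prediction signal plus restored difference signal."""
    return [[max(0, min(255, p + d)) for p, d in zip(prow, drow)]
            for prow, drow in zip(prediction, restored_difference)]
```

Since `reconstruct_block` needs `prediction`, which is taken from another decoded frame, a block coded this way cannot be reproduced without that frame.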
Subsequently, a structure of compressed image data (bit stream) corresponding to a moving picture composed of plural frames (pictures), will be described.
FIG. 10(a) shows a structure of image data (moving picture data) corresponding to one moving picture. One moving picture comprises plural frames. In FIG. 10(a), moving picture data D comprises frame data P(1)–P(n) (n: natural number) corresponding to respective frames.
FIG. 10(b) shows a structure of intra-frame compressed image data Da obtained by performing the intra-frame coding process to the respective frame data P(1)–P(n) composing the moving picture data D.
The intra-frame compressed image data Da comprises coded frame data Pa(1)–Pa(n) of respective frames and a header Ha comprising data common to these frames. The frames are intra-frame coded I pictures. According to MPEG4, the header Ha is called a “VOL (Video Object Layer).”
FIG. 10(c) shows a structure of inter-frame compressed image data Db obtained by performing the intra-frame coding process to specified frame data of the frame data P(1)–P(n) and by performing the inter-frame coding process to the other frame data.
The inter-frame coding process includes two types of processing. One is a forward predictive coding process, which codes a target frame by referring to a previous (forward) frame, and the other is a bidirectionally predictive coding process, which codes the target frame by referring to previous and subsequent (forward and backward) frames.
The inter-frame compressed image data Db comprises coded frame data Pb(1)–Pb(n) of respective frames and a header Hb comprising data common to these frames. As illustrated, the first frame of the moving picture is the intra-frame coded I picture and the other frames are P pictures which have been subjected to the forward predictive coding process or B pictures which have been subjected to the bidirectionally predictive coding process.
Since the intra-frame compressed image data Da is produced by performing the intra-frame coding process for every frame of the moving picture without reference to another frame, it is very suitable for use in random reproduction (decoding), although its coding efficiency is relatively low.
In other words, one advantage of the use of the intra-frame compressed image data Da is that frames to-be-decoded are selected randomly and decoded immediately to reproduce an image. Particularly when editing the compressed image data, the intra-frame compressed image data is easier to handle than the inter-frame compressed image data. This is because the intra-frame compressed image data is produced independently of another frame data but the inter-frame compressed image data is not.
On the other hand, since the inter-frame compressed image data Db is produced by performing the inter-frame coding process on almost all the frames of the moving picture with reference to other frames, its coding efficiency is high, but it is less suitable for use in random reproduction (decoding). In the inter-frame compressed image data Db, when decoding starts from a P picture or a B picture as the target frame to be decoded, it is first necessary to decode an independently decodable frame present before the target frame. This is because the target frame has been coded with reference to another frame.
For instance, in the intra-frame compressed image data Da, coded frame data Pae(1)–Pae(m) (m: natural number) corresponding to the final 30 seconds of a one-hour moving picture can be reproduced starting directly from the coded frame data Pae(1) at the beginning of these frame data (see FIG. 10(b)).
On the other hand, in the inter-frame compressed image data Db, when reproducing coded frame data Pbe(1)–Pbe(m) corresponding to the final 30 seconds of a one-hour moving picture, reproduction cannot start directly from the coded frame data Pbe(1) at the beginning of these data (see FIG. 10(c)). The coded frame data Pbe(1) cannot be reproduced until all the data from the independently reproducible data (coded frame data Pb(1) corresponding to the first frame of the moving picture) through the coded frame data present just before the data Pbe(1) have been decoded. This is because the coded frame data Pbe(1) has been coded with reference to another frame.
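The difference between the two cases can be made concrete by listing which frames must be decoded before a given target frame can be shown. The sketch below is a simplification: it assumes each P picture is predicted from the immediately preceding frame and does not model B pictures; the function name and the list-of-letters representation of picture types are hypothetical.

```python
def frames_needed(picture_types, target):
    """Return the indices of the frames to decode, in order, to reproduce `target`.

    picture_types is a sequence such as ["I", "P", "P", ...]. The search walks
    back from the target to the nearest independently decodable (I) frame.
    """
    if not 0 <= target < len(picture_types):
        raise IndexError("target frame is outside the moving picture")
    start = target
    while picture_types[start] != "I":
        start -= 1
        if start < 0:
            raise ValueError("no independently decodable frame precedes the target")
    return list(range(start, target + 1))
```

For data like Da, where every frame is an I picture, the result is always the target frame alone. For data like Db, where only the first frame is an I picture, the result grows with the position of the target, which is why starting playback near the end of a long moving picture is slow.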
Meanwhile, a fast forward playback process which skips S (S: natural number) frames can be performed to the intra-frame compressed image data Da (see FIG. 11(a)). This is because coded frame data Pa(1), Pas(1)–Pas(f) (f: natural number) to be decoded in the fast forward playback process correspond to intra-frame coded I pictures which can be reproduced independently without reference to another frame data. A fast rewind playback process as the reverse of the fast forward playback process, can also be performed to the intra-frame compressed image data Da in the same manner.
On the other hand, in practice, the fast forward playback process cannot be performed on the inter-frame compressed image data Db (see FIG. 11(b)). This is because each of the coded frame data Pbs(1)–Pbs(f) to be decoded in the fast forward playback process corresponds to an inter-frame coded P picture or B picture. The respective coded frame data Pbs(1), Pbs(2), Pbs(3), . . . , Pbs(f) cannot be decoded until the corresponding waiting times tb1, tb2, tb3, . . . , tbf have elapsed, i.e., the times required for decoding all the coded frame data present before the respective data Pbs(1)–Pbs(f). In other words, the coded frame data Pbs(1)–Pbs(f) to be decoded in the fast forward playback process are reproduced at the same times as in a normal playback process.
Consequently, if the fast forward playback process is performed on the inter-frame compressed image data Db, the moving picture is reproduced as a series of still pictures of the coded frame data Pbs(1)–Pbs(f), displayed sequentially at regular time intervals.
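The waiting times can be illustrated by counting how many frames a decoder has to process for each frame displayed during fast forward playback. This sketch assumes that each P picture depends on the immediately preceding frame, that B pictures are not present, and that frames decoded for an earlier displayed frame do not need to be decoded again; the function name and the `skip` parameter (the S of the text) are illustrative.

```python
def fast_forward_decode_counts(picture_types, skip):
    """For each displayed frame, count the frames decoded to reach it.

    Displays frame 0 and then every (skip + 1)-th frame. Returns a list of
    (frame_index, frames_decoded) pairs.
    """
    step = skip + 1
    counts = []
    last_decoded = -1  # index of the most recently decoded frame
    for target in range(0, len(picture_types), step):
        start = target
        # Walk back to the nearest I picture, but never past frames that
        # are already decoded.
        while start > last_decoded + 1 and picture_types[start] != "I":
            start -= 1
        counts.append((target, target - start + 1))
        last_decoded = target
    return counts
```

With only I pictures, every displayed frame costs one decode regardless of `skip`. With a single leading I picture followed by P pictures, the counts add up to the number of frames a normal playback would have decoded, so nothing is gained by skipping.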
The fast rewind playback process cannot be performed to the inter-frame compressed image data Db, since coded frame data of the last frame cannot be reproduced until all the coded frame data has been decoded.
Each of the headers Ha and Hb of the corresponding compressed image data Da and Db contains an identification flag indicating whether or not the corresponding compressed image data is suitable for use in the independent reproduction.
As solutions to the problem associated with trade-off between the efficiency in compressively coding the image data and suitability for the fast forward playback process, the following solutions are conceived.
The first solution is, as shown in FIG. 12, to store the intra-frame compressed image data Da suitable for use in the fast forward playback process and the inter-frame compressed image data Db from which a reproduced image of a high quality is obtained, in a data storage medium M, as the compressed image data of the moving picture. In FIG. 12, reference numerals D1–Dk designate compressed image data corresponding to other moving pictures which contain headers H1–Hk, respectively. The header Ha of the data Da contains a flag indicating that the data Da is well suitable for use in the independent reproduction. The header Hb of the data Db contains a flag indicating that the data Db is less suitable for use in the independent reproduction.
In the fast forward playback process, according to the respective flags contained in the corresponding headers Ha and Hb, the intra-frame compressed image data Da is read from the data storage medium M as the compressed image data of one moving picture. On the other hand, in the normal playback process, the inter-frame compressed image data Db is read from the data storage medium M.
The second solution is to insert plural pieces of coded frame data corresponding to the I pictures into the inter-frame compressed image data Db at intervals shorter than normal intervals. In general, the coded frame data corresponding to the I pictures is inserted into the compressed image data such that two of the plural frames reproduced in 0.5 second are I pictures. This inter-frame compressed image data Db contains the flag indicating that the data Db is suitable for use in independent reproduction. In this case, in the fast forward playback process, only the coded frame data corresponding to the I pictures is decoded, according to picture type flags (not shown) added to the frames, each indicating that the corresponding coded frame data corresponds to the I picture.
The third solution is, because coded frame data corresponding to some of the P pictures is independently reproducible, to add flags indicating this to these coded frame data. Such coded frame data corresponding to some of the P pictures is obtained by coding without reference to image data of another frame like the coded frame data corresponding to the I pictures, although the corresponding picture type flags indicate the “P pictures”. The coded frame data corresponding to these specified P pictures is independently reproducible. Hence, flags indicating that the coded frame data corresponding to these specified P pictures is suitable for use in independent reproduction are added thereto. So, in the fast forward playback process, according to the picture type flags and these independent reproduction suitability flags (not shown), only coded frame data corresponding to the I pictures and the specified P pictures are decoded.
FIG. 11(c) shows a structure of the inter-frame compressed image data which contains the above independent reproduction suitability flags added to the coded frame data corresponding to the specified P pictures.
In inter-frame compressed image data Dc, headers Hc1, Hc2, . . . , Hcf each containing the suitability flag are inserted just before coded frame data Pcs(1)–Pcs(f) corresponding to the specified P pictures (expressed as P′ in the figure), respectively. In the figure, Hc designates a header of the inter-frame compressed image data Dc, and Pc(1)–Pc(n) designate coded frame data of respective frames.
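The frame selection used by the third solution can be sketched as a filter over per-frame metadata. The `CodedFrame` record and its field names are hypothetical stand-ins for the picture type flag and the independent reproduction suitability flag (carried in headers such as Hc1–Hcf); the text does not define a concrete data layout for them.

```python
from dataclasses import dataclass


@dataclass
class CodedFrame:
    index: int
    picture_type: str          # "I", "P", or "B" (picture type flag)
    independent: bool = False  # hypothetical suitability flag for specified P pictures


def frames_for_fast_forward(frames):
    """Select the frames that can be decoded without reference to another frame.

    I pictures always qualify; a P picture qualifies only when its
    suitability flag marks it as independently reproducible.
    """
    return [f.index for f in frames
            if f.picture_type == "I"
            or (f.picture_type == "P" and f.independent)]
```

A decoder following this sketch would read only the flags of each frame and skip the coded data of every frame that is not selected.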
The structures of the headers of the compressed image data Da and Db will be described with reference to FIG. 13. In FIG. 13, for the sake of simplicity, the compressed image data is shown without distinguishing between the intra-frame compressed image data Da and the inter-frame compressed image data Db.
As mentioned previously, the compressed image data D comprises the header H containing data common to respective frames which is placed at the beginning of the data D and the following coded frame data P.
The header H is composed of a synchronous signal Hsd, data common to respective frames Hcd, a flag Hfd relating to suitability for independent reproduction, and alignment data Had for aligning this data.
The compressed image data corresponding to one moving picture thus contains the information (flag) indicating whether or not coded frame data corresponding to all the frames is independently reproducible. When the coded frame data corresponding to all the frames of one moving picture is independently reproducible, the flag has a value indicating that the corresponding compressed image data is well suitable for use in independent reproduction, whereas when one moving picture contains little coded frame data which is independently reproducible, the flag has a value indicating that the corresponding compressed image data is less suitable for use in independent reproduction.
The flag is contained in the header H including common data (data common to respective frames) at the beginning of the compressed image data.
Hereinafter, a description will be given of examples of data alignment in the header of the compressed image data with reference to tables 1–3 shown below. The data shown in the tables 1–3 are continuously aligned in the header in the transmission order.
Placed at the beginning of the header is a synchronous signal 902 indicating the start of the moving picture, which is represented as a unique fixed-length code (32 bits). Following the synchronous signal 902, various types of common data 903–913 common to respective frames are placed. Of the common data 903–913, the data 910 is represented by a variable-length code, and the data 903–909 and 911–913 are each represented by one or more codes of fixed bit length.
Following these common data 903–913, a flag 914 relating to suitability for independent reproduction and alignment data 915 are placed.
The flag 914 indicates whether or not the coded frame data of frames is randomly and independently reproducible. The value “1” of the flag indicates that all the coded frame data of the corresponding compressed image data of the moving picture is independently reproducible, while the value “0” indicates that most of the coded frame data of the corresponding compressed image data is not independently reproducible. The alignment data 915 is used for aligning the synchronous signal 902 through the flag 914.
Following the alignment data 915, data 916 and 917 relating to coded frame data obtained by coding image data corresponding to respective frames of the moving picture are placed. In actuality, these data 916 and 917 include specific data such as DCT coefficients or quantization steps according to MPEG 1, 2, and 4, although they are illustrated as one data group in this example.
It should be remembered that the header containing such common data is placed at the beginning of the compressed image data of one moving picture. If the inter-frame compressed image data, which includes coded frame data that is not independently reproducible, also includes some independently reproducible coded frame data (coded frame data corresponding to the I picture) arranged periodically, it is effective to insert common data containing a flag relating to a possibility of independent reproduction rather than the flag relating to suitability for independent reproduction. The former flag indicates whether or not the corresponding coded frame data is independently reproducible without reference to another frame data.
In the fast forward playback process performed to the inter-frame compressed image data into which such common data is periodically inserted, the independently reproducible coded frame data corresponding to the I picture is selectively decoded.
In the tables below, a reference numeral in the Ref. column applies to its own row and to the following rows up to the next numeral; 903a–903c are the individual fields of the data 903.

TABLE 1
Ref.  Syntax                                        No. of bits  Mnemonic
901   Video Object Layer( ){
902     video_object_layer_start_code               32           bslbf
        /* 4 least significant bits specify video_object_layer_id value */
903a    is_object_layer_identifier                  1            uimsbf
        if(is_object_layer_identifier){
903b      visual_object_layer_verid                 4            uimsbf
903c      visual_object_layer_priority              3            uimsbf
        }
904     vol_control_parameters                      1            bslbf
        if(vol_control_parameters)
          aspect_ratio_info                         4            uimsbf
          vop_rate_code                             4            uimsbf
          bit_rate                                  30           uimsbf
          vbv_buffer_size                           18           uimsbf
          chroma_format                             2            uimsbf
          low_delay                                 1            uimsbf
        }
905     video_object_layer_shape                    2            uimsbf
        vop_time_increment_resolution               15           uimsbf
        fixed_vop_rate                              1            bslbf
906     if(video_object_layer_shape!=“binary only”){
907       if(video_object_layer_shape==“rectangular”){
            marker_bit                              1            bslbf
            video_object_layer_width                13           uimsbf
            marker_bit                              1            bslbf
            video_object_layer_height               13           uimsbf
          }
          obmc_disable                              1            bslbf

TABLE 2
Ref.  Syntax                                        No. of bits  Mnemonic
908       sprite_enable                             1            bslbf
          if(sprite_enable){
            sprite_width                            13           uimsbf
            marker_bit                              1            bslbf
            sprite_height                           13           uimsbf
            marker_bit                              1            bslbf
            sprite_left_coodinate                   13           simsbf
            marker_bit                              1            bslbf
            sprite_top_coodinate                    13           simsbf
            marker_bit                              1            bslbf
            no_of_sprite_warping_points             6            uimsbf
            sprite_warping_accuracy                 2            uimsbf
            sprite_brightness_change                1            bslbf
            if(video_object_layer_shape==“rectangular”){
              init_sprite_width                     13           uimsbf
              marker_bit                            1            bslbf
              init_sprite_height                    13           uimsbf
              marker_bit                            1            bslbf
              init_sprite_left_coodinate            13           simsbf
              marker_bit                            1            bslbf
              init_sprite_top_coodinate             13           simsbf
            }
          }
909       not_8_bit                                 1            bslbf
          if(not_8_bit){
            quant_precision                         4            uimsbf
            bits_per_pixel                          4            uimsbf
          }
910       quant_type                                1            bslbf
          if(quant_type){
            load_intra_quant_mat                    1            bslbf
            if(load_intra_quant_mat)
              intra_quant_mat                       8*[2–64]     uimsbf
            load_nonintra_quant_mat                 1            bslbf
            if(load_nonintra_quant_mat)
              nonintra_quant_mat                    8*[2–64]     uimsbf
          }
911       complexity_estimation_disable             1            bslbf

TABLE 3
Ref.  Syntax                                        No. of bits  Mnemonic
912       error_resilient_disable                   1            bslbf
          if(!error_resilient_disable){
            data_partitioned                        1            bslbf
            reversible_vlc                          1            bslbf
          }
913       scalability                               1            bslbf
          if(scalability){
            ref_layer_id                            4            uimsbf
            ref_layer_sampling_direc                1            bslbf
            hor_sampling_factor_n                   5            uimsbf
            hor_sampling_factor_m                   5            uimsbf
            vert_sampling_factor_n                  5            uimsbf
            vert_sampling_factor_m                  5            uimsbf
            enhancement_type                        1            bslbf
          }
        }
914     random_accessible_vol                       1            bslbf
915     next_start_code( )
916     if(sprite_enable)
          decode_init_sprite( )
917     do{
          if(next_bits( )=group_of_vop_start_code)
            Group_of_Video object Plane( )
          Video Object Plane( )
        }while((next_bits( )=group_of_vop_start_code) ||
                (next_bits( )=vop_start_code)
      }
When performing the fast forward playback process or the fast rewind playback process to the compressed image data of the moving picture, the coded frame data is randomly selected from the compressed image data and then decoded, and therefore, it is necessary to quickly decide whether or not the compressed image data is suitable for use in the fast forward playback process or whether or not the coded frame data of the compressed image data is independently reproducible.
However, neither of these properties (suitability for independent reproduction and possibility of independent reproduction) can be decided quickly from the headers added to the conventional compressed image data and coded frame data.
In order to decide whether or not the compressed image data is suitable for use in independent reproduction, the flag (data 914 shown in the tables 1–3) in the header containing the common data is extracted and analyzed.
To check whether or not the value of the flag 914 in the header is “1”, all the common data 903–913 placed before the flag 914 must first be extracted and parsed. For instance, until it has been checked whether the value of the common data 903a is “1”, it is impossible to decide whether or not the common data 903b and 903c are present.
In the header added to the conventional compressed image data, various data such as the synchronous signal 902 indicating the start of the moving picture and the common data 903–913 for the coded frame data, are placed before the flag indicating whether or not the compressed image data is suitable for use in the independent reproduction. Much of this common data often serves as a switch or the like. This means that the following data processing depends upon the value of such common data.
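The dependence of the flag position on the preceding common data can be demonstrated with a small bit-level parser. The sketch below is deliberately reduced: it walks only the data 902, 903, and 904 with the bit widths listed in Table 1, treats the data 905–913 as absent, and represents the header as a string of "0" and "1" characters. The `BitReader` class and `locate_flag` function are hypothetical helpers, not part of any MPEG reference software.

```python
class BitReader:
    """Reads unsigned integers from a string of '0'/'1' characters."""

    def __init__(self, bits):
        self.bits = bits
        self.pos = 0

    def read(self, n):
        value = int(self.bits[self.pos:self.pos + n], 2)
        self.pos += n
        return value


def locate_flag(bits):
    """Parse a reduced header and return (flag_value, flag_bit_position).

    Only data 902-904 are modelled; the real header carries data 905-913
    between these fields and the flag 914.
    """
    reader = BitReader(bits)
    reader.read(32)          # 902: video_object_layer_start_code
    if reader.read(1):       # 903a: is_object_layer_identifier
        reader.read(4)       # 903b: visual_object_layer_verid
        reader.read(3)       # 903c: visual_object_layer_priority
    if reader.read(1):       # 904: vol_control_parameters
        reader.read(4)       # aspect_ratio_info
        reader.read(4)       # vop_rate_code
        reader.read(30)      # bit_rate
        reader.read(18)      # vbv_buffer_size
        reader.read(2)       # chroma_format
        reader.read(1)       # low_delay
    flag_position = reader.pos
    return reader.read(1), flag_position  # 914: random_accessible_vol
```

Even in this reduced form the flag can sit at bit 34, 41, 93, or 100 depending on two one-bit switches, so a decoder cannot jump straight to it; every switch before the flag has to be read and interpreted first.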