1. Field of the Invention
The present invention relates to an image processing apparatus, and more particularly to technology for processing data coded according to, for example, the MPEG (Moving Picture Experts Group) standards.
2. Description of the Related Art
The information handled in multimedia is vast in amount and diverse in kind, so fast processing of that information is necessary if multimedia is to be put to practical use. To process information at high speed, compression and expansion of data are indispensable. The “MPEG” method is one such data compression and expansion technique. The MPEG method is being standardized by the MPEG Committee (ISO/IEC JTC1/SC29/WG11) under the ISO (International Organization for Standardization)/IEC (International Electrotechnical Commission). An image processing apparatus utilizing the MPEG method is built into various image-related devices such as movie cameras, still cameras, television sets, video CD reproduction equipment, DVD reproduction equipment and so forth.
Video data handled in MPEG relate to moving pictures, and a moving picture is constituted by a plurality of still pictures, or frames, per second, for example, 30 frames per second. As shown in FIG. 1, the video data have a hierarchical structure comprised of six layers, which are, in this order, a sequence layer, a GOP (group of pictures) layer, a picture layer, a slice layer, a macroblock layer and a block layer. The number of slices constituting a single picture is not fixed, and the number of macroblocks constituting a single slice is also not fixed. The macroblock layer and the block layer are omitted in FIG. 1.
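The six-layer hierarchy described above can be sketched as nested data structures. The following Python sketch is illustrative only; the class names, the field names and the helper count_macroblocks are assumptions introduced here and do not come from the MPEG standards or from FIG. 1.

```python
from dataclasses import dataclass, field
from typing import List

# Layer names in the order given in the text, outermost first.
LAYER_ORDER = ["sequence", "GOP", "picture", "slice", "macroblock", "block"]


@dataclass
class Block:
    # Hypothetical payload: coefficient values of one block.
    coefficients: List[int] = field(default_factory=list)


@dataclass
class Macroblock:
    blocks: List[Block] = field(default_factory=list)


@dataclass
class Slice:
    # The number of macroblocks per slice is not fixed.
    macroblocks: List[Macroblock] = field(default_factory=list)


@dataclass
class Picture:
    picture_type: str  # "I", "P" or "B"
    # The number of slices per picture is not fixed.
    slices: List[Slice] = field(default_factory=list)


@dataclass
class GroupOfPictures:
    pictures: List[Picture] = field(default_factory=list)


@dataclass
class Sequence:
    gops: List[GroupOfPictures] = field(default_factory=list)


def count_macroblocks(picture: Picture) -> int:
    """Total macroblocks in a picture, summed over slices of varying size."""
    return sum(len(s.macroblocks) for s in picture.slices)
```

Because each layer holds a variable-length list of the layer below it, a picture whose slices contain different numbers of macroblocks is represented without any special handling.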
Moreover, MPEG is chiefly classified into two methods, namely, MPEG-1 and MPEG-2, according to the coding rates. In MPEG-1, a frame corresponds to a picture. In MPEG-2, however, either frames or fields can be made to correspond to pictures. Two fields constitute one frame. The structure where frames correspond to pictures is called a frame structure, whereas the structure where fields correspond to pictures is called a field structure.
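The relation between one frame and its two fields can be illustrated with a short sketch. It assumes, as is usual for interlaced video but not stated in the text above, that the two fields consist of the alternating rows of the frame; the function names are hypothetical.

```python
def split_into_fields(frame_rows):
    """Split a frame (a list of rows) into two fields.

    Assumption: the first field holds rows 0, 2, 4, ... and the second
    field holds rows 1, 3, 5, ...
    """
    first_field = frame_rows[0::2]
    second_field = frame_rows[1::2]
    return first_field, second_field


def merge_fields(first_field, second_field):
    """Interleave two fields of equal height back into one frame."""
    frame_rows = []
    for first_row, second_row in zip(first_field, second_field):
        frame_rows.append(first_row)
        frame_rows.append(second_row)
    return frame_rows
```

In a frame structure the whole list of rows would be treated as one picture, whereas in a field structure each of the two returned fields would be treated as a picture of its own.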
In MPEG, a compression technique called inter-frame prediction is employed. The inter-frame prediction compresses inter-frame data based on a temporal correlation among frames. In the inter-frame prediction, bidirectional prediction is performed. The bidirectional prediction uses both forward prediction for predicting a current reproduced image from a past reproduced image or picture, and backward prediction for predicting a current reproduced image from a future reproduced image.
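The idea of predicting a current image from a past image, a future image, or both, and coding only the difference, can be sketched as follows. This is a simplified illustration on one-dimensional lists of sample values. The function names are hypothetical, motion compensation is omitted, and forming the two-sided prediction as a rounded average of the past and future samples is an assumption of this sketch rather than something stated in the text.

```python
def predict(mode, past=None, future=None):
    """Form a prediction of the current samples from reference samples."""
    if mode == "forward":
        return list(past)
    if mode == "backward":
        return list(future)
    if mode == "bidirectional":
        # Assumed here: rounded average of the past and future samples.
        return [(p + f + 1) // 2 for p, f in zip(past, future)]
    raise ValueError("unknown prediction mode: " + mode)


def residual(current, prediction):
    """Difference that remains to be coded after prediction."""
    return [c - p for c, p in zip(current, prediction)]


def reconstruct(prediction, res):
    """Decoder side: add the coded difference back to the prediction."""
    return [p + r for p, r in zip(prediction, res)]
```

When consecutive images are similar, the residual consists mostly of small values, which is what makes compression based on temporal correlation effective.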
This bidirectional prediction uses three types of pictures, namely, the I picture (Intra-Picture), the P picture (Predictive-Picture) and the B picture (Bidirectionally predictive-Picture). An I picture is an image independently produced by an intra-frame coding processing, irrespective of past and future reproduced images. In order for random access to be performed, at least one I picture is needed within the GOP. All of the macroblocks within the I picture are of the intra-frame prediction picture type (Intra-Frame). A P picture is produced by the inter-frame coding processing using the forward prediction, that is, prediction from a past I or P picture. The macroblock types in the P picture include both the intra-frame prediction picture and the forward prediction picture (Forward Inter Frame).
The B picture is produced by the inter-frame coding processing using the bidirectional prediction. In the bidirectional prediction, a B picture is produced by one of the following three predictions.
(1) Forward Prediction; prediction from a past I picture or P picture.
(2) Backward Prediction; prediction from a future I picture or P picture.
(3) Bidirectional Prediction; prediction from past and future I pictures or P pictures.
The macroblock types in the B picture include four types, namely, the intra-frame prediction picture, the forward prediction picture, the backward prediction picture (Backward Inter Frame), and the interpolative prediction picture (Interpolative Inter Frame).
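The macroblock types permitted in each picture type, and the reference pictures each macroblock type draws on according to predictions (1) to (3) above, can be summarized in two lookup tables. The sketch below is illustrative; the table and function names are hypothetical, and it assumes that the interpolative type is the macroblock-level counterpart of prediction (3).

```python
# Macroblock types that may appear in each picture type, per the text.
ALLOWED_MACROBLOCK_TYPES = {
    "I": {"intra"},
    "P": {"intra", "forward"},
    "B": {"intra", "forward", "backward", "interpolative"},
}

# Reference pictures each macroblock type is predicted from.
# Assumption: "interpolative" corresponds to prediction (3).
REQUIRED_REFERENCES = {
    "intra": set(),
    "forward": {"past"},
    "backward": {"future"},
    "interpolative": {"past", "future"},
}


def is_allowed(picture_type, macroblock_type):
    """True if the macroblock type may occur in the given picture type."""
    return macroblock_type in ALLOWED_MACROBLOCK_TYPES[picture_type]


def can_produce(macroblock_type, available_references):
    """True if every reference the macroblock type needs is available."""
    return REQUIRED_REFERENCES[macroblock_type] <= set(available_references)
```

For example, is_allowed("P", "backward") is False because a P picture uses only forward prediction, and can_produce("intra", set()) is True because intra-frame coding needs no reference picture.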
These I, P and B pictures are each coded in their own manner. Namely, the I picture can be generated even when no past or future picture is available. In contrast thereto, the P picture cannot be generated without the past picture, and the B picture cannot be generated without the past or future pictures. However, when the macroblock type in the P or B picture is the intra-frame prediction picture, the macroblock is produced even without the past or future pictures.
In the inter-frame prediction, an I picture is periodically produced first. Then, a frame several frames ahead of the I picture is produced as a P picture. This P picture is produced by prediction in one direction from the past to the present, namely, in the forward direction. Thereafter, a frame located after the I picture and before the P picture is produced as a B picture. When producing this B picture, the optimal prediction method is selected from among the three prediction methods, which are the forward prediction, backward prediction and bidirectional prediction. In general, a current image and its preceding and succeeding images in consecutive moving pictures are similar to one another, and they differ only partially. Thus, it is assumed that the previous frame and the next frame are substantially the same, and if there is a difference between the two frames, only that difference is extracted and compressed. For example, if the previous frame is an I picture and the next frame is a P picture, the difference is extracted as B picture data. Thereby, the inter-frame data can be compressed based on the temporal correlation among frames. A data sequence, or bit stream, of video data coded in compliance with the MPEG video part is called an MPEG video stream.
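The order of production described above, in which the I picture and the P picture are produced before the B pictures that refer to them, can be sketched as a reordering from display order to coding order. The sketch assumes that each B picture lies, in display order, between two reference pictures (I or P) and is produced only after the later of the two; the function name is hypothetical.

```python
def display_to_coding_order(display_types):
    """Reorder pictures so that every B picture follows both of its references.

    display_types: picture types in display order, e.g. ["I", "B", "B", "P"].
    Returns a list of (display_index, picture_type) tuples in coding order.
    """
    coding_order = []
    pending_b = []  # B pictures waiting for their future reference picture
    for index, picture_type in enumerate(display_types):
        if picture_type == "B":
            pending_b.append((index, picture_type))
        else:
            # An I or P picture is produced first, then the waiting B pictures.
            coding_order.append((index, picture_type))
            coding_order.extend(pending_b)
            pending_b = []
    # Trailing B pictures with no later reference in this list are kept last.
    coding_order.extend(pending_b)
    return coding_order
```

For the display order I B B P B B P, the function returns the pictures in the order I(0), P(3), B(1), B(2), P(6), B(4), B(5), so that both reference pictures of each B picture are already available when it is produced.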
MPEG-1 is designed mainly for storage media such as video CD (Compact Disc) or CD-ROM (CD Read Only Memory). MPEG-2, on the other hand, is designed not only for storage media such as video CD, CD-ROM, DVD (Digital Video Disk) and VTR (Video Tape Recorder) but also for transmission media in general, including communication media such as LAN (Local Area Network) and broadcast media such as terrestrial broadcast, satellite broadcast and CATV (Community Antenna Television).
The core of the technology used in the MPEG video part lies in motion compensated prediction (MC) and the discrete cosine transform (DCT). The coding technique combining MC and DCT is called the hybrid coding technique. The DCT (also referred to as an FDCT (forward DCT)) is utilized in the MPEG video part at the time of coding, so that the video signals of the images are decomposed into frequency components for processing. Thereafter, at the time of decoding, the frequency components are restored to video signals by using an inverse discrete cosine transform (inverse DCT or IDCT).
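The round trip from video samples to frequency components and back can be illustrated with a small forward and inverse DCT. The sketch below is a plain, unoptimized implementation under stated assumptions: an 8x8 block size, the orthonormal DCT-II scaling, and a separable row-then-column evaluation. The function names are hypothetical, and quantization, which a real coder applies between the two transforms, is omitted.

```python
import math


def _scale(k, n):
    # Orthonormal DCT-II scale factors (an assumption of this sketch).
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct_1d(samples):
    """Forward DCT of a one-dimensional list of samples."""
    n = len(samples)
    return [
        _scale(k, n) * sum(
            samples[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for i in range(n)
        )
        for k in range(n)
    ]


def idct_1d(coeffs):
    """Inverse DCT restoring samples from frequency components."""
    n = len(coeffs)
    return [
        sum(
            _scale(k, n) * coeffs[k]
            * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for k in range(n)
        )
        for i in range(n)
    ]


def _transpose(matrix):
    return [list(row) for row in zip(*matrix)]


def dct_2d(block):
    """Forward DCT of a square block: rows first, then columns."""
    rows = [dct_1d(row) for row in block]
    return _transpose([dct_1d(col) for col in _transpose(rows)])


def idct_2d(coeffs):
    """Inverse DCT of a square block: rows first, then columns."""
    rows = [idct_1d(row) for row in coeffs]
    return _transpose([idct_1d(col) for col in _transpose(rows)])
```

Applying dct_2d to a flat block in which every sample is equal concentrates the whole signal in the single lowest-frequency component, and idct_2d restores the original samples to within floating-point error.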
MPEG can process a vast amount of information at high speed, and it uses the compression technique called inter-frame prediction as described above. Consequently, it is extremely difficult to reverse-reproduce, for the purpose of a picture search, a data sequence that has been coded and recorded in time series according to MPEG. That is, unlike in the usual video tape recorder, it is extremely difficult to reproduce the recorded data sequence in the reverse direction by simply going back along the time axis. Conventionally, therefore, only the I picture allotted in each GOP is reproduced while going back along the time axis. Since the I picture is an image produced by the intra-frame coding processing as described above, it can be displayed independently without referring to the pictures before and/or after it.
In the conventional examples, however, the number of I pictures allotted to each GOP is very small. For example, at most one of the pictures constituting a GOP is an I picture, so that only one picture per 15 to 30 frames is reverse-reproduced. As a result, a smooth reverse-reproduced picture such as that of the usual video tape recorder is not obtained, and it is difficult to stop at a desired scene at the proper timing.
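The coarseness of the conventional reverse reproduction can be made concrete with a short sketch that selects only the I pictures while going back along the time axis and reports the time skipped between successive displayed pictures. The function names, the example GOP pattern and the default frame rate of 30 frames per second are assumptions chosen for illustration.

```python
def reverse_i_picture_indices(picture_types):
    """Display indices of the I pictures, visited from last to first."""
    return [
        index
        for index in range(len(picture_types) - 1, -1, -1)
        if picture_types[index] == "I"
    ]


def seconds_between(indices, frame_rate=30.0):
    """Time skipped between successive pictures shown in reverse."""
    return [
        (later - earlier) / frame_rate
        for later, earlier in zip(indices, indices[1:])
    ]


# Hypothetical stream: three GOPs of 15 pictures with one I picture each.
example_gop = list("IBBPBBPBBPBBPBB")
example_stream = example_gop * 3
example_indices = reverse_i_picture_indices(example_stream)
example_gaps = seconds_between(example_indices)
```

With this example stream only pictures 30, 15 and 0 are displayed, each half a second apart in the recorded material, which illustrates why the reverse-reproduced picture is not smooth and why stopping at a desired scene is difficult.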