MPEG-1, MPEG-2 and similar standards are known for coding moving picture data with high efficiency. In MPEG-1 and MPEG-2, moving picture data is coded by an intra-picture coding method, a forward predictive coding method or a bidirectional predictive coding method, selected as appropriate.
When coded by such a moving picture coding technique, a moving picture is often a mixture of pictures that have been compressed and coded (compression-coded) by the intra-picture coding method (which will be referred to herein as “I-pictures”), pictures that have been compression-coded by the forward predictive coding method (which will be referred to herein as “P-pictures”) and pictures that have been compression-coded by the bidirectional predictive coding method (which will be referred to herein as “B-pictures”). An I-picture is coded using only the data in the picture itself, without any temporal prediction. A P-picture is predictively coded by reference to an I- or P-picture located before the P-picture. A B-picture is predictively coded by reference to I- or P-pictures located before and after the B-picture. The pictures referred to are called “reference pictures”. The reference pictures for use in prediction are determined according to the type of the given picture.
FIG. 1 shows a moving picture data prediction scheme using bidirectional prediction. In FIG. 1, I, P and B denote an I-picture, P-pictures and B-pictures, respectively. In the prediction scheme shown in FIG. 1, the order of coding is I1, P4, B2, B3, P7, B5 and B6. In FIG. 1, the picture I1 is intra-picture coded. The picture P4 is forward predictive coded by reference to the picture I1. Each of the pictures B2 and B3 is bidirectionally predictive coded by reference to the pictures I1 and P4. The picture P7 is forward predictive coded by reference to the picture P4. Each of the pictures B5 and B6 is bidirectionally predictive coded by reference to the pictures P4 and P7.
The I-, P- and B-pictures are usually arranged periodically. FIG. 2 shows an arrangement of I-pictures, P-pictures and B-pictures. In general, I-pictures are arranged every N frames, P-pictures are arranged every M frames between two I-pictures, and (M-1) B-pictures are provided between the I-picture and the following P-picture and between each P-picture and the following P-picture. FIGS. 3(a), 3(b) and 3(c) show the correspondence between the order in which the respective types of pictures of the moving picture data are input and the order in which those pictures are coded in three situations where M=1, M=2 and M=3, respectively.
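The periodic arrangement described above can be illustrated with a minimal sketch. The function name picture_types and its parameters are hypothetical and not part of any standard; the sketch assumes that the sequence starts with an I-picture in input order and that N is a multiple of M.

```python
def picture_types(num_frames, n, m):
    """Return picture types in input (display) order.

    Assumptions (illustrative only): frame 0 is an I-picture,
    n is a multiple of m, and every frame that is neither an
    I-picture nor a P-picture is a B-picture.
    """
    if n % m != 0:
        raise ValueError("this sketch assumes n is a multiple of m")
    types = []
    for i in range(num_frames):
        if i % n == 0:
            types.append("I")   # one I-picture every n frames
        elif i % m == 0:
            types.append("P")   # one reference picture every m frames
        else:
            types.append("B")   # (m - 1) B-pictures between references
    return types


if __name__ == "__main__":
    print(picture_types(7, 9, 3))
    print(picture_types(4, 4, 1))
```

With M=3 the first seven pictures follow the I, B, B, P, B, B, P arrangement of FIG. 1, and with M=1 no B-pictures appear.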
As shown in FIG. 3(a), if M=1, then the moving picture consists of only I- and P-pictures and includes no B-pictures. Accordingly, the respective pictures in the incoming moving picture are coded in their input order, and no processing delay is caused by the coding process. However, if M=2, then one B-picture is present between the I-picture (or a P-picture) and the next P-picture as shown in FIG. 3(b). In that case, a processing delay of one frame is caused before each B-picture starts being coded. This is because no B-picture can start being coded until its reference pictures (i.e., the I- or P-pictures located before and after the B-picture) have been coded. Accordingly, the B-picture must be coded in a different order from that of the incoming pictures.
If M=3, then two B-pictures are present between the I-picture (or a P-picture) and the next P-picture as shown in FIG. 3(c). In that case, for the same reason as that described for FIG. 3(b), a delay of two frames is caused before each B-picture starts being coded.
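The reordering and the resulting delay can be sketched as follows. This is an illustrative model only: the function names are hypothetical, picture labels such as "B2" are assumed to begin with the picture type, and any B-pictures left at the end of the sequence without a following reference picture are simply appended.

```python
def coding_order(display):
    """Reorder pictures from input order to coding order.

    Each B-picture is held back until the next I- or P-picture
    (its backward reference) has been emitted.
    """
    coded, pending_b = [], []
    for label in display:
        if label.startswith("B"):
            pending_b.append(label)      # wait for the next reference
        else:
            coded.append(label)          # reference picture is coded first
            coded.extend(pending_b)      # then the held B-pictures
            pending_b.clear()
    coded.extend(pending_b)              # trailing B-pictures (sketch only)
    return coded


def reorder_delay(display):
    """Return the longest run of consecutive B-pictures, in frames."""
    longest = run = 0
    for label in display:
        run = run + 1 if label.startswith("B") else 0
        longest = max(longest, run)
    return longest


if __name__ == "__main__":
    m3 = ["I1", "B2", "B3", "P4", "B5", "B6", "P7"]
    print(coding_order(m3))
    print(reorder_delay(m3))
```

For the M=3 input order I1, B2, B3, P4, B5, B6, P7, the sketch yields the coding order I1, P4, B2, B3, P7, B5, B6 given for FIG. 1 and a delay of two frames.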
B-pictures are used because the efficiency of prediction can be increased by adopting the bidirectional prediction technique, which combines forward and backward predictions. Also, unlike an I- or P-picture, a B-picture is never used as a reference picture in subsequent predictive coding. Thus, an error introduced in coding a B-picture does not propagate to other pictures. Accordingly, even if a B-picture is coded with a smaller amount of data than an I- or P-picture, the deterioration in image quality should be much less visible. However, when B-pictures are used, the reference picture interval M for the P-pictures to be forward predicted increases due to the insertion of the B-pictures. Accordingly, the prediction tends to be inaccurate, particularly for a moving picture with swift motion.
In view of these considerations, the coding efficiency can be improved by dynamically switching the reference picture interval M for forward prediction in accordance with the property of the given moving picture data.
Conventional techniques of coding with the reference picture interval M for forward prediction switched dynamically are disclosed in Japanese Laid-Open Publication No. 9-294266, Japanese Laid-Open Publication No. 10-304374, and Japanese Laid-Open Publication No. 2001-128179, for example.
Japanese Laid-Open Publication No. 9-294266 describes a technique of scaling a motion vector of a coded frame and controlling the reference picture interval M such that the magnitude of the scaled vector falls within the motion search range of the frame to be coded next.
Japanese Laid-Open Publication No. 10-304374 describes a technique of estimating the inter-frame prediction efficiency by using a prediction error or activity obtained from a coded block and controlling the reference picture interval M according to this prediction efficiency.
Japanese Laid-Open Publication No. 2001-128179 describes a technique of estimating the inter-frame predictability by using the coding rates or complexity of coding of the respective types of pictures and controlling the reference picture interval M according to this predictability.
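As a rough illustration of the first of these approaches (scaling a motion vector so that it stays within the search range), the following sketch shows one possible reading of the idea. It is not the procedure of the cited publication: the function name choose_m, the assumption of linear motion, the use of the larger vector component as the magnitude, and the candidate range of M are all assumptions made here.

```python
def choose_m(motion_vector, current_m, search_range, max_m=3):
    """Pick the largest interval M whose scaled motion fits the search range.

    motion_vector : (dx, dy) measured over the current interval current_m
    search_range  : maximum displacement, per component, that the motion
                    search can cover (hypothetical units: pixels)
    """
    dx, dy = motion_vector
    # Assume linear motion: displacement grows in proportion to the interval.
    per_frame = max(abs(dx), abs(dy)) / current_m
    for m in range(max_m, 0, -1):
        if per_frame * m <= search_range:
            return m
    return 1  # fall back to the shortest interval


if __name__ == "__main__":
    print(choose_m((12, 3), current_m=3, search_range=16))
    print(choose_m((30, 0), current_m=3, search_range=16))
```

Under these assumptions, slow motion keeps a long interval M, while fast motion shortens it so that the expected displacement remains searchable.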
In an interlaced moving picture in which one frame picture is composed of two field pictures, the reference pictures can be switched and the coding efficiency can be increased not just by switching the reference picture interval M but also by changing the picture structure. The picture structure is the unit of coding, and either a frame structure or a field structure may be selected for each picture to be coded. Specifically, if the frame structure is selected as the picture structure, the coding is carried out on a frame-by-frame basis. On the other hand, if the field structure is selected, then the coding is carried out on the first and second field pictures that make up one frame.
In the following description, a field picture to be intra-picture coded will be referred to as an “I-field”, a field picture to be forward predictive coded will be referred to as a “P-field”, and a field picture to be bidirectionally predictive coded will be referred to as a “B-field”. Also, according to the type of the first field picture, a frame of which the first field is an I-field will be referred to as an “I-frame”, a frame of which the first field is a P-field will be referred to as a “P-frame”, and a frame of which the first field is a B-field will be referred to as a “B-frame”.
FIGS. 4(a), 4(b) and 4(c) show relationships between picture types and reference pictures according to the field structure. Specifically, FIG. 4(a) shows I-frames, FIG. 4(b) shows a P-frame, and FIG. 4(c) shows a B-frame. For the I-frame shown in FIG. 4(a), either a type of which the first and second field pictures are both I-fields or a type of which the first and second field pictures are an I-field and a P-field, respectively, is selected. If the second field picture is a P-field, then the first field picture of the same frame is used as the reference picture. In the P-frame shown in FIG. 4(b), the first field picture uses the previously coded I- or P-field as the reference picture for the predictive coding. On the other hand, the second field picture may use the first field picture of the same frame (i.e., the previous field picture) as the reference picture. As a result, the reference picture interval for the second field picture is just one field, which increases the prediction efficiency significantly, particularly for a picture with fast motion. In the B-frame shown in FIG. 4(c), each of the first and second field pictures uses the I- or P-fields of the preceding and succeeding frames as the reference pictures for the predictive coding.
Recently, there has been high demand for a technique that realizes compression coding of good quality more efficiently even if the moving picture has particularly fast motion. To meet this demand, the motion velocity of the moving picture needs to be judged as high or low, and the coding control needs to be improved so as to maintain good quality while reducing the data size. The prior art has not satisfied either of these requirements sufficiently.
An object of the present invention is to estimate the motion velocity of a moving picture more accurately when compression-coding moving picture data, and to realize highly efficient compression coding of good quality, even if a picture with fast motion is included, by dynamically switching the coding methods and coding units.