1. Field of the Invention
The present invention relates to a video coding apparatus and method for encoding moving pictures in a compressed format according to an international standard such as ISO/IEC 13818-2, known as MPEG-2.
2. Description of the Related Art
The MPEG-2 standard defines three picture types: intra-coded pictures (I-pictures), predictive-coded pictures (P-pictures) and bidirectionally predictive-coded pictures (B-pictures). I-pictures are coded in such a way that they can be decoded without reference to any other picture in a video sequence. The first picture in a group of pictures is always an I-picture and provides key information for pictures that follow. P-pictures are forward predictive coded, i.e., coded by using information from a reference picture displayed earlier, which may be either an I-picture or a P-picture. B-pictures are forward-and-backward predictive coded, i.e., coded by using information both from a picture displayed earlier and from a picture displayed later. These three picture types occur cyclically in a predetermined pattern. According to current practice, I-pictures occur at intervals represented by an integer N, and the interval between an I-picture and the following P-picture, or between successive P-pictures, is represented by an integer M. Both integers are fixed values, and the video sequence is controlled so that they remain constant.
As shown in FIG. 1A, when M=1, an I-picture is followed by a sequence of P-pictures, and each P-picture is coded by using information from the picture that immediately precedes it. For M=2 (FIG. 1B), the interval between an I-picture and a P-picture is equal to “2” and one B-picture comes in between. In this case, each P-picture is coded by using information from the picture that precedes it by two picture intervals, and each B-picture is coded by using information from two pictures, one immediately preceding it and the other immediately succeeding it. For M=3 (FIG. 1C), the interval between an I-picture and a P-picture is equal to “3” and two B-pictures fill the interval. In this case, each P-picture is coded by using information from the picture that precedes it by three picture intervals. The first of the two B-pictures is coded by using information from the immediately preceding picture and from the picture that follows it by two picture intervals; the second is coded by using information from the picture that precedes it by two picture intervals and from the immediately following picture. Thus, for M≧2, the number of B-pictures that come in between non-B-pictures is equal to M−1.
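The patterns described for M=1, 2 and 3 can be reproduced with a short sketch. The function names, the display-order string representation and the assumption that N is a multiple of M are illustrative choices and are not taken from the standard.

```python
def picture_types(n, m, count):
    """Return the display-order picture types for `count` pictures.

    n: interval between I-pictures (assumed here to be a multiple of m)
    m: interval between successive non-B-pictures (I or P)
    """
    types = []
    for i in range(count):
        pos = i % n              # position inside the current group of pictures
        if pos == 0:
            types.append("I")
        elif pos % m == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)


def b_reference_distances(m):
    """For each of the m-1 B-pictures between two non-B-pictures, return
    (distance to the preceding reference, distance to the following reference)."""
    return [(k, m - k) for k in range(1, m)]


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, picture_types(6, m, 7), b_reference_distances(m))
```

For M=3 the second function yields the distances (1, 2) and (2, 1), which correspond to the two B-pictures of FIG. 1C as described above.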
One reason for using the B-picture is to reduce the amount of redundant video information inherently contained in the original frames. For a given quantization scale, the use of B-pictures can reduce the number of codes with which original pictures are encoded. Hence, the picture quality can be improved for a given compression (coding) rate. Another reason for using the B-picture is its tendency to cancel the accumulated error that results from continued predictive coding using information only from previous pictures of “parent generations”, which themselves were predicted from reference pictures of “grandparent generations”. If unidirectional (forward) predictive coding were used exclusively, the number of predictive-coded “generations” would increase rapidly with time and quantization errors would accumulate significantly. B-pictures present a solution to this problem.
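The growth of prediction "generations" can be illustrated by counting the forward-prediction steps between the I-picture and the last P-picture of a group. This is a minimal sketch; it assumes that only I- and P-pictures serve as reference pictures, as in MPEG-2, and the function name and the example value N=15 are hypothetical.

```python
def deepest_generation(n, m):
    """Number of chained forward predictions from the I-picture to the last
    P-picture in a group of n pictures with reference interval m.

    B-pictures are not counted because they are assumed not to be used as
    references, so they do not pass their errors on to other pictures.
    """
    return (n - 1) // m


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(f"N=15, M={m}: {deepest_generation(15, m)} generations")
```

With N=15, the chain is 14 generations long for M=1 but only 4 generations long for M=3, which is one way to picture why B-pictures limit the accumulation of quantization errors.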
Although the B-picture provides these benefits, the use of many B-pictures (with the resultant increase in the M-value) is disadvantageous for fast-moving pictures, since it becomes difficult to search for motion vectors within a range that is considered appropriate. Consider, for example, an object moving at a constant velocity. Since the amount of motion per frame is constant, the displacement of the object between a picture and its reference picture increases in proportion to the M-value. To find motion vectors accurately, it would be necessary to perform a vector search over a range that widens in proportion to the M-value.
One prior art approach involves setting a maximum value of per-frame motion and then searching for motion vectors over a range that is M times the maximum value. However, a significant amount of hardware is necessary to implement this approach. Although the hardware problem can be avoided by the use of an algorithm that simplifies the motion vector search, this is achieved only at the cost of search precision, and poor picture quality results.
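The hardware burden of a range that is M times the per-frame maximum can be estimated with a minimal sketch. It assumes an exhaustive (full) search over a square window with integer-pixel steps, which the text above does not specify; the function name and the per-frame maximum of 16 pixels are hypothetical.

```python
def search_cost(max_motion_per_frame, m):
    """Return (search range in pixels, candidate positions per block).

    The range is m times the per-frame maximum, as in the approach above.
    The candidate count assumes a full search over a square window of
    +/- range pixels in each direction (an assumption of this sketch).
    """
    search_range = m * max_motion_per_frame
    candidates = (2 * search_range + 1) ** 2
    return search_range, candidates


if __name__ == "__main__":
    for m in (1, 2, 3):
        rng, cand = search_cost(16, m)
        print(f"M={m}: range = +/-{rng} pixels, {cand} candidates per block")
```

Under these assumptions the range grows linearly with M while the number of candidate positions grows roughly with the square of M, which suggests why a simplified search algorithm is attractive despite its loss of precision.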
Another prior art approach is disclosed in Japanese Laid-Open Patent Application 9-294266. According to this technique, a distribution of motion vectors and a differential value of inter-frame predictions are detected. The M-value is increased according to the detected distribution and is decreased according to the detected differential value. For example, if a motion-vector search is being performed on a current P-picture using M=2 over a given range and most of the motion vectors are found to lie within that range, then the M-value is incremented to 3 and the picture that is three frame intervals after the current P-picture is determined as the next P-picture. Otherwise, the M-value remains unchanged and the picture that is two frame intervals after the current P-picture is determined as the next P-picture. If the detected differential value of inter-frame predictions exceeds some threshold, the M-value is decremented to 1 and the picture that is one frame interval after the current P-picture is determined as the next P-picture. However, it is established that, in most cases, the distribution of motion vectors is isotropic about an average vector and its spread (variance) varies depending on the strength of auto-correlation of motions. Therefore, the statistics of the motion vectors cannot be estimated from the number of motion vectors which exist in a search range and exceed a threshold value. If motion vectors have a large mean value in the neighborhood of a threshold, within a given range that is considered sufficient for a search regardless of their variance, narrowing the search range would cause a significant degradation of picture quality. If the distribution of motion vectors is used to decide whether the search range is adequate, and if the decision algorithm is based solely on the motion vector distribution approaching the zero vector point, the decision is likely to be made in favor of the adequacy of the search range.
If the distribution moves away from the zero vector point immediately afterwards, the search range may turn out to be insufficient. Several frames would then be needed to readjust the interval between successive P-pictures, and this delayed response degrades picture quality.
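The M-value control attributed above to Japanese Laid-Open Patent Application 9-294266 can be summarized as a decision rule. This is only a sketch of the behavior as described here, not the method of that application: the function name, the threshold values, the upper limit on M and the order in which the two conditions are tested are all assumptions.

```python
def next_m(m, in_range_fraction, prediction_diff,
           fraction_threshold=0.9, diff_threshold=1000.0, m_max=3):
    """Choose the interval to the next P-picture.

    m:                 current M-value
    in_range_fraction: fraction of motion vectors found inside the search range
    prediction_diff:   detected differential value of inter-frame predictions
    All thresholds and m_max are hypothetical values chosen for illustration.
    """
    if prediction_diff > diff_threshold:
        return 1                      # large prediction difference: fall back to M=1
    if in_range_fraction >= fraction_threshold and m < m_max:
        return m + 1                  # most vectors lie within the range: lengthen the interval
    return m                          # otherwise leave the M-value unchanged


if __name__ == "__main__":
    print(next_m(2, 0.95, 10.0))      # most vectors in range
    print(next_m(2, 0.50, 10.0))      # vectors spread beyond the range
    print(next_m(3, 0.95, 5000.0))    # prediction difference above threshold
```

Because the rule sees only a count of in-range vectors, it cannot distinguish a distribution tightly clustered near zero from one whose mean lies close to the edge of the range, which is the weakness discussed above.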