Recently, as multimedia applications have broadened, it has become increasingly common to convert information on every type of medium, including image, audio and text data, into digital data and process it collectively. Among these, digital image data (digital moving picture data in particular) is so large that an encoding technique that encodes moving picture data highly efficiently is required to store and transmit it. Known encoding schemes developed for the purpose of encoding moving pictures include MPEG-1, MPEG-2 and MPEG-4, defined by the International Organization for Standardization (ISO).
According to each of these encoding schemes, the input picture data is divided into a number of blocks, and each of those blocks can be encoded selectively by either intra coding (i.e., an encoding mode with no motion compensation) or inter-picture predictive coding (i.e., an encoding mode with motion compensation). Of these two techniques, the motion compensated prediction of the inter-picture predictive coding has a plurality of prediction modes associated with various types of motion vectors for use in the motion compensation. In encoding, an appropriate one of the prediction modes is selected to encode each block.
The prediction modes are roughly classified into the three types of forward predictive coding, backward predictive coding and bidirectional predictive coding according to the prediction direction. And each of these three coding modes is further classifiable into frame predictive coding to be performed on the entire frame of the block to be encoded, field predictive coding to be performed on a first type of (e.g., odd-numbered) fields, and field predictive coding to be performed on a second type of (e.g., even-numbered) fields. That is to say, there can be six prediction modes in all.
FIG. 1 shows a configuration for a conventional moving picture encoder 100, which includes an input picture memory 101, a subtracting section 102, an orthogonal transformation section 103, a quantization section 104, a variable-length encoding section 105, an inverse quantization section 106, an inverse orthogonal transformation section 107, an adding section 108, a reference picture memory 109, a motion detection section 110, a prediction mode determining section 111 and a motion compensation section 112.
The prediction mode for use in the motion compensated prediction of the inter-picture predictive coding is determined mainly by the motion detection section 110 and the prediction mode determining section 111.
The motion detection section 110 detects a similar block region from a specified range of the reference picture data, which is stored in the reference picture memory 109, with respect to the block-by-block input picture data that has been supplied from the input picture memory 101 and outputs a motion vector representing the magnitude of motion and the accumulated error representing the degree of similarity between the blocks. If the six types of prediction modes are all available, motion detection is carried out on the reference picture data that is specified for each of those prediction modes.
The prediction mode determining section 111 determines the prediction mode for use in the motion compensation section 112 based on the accumulated error associated with the motion vector of each prediction mode that has been detected by the motion detection section 110.
Hereinafter, a conventional method of determining the prediction mode will be described. To detect a block similar to the block to be encoded for a certain prediction mode, the motion detection section 110 calculates the total sum of the absolute values of differences between the block to be encoded and the reference picture block at respective pixel locations as an accumulated error and uses it as an estimated value. Then, the motion detection section 110 uses the magnitude of motion to the location where the accumulated error becomes minimum as the motion vector of that prediction mode and outputs the motion vector along with the accumulated error.
For example, the accumulated error AE(i, j) of one-direction frame prediction, which is one of the prediction modes according to the MPEG standards and a generic term covering forward prediction and backward prediction, may be calculated by the following Equation (1):
AE(i, j) = Σ |Y(x, y) − RefY(i+x, j+y)|  (1)
In Equation (1), Y(x, y) represents the pixel value at a location (x, y) in the block to be encoded in the input picture data, RefY(i+x, j+y) represents the pixel value at a location (i+x, j+y) within a search range of the reference picture data, and AE(i, j) represents the accumulated error in inter-block matching. The sum is taken over all pixel locations (x, y) in the block. The motion vector is defined as the magnitude of motion to the location (i, j) where the accumulated error AE(i, j) is minimized. FIG. 2 shows a relation between the block to be encoded Y and the block at a location (i, j) where the accumulated error AE(i, j) is minimized in reference picture data RefY. The vector directed from a reference point (0, 0) to (i, j) is the motion vector.
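The block matching described above may be sketched as follows. This is only an illustrative full search under stated assumptions, not the implementation of the motion detection section 110: the function names, the origin parameter (the position of the reference point (0, 0) within the reference picture), the square search range and the small test picture are all hypothetical.

```python
def accumulated_error(block, ref, origin, i, j):
    """Equation (1): sum of |Y(x, y) - RefY(i + x, j + y)| over the block.

    `origin` is the assumed (x, y) position of the reference point (0, 0)
    in the reference picture, so (i, j) is a displacement from that point.
    """
    ox, oy = origin
    return sum(
        abs(block[y][x] - ref[oy + j + y][ox + i + x])
        for y in range(len(block))
        for x in range(len(block[0]))
    )


def full_search(block, ref, origin, search_range):
    """Return ((i, j), AE(i, j)) for the displacement with the minimum error."""
    ox, oy = origin
    bh, bw = len(block), len(block[0])
    best = None
    for j in range(-search_range, search_range + 1):
        for i in range(-search_range, search_range + 1):
            x0, y0 = ox + i, oy + j
            # Skip candidates that would fall outside the reference picture.
            if x0 < 0 or y0 < 0 or x0 + bw > len(ref[0]) or y0 + bh > len(ref):
                continue
            err = accumulated_error(block, ref, origin, i, j)
            if best is None or err < best[1]:
                best = ((i, j), err)
    return best


# Hypothetical 6x6 reference picture and a 2x2 block copied from (x=3, y=2).
ref = [[10 * y + x for x in range(6)] for y in range(6)]
block = [row[3:5] for row in ref[2:4]]
motion_vector, min_error = full_search(block, ref, origin=(2, 2), search_range=2)
print(motion_vector, min_error)  # (1, 0) 0
```

In this sketch the block is displaced by one pixel to the right of the reference point, so the search returns the motion vector (1, 0) with an accumulated error of zero.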
The prediction mode determining section 111 compares the accumulated errors AE of the respective prediction modes that have been supplied from the motion detection section 110, selects the prediction mode in which the accumulated error AE is the minimum, and determines it as the prediction mode for use in motion compensation. This method is based on the common conception that the smaller the accumulated error, the smaller both the distortion caused by encoding and the generated code size would normally be.
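The minimum-error selection described above may be sketched as follows. The mode names, motion vectors and error values are hypothetical and serve only to illustrate the comparison; they are not taken from any actual encoder.

```python
def select_prediction_mode(results):
    """Pick the mode with the smallest accumulated error.

    `results` maps a mode name to (motion_vectors, accumulated_error).
    """
    mode = min(results, key=lambda name: results[name][1])
    motion_vectors, error = results[mode]
    return mode, motion_vectors, error


# Hypothetical per-mode output of the motion detection stage.
results = {
    "forward_frame": ([(1, 0)], 412),
    "backward_frame": ([(-2, 1)], 530),
    "bidirectional_frame": ([(1, 0), (-2, 1)], 388),
}
selected = select_prediction_mode(results)
print(selected)  # ('bidirectional_frame', [(1, 0), (-2, 1)], 388)
```

Note that this criterion looks only at the accumulated error. It does not account for the code size of the motion vectors themselves, which in this hypothetical example is larger for the bidirectional mode because two vectors have to be encoded.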
Various other methods for determining a prediction mode have been proposed. For example, Patent Document No. 1 discloses a method in which the bidirectional prediction mode is prohibited and only the forward or backward prediction mode is selected if the ratio of the target bit rate for encoding to the size of the input picture (i.e., the number of pixels) becomes equal to or greater than a predetermined threshold value.
The method of Patent Document No. 1 is proposed to overcome the problem that, if the target bit rate is low, the motion vectors take up too large a share of the code size and too little is left for the predicted error picture data, so that image quality eventually deteriorates. As the code size in the bidirectional prediction mode should be greater than that in the forward or backward prediction mode, the bidirectional prediction mode is prohibited according to this method if the ratio described above becomes equal to or greater than a predetermined threshold value.
Examples of main pieces of block-by-block information contained in an MPEG encoded bit stream include a piece of information indicating whether intra coding or inter-picture predictive coding should be carried out, the prediction mode and motion vector for the inter-picture predictive coding, encoded picture data (i.e., input picture data in cases of intra coding and encoded data for predicted error picture data in cases of inter-picture predictive coding) and quantization scale. Among these pieces of information, the encoded picture data generated by quantization can have its code size controlled dynamically by changing the settings of the quantization scale.
As this method is irreversible compression involving quantization, however, it is important to reduce the encoding noise while cutting down the generated code size. Generally speaking, if encoding is performed at a target bit rate that is sufficiently high for the picture size of the input picture data, the share of the overall encoded data taken up by the motion vectors is low. That is why, in such a case, the generation of encoding noise can be minimized by considering only the quantization distortion and the generated code size of the predicted error picture data.
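The trade-off between generated code size and encoding noise may be illustrated with the following simplified sketch. It is not the quantizer defined by the MPEG standards: it assumes a plain uniform quantizer that truncates toward zero, uses the number of nonzero quantized levels as a crude stand-in for code size, and operates on hypothetical coefficient values.

```python
def quantize(coeffs, scale):
    """Divide each coefficient by the scale, truncating toward zero."""
    return [(abs(c) // scale) * (1 if c >= 0 else -1) for c in coeffs]


def dequantize(levels, scale):
    """Reconstruct coefficients from quantized levels."""
    return [level * scale for level in levels]


def summarize(coeffs, scale):
    """Return (nonzero level count, total absolute reconstruction error)."""
    levels = quantize(coeffs, scale)
    recon = dequantize(levels, scale)
    nonzero = sum(1 for level in levels if level != 0)
    distortion = sum(abs(c - r) for c, r in zip(coeffs, recon))
    return nonzero, distortion


# Hypothetical transform coefficients of one predicted error block.
coeffs = [120, -45, 18, -7, 3, 0, -2, 1]
for scale in (4, 16):
    print(scale, summarize(coeffs, scale))
# 4 (4, 12)
# 16 (3, 36)
```

With the larger quantization scale, fewer nonzero levels remain to be encoded, but the reconstruction error grows, which corresponds to the irreversible nature of the compression noted above.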
Patent Document No. 2 discloses another exemplary method for determining the prediction mode. Specifically, according to the method of Patent Document No. 2, a table of code sizes for respective elements of predicted error picture data and motion vectors is drawn up in advance, the code size, as well as the predicted error picture data and the motion vector, is calculated in every prediction mode, and a prediction mode that would have an appropriate code size is selected. According to this method, the prediction mode can be selected not just when the target bit rate is particularly low.
Patent Document No. 1: Japanese Patent Application Laid-Open Publication No. 2000-13802
Patent Document No. 2: Japanese Patent Application Laid-Open Publication No. 9-322176