The present invention relates to a video compression and encoding technology, such as an MPEG scheme, utilized in a video transmission system and a video database system through the Internet or the like. The present invention particularly relates to a video encoding method and a video encoding apparatus capable of providing, for each scene, a unified decoded video which is easy to view, without increasing data size, by encoding data in accordance with encoding parameters based on the content of the scenes.
The MPEG scheme, which is an international standard for video encoding, is, as is well known, a technique for compressing a video by a combination of motion-compensated prediction, discrete cosine transform and variable length coding. The MPEG scheme is described in detail in, for example, Reference 1: “MPEG”, The Institute of Television Engineers edition, Ohmsha, Ltd.
In a conventional video encoding apparatus based on the MPEG scheme, compressed video data is transmitted over a transmission line the transmission rate of which is specified, or recorded on a storage medium the recording capacity of which is limited. Owing to this, a process referred to as rate control is performed, in which encoding parameters, such as a frame rate and a quantization width, are set and encoding is conducted so that the bit rate of an outputted encoded bit stream becomes a designated value. In conventional rate control, a method of determining the frame rate according to the number of bits generated as a result of encoding a previous frame with a fixed quantization width has often been adopted.
Conventionally, a frame rate is determined based on the difference (margin) between a present buffer capacity (here, the amount of data currently held in the buffer) and a frame skip threshold preset according to the size of the buffer in which an encoded bit stream is temporarily stored. If the buffer capacity is lower than the threshold, data is encoded at a fixed frame rate. If the buffer capacity is higher than the threshold, frame skipping is conducted to decrease the frame rate.
With this method, however, if the number of coded bits generated in a previous frame is large, frame skipping is conducted until the buffer capacity becomes not more than the frame skip threshold. Due to this, the interval between that frame and the next encoded frame becomes too long, with the result that the video disadvantageously appears unnatural.
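By way of illustration only, the conventional threshold-based frame skipping described above may be sketched as follows. The function name, the constant per-frame drain of the buffer, and all numerical values are assumptions introduced for this sketch and are not taken from any particular encoder. The sketch shows how a single frame generating a large number of bits causes several consecutive frames to be skipped.

```python
def simulate_frame_skipping(frame_bits, drain_per_frame, skip_threshold):
    """Return the indices of the frames that are actually encoded.

    frame_bits      -- bits each frame would generate if encoded (assumed known)
    drain_per_frame -- bits removed from the buffer per frame interval
                       (assumed constant, i.e. a fixed-rate channel)
    skip_threshold  -- frame skip threshold on the amount of buffered data
    """
    buffer_level = 0  # amount of data currently held in the buffer
    encoded = []
    for index, bits in enumerate(frame_bits):
        if buffer_level <= skip_threshold:
            # Not more than the threshold: encode the frame.
            buffer_level += bits
            encoded.append(index)
        # Otherwise the frame is skipped and nothing is added to the buffer.
        # The channel drains the buffer once per frame interval either way.
        buffer_level = max(0, buffer_level - drain_per_frame)
    return encoded


def longest_gap(encoded):
    """Largest distance, in frame intervals, between consecutive encoded frames."""
    return max(b - a for a, b in zip(encoded, encoded[1:]))


# Hypothetical input: frame 2 generates six times as many bits as the others.
bits = [2000, 2000, 12000, 2000, 2000, 2000, 2000, 2000]
kept = simulate_frame_skipping(bits, drain_per_frame=2000, skip_threshold=4000)
print(kept)               # [0, 1, 2, 6, 7]
print(longest_gap(kept))  # 4
```

In this hypothetical run, frames 3 to 5 are skipped after the large frame 2, so the decoded video jumps four frame intervals at once, which corresponds to the unnatural motion noted above.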
That is, according to the conventional rate control, the frame rate and the quantization width are basically set irrespective of the content of a video. For that reason, the frame rate becomes too low in a scene of the video in which an object moves actively, and the motion of the object becomes unnatural. Besides, due to an inappropriate quantization width, the picture may be distorted, which disadvantageously makes the picture difficult to recognize visually.
In the meantime, there is also known a rate control method based on a technique referred to as two-pass encoding. This technique is described in, for example, Reference 2: Japanese Patent Unexamined Application Publication No. 10-336675. As described in Reference 2, a video file is encoded twice: the overall characteristics of the video file are analyzed by the first encoding, the second encoding is conducted by setting appropriate encoding parameters based on the analysis result, and an encoded bit stream obtained as a result of the second encoding is transmitted or recorded. The two-pass encoding, however, has the same problems as those described above, since the encoding parameters are conventionally set basically irrespective of the content of a video.
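The general idea of two-pass encoding may be sketched as follows. This is a toy model and not the method of Reference 2: it assumes that the number of generated bits is inversely proportional to the quantization width, and the function names and the per-frame "complexity" values are hypothetical. The first pass measures the bits generated with a trial quantization width, and the second pass uses a single quantization width chosen so that the total meets a target.

```python
def encode_pass(frame_complexities, quantization_width):
    """Toy encoder model: bits per frame = complexity / quantization width.

    This inverse-proportional relation is an assumption made for the sketch.
    """
    return [c / quantization_width for c in frame_complexities]


def choose_quantization_width(first_pass_bits, trial_width, target_total_bits):
    """Scale the trial width so that the modelled total hits the target."""
    return trial_width * sum(first_pass_bits) / target_total_bits


# Hypothetical video of three frames; the middle frame is harder to encode.
complexities = [8000, 16000, 8000]
trial_width = 4

# First pass: analyze the whole file with the trial quantization width.
first_bits = encode_pass(complexities, trial_width)

# Second pass: re-encode with the width derived from the analysis.
final_width = choose_quantization_width(first_bits, trial_width, 4000)
second_bits = encode_pass(complexities, final_width)

print(final_width)       # 8.0
print(sum(second_bits))  # 4000.0
```

Note that, in this sketch, one quantization width is applied to every frame regardless of what each scene contains, which is the kind of content-independent parameter setting criticized above.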
As stated above, in the conventional video encoding apparatus, encoding parameters such as the frame rate and the quantization width are set irrespective of the content of a video when conducting rate control. Due to this, the frame rate suddenly decreases in a scene of the video in which an object moves actively, and the motion of the object becomes unnatural. Also, due to an inappropriate quantization width, the video may be distorted. Thus, the conventional video encoding apparatus has a disadvantage in that the deterioration of picture quality tends to be conspicuous.