The present invention relates to a video encoding apparatus and method for use in a video transmission system or a video data system using the Internet, and in particular, to a video encoding apparatus and method that can use a two-pass encoding method to carry out encoding using encoding parameters that depend on the contents of scenes, thereby providing an easy-to-see decoded video that is coordinated for each scene without the need to increase the data size.
The MPEG method compresses video data by subjecting the error signals that remain after motion compensation between frames to a discrete cosine transform (DCT) and quantizing the resulting coefficients.
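The transform-and-quantize step described above can be illustrated with a minimal sketch. It assumes an 8x8 block, an orthonormal DCT-II computed directly from its definition, and a single uniform quantization step size; the function names dct_2d and quantize are hypothetical and are not taken from any MPEG reference implementation, which would also use quantization matrices and fast transforms.

```python
import math

N = 8  # block size (assumption: 8x8 blocks, as commonly used with the DCT)

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block, computed directly (O(N^4))."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out

def quantize(coeffs, step):
    """Uniform quantization: divide each coefficient by the step and round."""
    return [[int(round(c / step)) for c in row] for row in coeffs]

# Motion-compensated prediction error (residual) for one block.
# Here the residual is a constant offset of 4, so only the DC term survives.
residual = [[4] * N for _ in range(N)]
levels = quantize(dct_2d(residual), step=8)
```

A larger step size maps more coefficients to zero, which lowers the number of generated bits at the cost of greater distortion; this is the trade-off that rate control adjusts.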
A conventional video encoding method based on the MPEG method executes a process called “rate control” to transmit compressed video data via a transmission channel for which a transmission rate is defined or to record the data in a storage medium having a limited recording capacity. In this rate control, an encoding parameter such as a frame rate or a quantization step size is set so that an output encoded bit stream has a specified bit rate, and encoding is executed based on this parameter.
Many rate control methods determine the interval between the current frame and the next frame and the quantization step size of the next frame depending on the number of bits generated for the preceding frame. Because such control reacts only after the bits have been generated, the number of generated bits increases for scenes having significant motion within the screen, thereby rapidly degrading video quality. FIG. 10A shows a conventional rate control, in which a fixed target bit rate is set as shown at 401 and a fixed frame rate is set as shown at 403. In addition, an actual bit rate is shown at 402, and an actual frame rate is shown at 404.
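The reactive behavior described above can be sketched with a toy model. It is an illustration under stated assumptions, not the rate control of any particular encoder: the number of generated bits is assumed to equal a per-frame complexity value divided by the quantization step size, and the step size for the next frame is scaled by the ratio of the preceding frame's bits to the target. The names next_qstep and simulate and the clamp range 1 to 31 are hypothetical.

```python
def next_qstep(q, bits_prev, target_bits, q_min=1, q_max=31):
    """Scale the step size by how far the preceding frame missed the target."""
    q_new = q * bits_prev / target_bits
    return max(q_min, min(q_max, q_new))

def simulate(complexities, target_bits, q0=8.0):
    """Return a list of (qstep_used, bits_generated) for each frame."""
    q = q0
    log = []
    for cplx in complexities:
        bits = cplx / q          # toy model: bits fall as the step size grows
        log.append((q, bits))
        q = next_qstep(q, bits, target_bits)
    return log

# Three quiet frames followed by three frames with significant motion.
log = simulate([8000, 8000, 8000, 32000, 32000, 32000], target_bits=1000)
for q, bits in log:
    print(f"qstep={q:5.1f}  bits={bits:7.1f}")
```

In this model the first high-motion frame is encoded with the step size chosen for the quiet scene, so it overshoots the target, and the controller then jumps to the coarsest step size for the following frames.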
In the conventional rate control, the frame rate is determined based on a difference (available capacity) between the buffer level corresponding to a preset frame skip threshold and the current buffer level. When the current buffer level is lower than the threshold, encoding is carried out at the fixed frame rate. When the current buffer level exceeds the threshold, the frame rate is reduced. Thus, when the scene switches to one having significant motion, the number of generated bits increases rapidly, causing frame skips as shown in FIG. 11 and reducing the frame rate as shown at 404.
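The frame skip mechanism described above can be sketched as a simple buffer simulation. This is a minimal illustration under assumptions that are not in the source: the buffer drains by a fixed number of bits per frame slot, a frame is skipped whenever the buffer level exceeds the threshold at the start of its slot, and the name simulate_frame_skip and all numeric values are hypothetical.

```python
def simulate_frame_skip(frame_bits, drain_per_slot, skip_threshold):
    """Return one boolean per frame slot: True if the frame was encoded."""
    level = 0.0
    encoded = []
    for bits in frame_bits:
        if level > skip_threshold:
            encoded.append(False)   # buffer too full: skip this frame
        else:
            level += bits           # encode the frame and buffer its bits
            encoded.append(True)
        # The channel drains the buffer at a constant rate every slot.
        level = max(0.0, level - drain_per_slot)
    return encoded

# Four quiet frames (1000 bits each), then a high-motion scene (4000 bits each).
encoded = simulate_frame_skip([1000] * 4 + [4000] * 8,
                              drain_per_slot=1000, skip_threshold=2000)
print(encoded)
```

With these values every frame of the quiet scene is encoded, while only three of the eight high-motion frame slots are encoded, which corresponds to the drop in the actual frame rate once the scene changes.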
Thus, in the conventional rate control, the number of generated bits is specified regardless of the contents of the video. Consequently, in scenes having significant motion within the screen, the frame interval increases excessively and makes the motion unnatural, or an inappropriate quantization step size contributes to distorting the video, resulting in a failure to provide viewers with an easy-to-see video.
On the other hand, a known rate control system uses a method called “two-pass encoding”. Many approaches, however, focus only on variations in the number of generated bits, and only methods for special cases such as fade-in and fade-out (Jpn. Pat. Appln. KOKAI Publication No. 10-336641) take the relationship between the contents of the video and the number of generated bits into consideration.
As described above, in the conventional video encoding apparatus, since the frame rate and the quantization step size are determined regardless of the contents of the video, the video quality may be significantly degraded; for example, the frame rate may decrease rapidly in scenes where objects move significantly, or an inappropriate quantization step size may contribute to distorting the video.