1. Field of the Invention
The present invention relates to the encoding of video images and, more particularly, to implementing encoder features that improve the quality of the encoded video images.
2. Description of the Related Art
As the Internet becomes more and more popular, more and more kinds of data are being transferred over it. The Internet and other channels of communication have bandwidth limitations, and data compression is often used to maximize data transmission over such limited-bandwidth channels. Most people access the Internet using fixed-rate channels such as telephone lines, which present problems for viewing video. Typically, the transfer of video images requires high-bandwidth channels. Compression techniques have reduced the need for high-bandwidth channels, but at the expense of choppy, low-quality video images.
Thus, particularly in low-bitrate communication, image quality and encoder performance still need improvement to achieve the quality of broadcast or real-time video at approximately 30 frames per second. Typically, in any video clip there are many instances in which sequential picture frames are very similar from one frame to the next. Digitizing each frame and comparing the two-dimensional digital arrays results in samples that are highly correlated. In particular, adjacent samples within a picture are very likely to have similar intensities. Exploiting this correlation and others, both within each picture and from picture to picture, enables encoders to compress picture sequences more effectively.
Modern encoders for video images possess the intelligence to take advantage of the many correlations between the pictures of a video sequence. Decoders, on the other hand, follow directions already encoded in the bitstream by the encoders and thus are relatively simple compared to the encoders. During encoding, the encoders identify areas in motion, determine optimal motion vectors, control bitrate, control data buffering such that underflow and overflow do not occur, determine where to change quantization, determine when a given block can simply be repeated, determine when to code by intra and inter techniques, and vary all of these parameters and decisions dynamically so as to maximize quality for a given situation. However, even modern encoders still do not provide the intelligence necessary to produce smooth video at low communication bitrates.
Therefore, it is desirable to provide an encoding apparatus, and methods of operating the same, that more intelligently manipulate correlations between individual pictures.
The present invention provides an apparatus for advanced encoders, and methods for operating the same, which result in improved image quality of encoded images. The novel video encoder is based on identifying particular properties of input images and refining the coding techniques of the encoding engine to reflect the identified properties. Thus, according to one aspect of the invention, the video encoder for encoding input images having a plurality of data blocks to provide compressed image data comprises DCT (discrete cosine transformer) resources configured to DCT the data blocks. Quantizing resources are coupled to the DCT resources and configured to quantize the data blocks to provide quantized data blocks. Inverse quantizing resources are coupled to the quantizing resources to inverse quantize the quantized data blocks. Frame reconstruction resources are coupled to the inverse quantizing resources and configured to reconstruct previous compressed frames. Motion estimation resources are coupled to the frame reconstruction resources and configured to provide predicted data blocks. Subtraction resources are coupled to the DCT resources and the motion estimation resources to subtract the predicted data blocks from the data blocks. An output data buffer is coupled to the quantizing resources and configured to provide a data rate signal to the quantizing resources for modifying quantizer values of the quantizing resources in order to maintain a particular target output data rate of the compressed image data.
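By way of illustration only, the feedback from the output data buffer to the quantizing resources may be sketched as follows. This is a minimal sketch and not the claimed implementation: the function names, the one-step adjustment rule, and the quantizer range of 1 to 31 are assumptions chosen for the example and do not appear in the description above.

```python
def quantize(coeffs, qstep):
    """Uniformly quantize transform coefficients with step size qstep."""
    return [round(c / qstep) for c in coeffs]


def dequantize(levels, qstep):
    """Inverse quantization: scale levels back to coefficient magnitudes."""
    return [level * qstep for level in levels]


def update_quantizer(qstep, buffer_bits, target_bits, q_min=1, q_max=31):
    """Adjust the quantizer from the buffer's data rate signal.

    Assumed rule: coarsen by one step when the buffer holds more bits
    than the target, refine by one step when it holds fewer, and clamp
    the result to [q_min, q_max].
    """
    if buffer_bits > target_bits:
        qstep += 1  # too many bits produced: quantize more coarsely
    elif buffer_bits < target_bits:
        qstep -= 1  # spare capacity: quantize more finely
    return max(q_min, min(q_max, qstep))


if __name__ == "__main__":
    q = 10
    levels = quantize([100.0, -50.0, 7.0], q)
    print(levels, dequantize(levels, q))
    print(update_quantizer(q, buffer_bits=1200, target_bits=1000))
```

A larger quantizer value discards more coefficient precision and so produces fewer bits, which is why the sketch raises the value when the buffer runs above its target.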
According to another aspect of the invention, the video encoder further comprises image preclassifying resources coupled between the subtraction resources and the DCT resources and configured to preclassify the data blocks as active and inactive regions, wherein the quantizing resources, responsive to the preclassification of the data blocks, limit quantizer values for the active regions. The combination of the preclassifying resources and the quantizing resources produces variable rate coding which affords constant image quality at variable data rates. Because the active regions are coded with a relatively fixed quantization and the inactive regions are coded with larger quantization values, the active regions produce better image quality than the inactive regions, particularly in situations where data rates are reduced.
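The preclassification described above may be sketched, under stated assumptions, as follows. The description does not specify how a block is judged active, so the mean absolute residual test, the threshold of 8.0, and the quantizer cap of 8 are hypothetical values chosen only to make the example concrete.

```python
def classify_block(residual, activity_threshold=8.0):
    """Label a block 'active' or 'inactive'.

    Assumed criterion: a block is active when the mean absolute value
    of its prediction residual exceeds activity_threshold.
    """
    mean_abs = sum(abs(v) for v in residual) / len(residual)
    return "active" if mean_abs > activity_threshold else "inactive"


def select_quantizer(rate_q, label, active_q_max=8):
    """Limit the quantizer for active blocks.

    rate_q is the value requested by rate control. Active blocks are
    capped at active_q_max so their quality stays roughly constant;
    inactive blocks take the rate-control value unchanged.
    """
    if label == "active":
        return min(rate_q, active_q_max)
    return rate_q


if __name__ == "__main__":
    for block in ([20, -30, 10, 0], [1, 0, -2, 1]):
        label = classify_block(block)
        print(label, select_quantizer(20, label))
```

When the data rate drops and rate control requests a large quantizer, only the inactive blocks absorb the coarser quantization in this sketch, which mirrors the behavior described for reduced data rates.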
According to another aspect of the invention, the frame reconstruction resources include an automatic scene change detector configured to determine whether to code inverse quantized data blocks as intra frames or predicted frames. The automatic scene change detector includes a scene similarity detector configured to compare a current frame with a previous frame to determine similarity between the previous frame and the current frame. A frame comparator is configured to provide a combination of distortion, differences in luminance, and color histogram information for comparing the previous frame with the current frame. The scene similarity detector directs the current frame to be encoded as an intra frame when the combination of distortion, differences in luminance, and color histogram information exceeds an adaptively determined threshold.
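One possible reading of the scene change decision is sketched below. The description does not state how the three measurements are combined or how the threshold adapts, so the weighted sum, the running-average threshold, and every name and constant here are assumptions. For brevity a frame is a flat list of 8-bit sample values, and the histogram is computed on that single channel as a stand-in for color histogram information.

```python
def mean_abs_diff(a, b):
    """Distortion: mean absolute sample difference between two frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def mean_level_diff(a, b):
    """Difference between the average sample levels of two frames."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


def histogram(frame, bins=8):
    """Count 8-bit samples into equal-width bins."""
    counts = [0] * bins
    for v in frame:
        counts[v * bins // 256] += 1
    return counts


def histogram_diff(a, b, bins=8):
    """Sum of absolute bin differences, normalized by frame size."""
    ha, hb = histogram(a, bins), histogram(b, bins)
    return sum(abs(x - y) for x, y in zip(ha, hb)) / len(a)


class SceneChangeDetector:
    """Flags a scene change when the combined score exceeds a threshold
    that tracks a running average of recent scores (assumed rule)."""

    def __init__(self, factor=3.0, floor=10.0, alpha=0.9, hist_weight=64.0):
        self.factor, self.floor = factor, floor
        self.alpha, self.hist_weight = alpha, hist_weight
        self.avg = 0.0  # running average of scores for ordinary frames

    def is_scene_change(self, prev, cur):
        score = (mean_abs_diff(prev, cur) + mean_level_diff(prev, cur)
                 + self.hist_weight * histogram_diff(prev, cur))
        threshold = max(self.floor, self.factor * self.avg)
        change = score > threshold
        if not change:  # adapt only on ordinary (non-cut) frames
            self.avg = self.alpha * self.avg + (1 - self.alpha) * score
        return change


if __name__ == "__main__":
    det = SceneChangeDetector()
    print(det.is_scene_change([10] * 64, [12] * 64))   # small change
    print(det.is_scene_change([12] * 64, [200] * 64))  # abrupt change
```

A frame for which is_scene_change returns True would be coded as an intra frame; otherwise it would be coded as a predicted frame.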
According to yet another aspect of the invention, the frame reconstruction resources includes a reference picture controller configured to determine whether to code inverse quantized data blocks based upon a reference frame or a previous frame. The reference picture controller includes a frame comparator configured to compare a previous frame and a reference frame with a current frame to determine whether the previous frame or the reference frame is more similar to the current frame, and a frame encoder coupled to the frame comparator configured to encode the current frame based on a selected more similar frame from the frame comparator.
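The choice between the previous frame and the reference frame may be illustrated as follows. This is a sketch only: the sum of absolute differences is assumed as the similarity measure, frames are flat lists of sample values, and ties are resolved in favor of the previous frame, none of which is specified in the description above.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b))


def select_basis(current, previous, reference):
    """Return the name and samples of the frame more similar to current.

    Assumed rule: the candidate with the smaller SAD against the
    current frame is selected; ties favor the previous frame.
    """
    d_prev = sad(current, previous)
    d_ref = sad(current, reference)
    if d_ref < d_prev:
        return "reference", reference
    return "previous", previous


if __name__ == "__main__":
    current = [10, 10, 10, 10]
    previous = [50, 50, 10, 10]   # e.g. an object was covering the background
    reference = [10, 10, 12, 10]  # stored background
    print(select_basis(current, previous, reference)[0])
```

The example models a case in which a moving object has uncovered background that the stored reference frame already contains, so prediction from the reference frame leaves a smaller residual to encode.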
According to another aspect of the invention, the reference picture controller includes a reference picture store coupled to the frame encoder configured to receive updates of additional background information for the reference frame from the reference frame comparator. For instance, whenever the automatic scene change detector directs the current frame to be encoded as an intra frame, the reference picture controller updates the reference picture store to include the intra frame.
According to yet another aspect of the invention, the reference picture controller includes a synthetic background generator which generates a synthetic background as the reference frame for encoding the current frame. The synthetic background generator includes animation by a Java applet. Moreover, to conserve bandwidth, the frame reconstruction resources code foreground regions of images and use the synthetic background as the reference image.
According to yet another aspect of the invention, the motion estimation resources constrain motion vectors to be smooth relative to each other. The motion estimation resources include a motion vector search engine configured to receive a previous frame and a current frame to search for an optimal motion vector, a motion vector biasor coupled to the motion vector search engine and configured to bias the optimal motion vector to favor a direction consistent with that found in surrounding areas of the optimal motion vector and to provide a modified distortion, and a signal to noise ratio (SNR) comparator configured to compare the SNR of additional motion vector searches, performed by the motion vector search engine in a direction consistent with the optimal motion vector, with the modified distortion to select the motion vector associated with minimum distortion. By constraining the motion vectors to be smooth relative to each other, the motion estimation resources extract zoom information from a zooming image instead of having to encode the entire zooming image. As a result, the overall amount of data generated by the encoder is reduced.
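The effect of biasing motion vectors toward their surroundings can be sketched with a simple penalized cost. This is a deliberate simplification and not the biasor and SNR comparator arrangement described above: it assumes the surrounding direction is the component-wise median of neighboring vectors and that the modified distortion is the raw distortion plus a weighted distance from that median. The names and the weight of 2.0 are hypothetical.

```python
from statistics import median


def neighbor_prediction(neighbor_mvs):
    """Component-wise median of the motion vectors of surrounding blocks."""
    xs = [mv[0] for mv in neighbor_mvs]
    ys = [mv[1] for mv in neighbor_mvs]
    return (median(xs), median(ys))


def biased_cost(distortion, mv, predicted, weight=2.0):
    """Modified distortion: raw distortion plus a penalty that grows with
    the city-block distance between mv and the neighborhood prediction."""
    penalty = abs(mv[0] - predicted[0]) + abs(mv[1] - predicted[1])
    return distortion + weight * penalty


def select_motion_vector(candidates, neighbor_mvs, weight=2.0):
    """Pick the candidate with the lowest modified distortion.

    candidates maps each (dx, dy) vector to its raw matching distortion.
    """
    predicted = neighbor_prediction(neighbor_mvs)
    return min(
        candidates,
        key=lambda mv: biased_cost(candidates[mv], mv, predicted, weight),
    )


if __name__ == "__main__":
    candidates = {(-3, 4): 100, (2, 0): 104, (0, 0): 130}
    neighbors = [(2, 0), (2, 1), (3, 0)]
    print(select_motion_vector(candidates, neighbors))
```

In this example the vector (-3, 4) has the lowest raw distortion, but the vector (2, 0) is selected because it agrees with the neighboring blocks at only a slightly higher distortion, which yields a smoother motion vector field.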
An apparatus and method of operating an advanced encoder are provided whereby the encoder engine provides variable rate coding, automatic scene change detection, reference picture determination and update, and motion vector smoothing. The variable rate coding maintains a constant image quality at a variable data rate, while the automatic scene change detection determines when input frames are coded as intra frames based on a combination of distortion, differences in luminance, and color histogram measurements from frame encoding to determine similarity between temporally adjacent frames. The reference picture determination chooses the reference picture or the previous picture as the basis for encoding an input image. The motion vector smoothing preferentially biases the motion vectors so that the overall motion vector field is smoother than it would otherwise be, improving the quality of motion estimation from one frame to another.
Other aspects and advantages of the present invention can be seen upon review of the figures, the detailed description, and the claims which follow.