In a digital video system such as a video-telephone, teleconference or high definition television system, a large amount of digital data is needed to define each video frame signal, since each video line signal in the video frame signal comprises a sequence of digital data referred to as pixel values. Since, however, the available frequency bandwidth of a conventional transmission channel is limited, it is necessary to compress or reduce the volume of data through the use of various data compression techniques in order to transmit the substantial amount of digital data therethrough, especially in the case of such low bit-rate video codec (coding-decoding) systems as video-telephone and teleconference systems.
One such method for encoding video signals in a low bit-rate encoding system is the so-called segmentation-based encoding technique.
In the segmentation-based encoding technique, an input video signal of a current image frame is first converted into a plurality of segmented regions based on the luminance levels of pixels included in the current image frame. One of the most widely used image segmentation techniques is the K-means algorithm, wherein each of the pixels is mapped into the one of a predetermined number of representative luminance levels which yields a minimum error therebetween, thereby providing a segmented current image including a plurality of segmented current regions, each of which has one of the representative luminance levels.
Thereafter, a mean value of the original luminance levels of pixels included in each of the segmented current regions is calculated. Each of the predetermined representative luminance levels mapped on each of the segmented current regions is then updated with its corresponding calculated mean value, to thereby provide updated mean values. Such mapping and updating processes are sequentially repeated with respect to the original input image until a difference between each of the newly updated mean values and its previous updated mean value is smaller than a predetermined threshold value.
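The mapping and updating processes described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the image is given as a flat list of luminance values, that the error is the absolute difference, and that an iteration cap guards against non-termination; the function name `segment_luminance` and its parameters are hypothetical and do not appear in the source. Spatial connectivity of the regions is ignored here.

```python
def segment_luminance(pixels, levels, threshold=0.5, max_iters=100):
    """Alternate mapping and updating until every representative level
    changes by less than `threshold` (or `max_iters` is reached)."""
    levels = [float(v) for v in levels]
    labels = []
    for _ in range(max_iters):
        # Mapping: assign each pixel to the representative level that
        # yields the minimum absolute error.
        labels = [
            min(range(len(levels)), key=lambda k: abs(p - levels[k]))
            for p in pixels
        ]
        # Updating: replace each level with the mean of the original
        # luminance values mapped to it (unchanged if nothing mapped).
        updated = []
        for k, old in enumerate(levels):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            updated.append(sum(members) / len(members) if members else old)
        # Stop once every updated mean is within the threshold of its
        # previous value.
        converged = all(abs(n - o) < threshold for n, o in zip(updated, levels))
        levels = updated
        if converged:
            break
    return labels, levels


labels, means = segment_luminance([10, 12, 11, 200, 202, 198], [0, 255])
```

In this example the two initial levels 0 and 255 settle on the means of the dark and bright pixel groups after two passes.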
After the image segmentation, each of the segmented current regions of the segmented current image is motion estimated with respect to segmented previous regions included in its previous segmented image. That is, differences between a finally updated mean value of each segmented current region and that of each of the segmented previous regions are calculated first to select one of the segmented previous regions which yields a minimum difference. Thereafter, motion information for each segmented current region is determined, wherein the motion information represents the selected segmented previous region.
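The region matching described above can be sketched as follows. This is an illustrative outline only: it assumes each region is summarized by its finally updated mean value, that the difference is taken as an absolute value, and that ties go to the first candidate; the name `match_regions` is hypothetical.

```python
def match_regions(current_means, previous_means):
    """For each segmented current region, return the index of the
    segmented previous region whose mean value differs from it least."""
    motion = []
    for cur in current_means:
        best = min(
            range(len(previous_means)),
            key=lambda j: abs(cur - previous_means[j]),
        )
        motion.append(best)
    return motion


# Two current regions matched against three previous regions.
motion = match_regions([11.0, 200.0], [198.0, 14.0, 90.0])
```

Because the match relies only on mean luminance, two unrelated regions with similar means are indistinguishable under this criterion.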
Finally, the determined motion information for each segmented current region is encoded together with the contour and difference information thereof. The contour information consists of two types of information: shape and location. The shape information refers to the form of each contour, whereas the location information deals with the position of each contour within the image. As the difference information, a difference value between the finally updated mean value of each segmented current region and that of the selected segmented previous region is encoded.
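The items encoded for each segmented current region can be pictured as a single record. The sketch below is an assumption-laden illustration: the `RegionCode` type, the representation of shape as a list of boundary steps, the location as a coordinate pair, and the sign convention (current minus previous) are all hypothetical choices not specified in the source.

```python
from dataclasses import dataclass


@dataclass
class RegionCode:
    motion: int        # index of the selected segmented previous region
    shape: list        # contour shape (hypothetical: boundary step list)
    location: tuple    # contour position within the image (hypothetical)
    difference: float  # current mean minus selected previous mean


def build_region_code(cur_mean, prev_means, selected, shape, location):
    """Assemble motion, contour and difference information for one region."""
    return RegionCode(
        motion=selected,
        shape=shape,
        location=location,
        difference=cur_mean - prev_means[selected],
    )


code = build_region_code(11.0, [198.0, 14.0], 1, ["R", "D", "L", "U"], (4, 7))
```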
In the conventional segmentation-based encoding technique, however, the fluctuation ranges of the luminance levels employed to derive the segmented current regions are generally larger than those of the chrominance levels between the current image and its previous image. As a result, the determined motion information for each segmented current region may be imprecise, which may, in turn, degrade the picture quality of the encoded video signal.