1. Field of the Invention
The present invention relates in general to coding a video signal in a desired unit, and more particularly to a method and an apparatus for predictively coding shape information of the video signal. The method and apparatus compare shape information of the current image with that of the previous image to obtain a difference therebetween, and code the shape information of the current image only when the obtained difference exceeds a predetermined reference value. For example, the present invention is applicable to the shape information coding method of the Moving Picture Experts Group-4 (referred to hereinafter as MPEG-4), which is an international standard for coding moving pictures and audio, and to other image coding methods that consider shape information.
2. Description of the Prior Art
Conventionally, in coding a moving picture in the unit of object, shape information is transmitted together with motion information, beginning with that having the highest priority, for motion compensation prediction. At this time, different motion information must be applied to adjacent pixels on the object boundary.
Several approaches to representing such an area boundary have been proposed in fields such as computer graphics, character recognition, object synthesis, etc. Examples of such approaches include chain coding, polygon approximation and spline approximation. However, such approaches do not take transmission into account. Consequently, it is difficult to transmit coded shape information of a motion region of each frame because of the high transmission rate required.
A contour predictive coding method has been suggested to solve the above problem. A high redundancy is present between shape information of a motion region of the same object in successive images: such shape information is very similar in form and position from frame to frame. On the basis of this characteristic, the contour predictive coding method performs motion compensation prediction of a contour and transmits the prediction error, thereby reducing the amount of shape information to be transmitted. That is, the current shape information is predicted on the basis of the previous shape information. Motion information of a moving object is estimated, and motion compensation prediction is performed with respect to the shape information according to the estimated motion information. In the case where the motion region extraction and motion information estimation are ideally accurate, the transmission of shape information is not necessary.
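The prediction step described above may be illustrated by the following minimal sketch. It is not taken from any cited method: it assumes the shape information is a binary mask held as a list of rows, that the motion is a single integer translation (dx, dy), and that the prediction error is the pixel-wise exclusive OR of the current mask and the motion-compensated previous mask. The function names `shift_mask` and `prediction_error` are hypothetical.

```python
def shift_mask(mask, dx, dy):
    """Translate a binary mask by (dx, dy); pixels shifted in from outside are 0."""
    height, width = len(mask), len(mask[0])
    shifted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_x, src_y = x - dx, y - dy
            if 0 <= src_x < width and 0 <= src_y < height:
                shifted[y][x] = mask[src_y][src_x]
    return shifted


def prediction_error(current, previous, dx, dy):
    """Pixel-wise XOR between the current mask and the motion-compensated previous mask.

    Assumption: XOR is used here as the error measure purely for illustration.
    """
    predicted = shift_mask(previous, dx, dy)
    return [[c ^ p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, predicted)]


# Illustrative data: the object moves one pixel to the right between frames.
previous = [
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
current = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

error_with_motion = sum(map(sum, prediction_error(current, previous, 1, 0)))
error_without_motion = sum(map(sum, prediction_error(current, previous, 0, 0)))
```

In this example the error mask is empty when the estimated motion matches the actual motion, which corresponds to the ideal case in which no shape information needs to be transmitted.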
However, in the above-mentioned contour predictive coding method, the shape information becomes more important as the transmission rate becomes lower. In this connection, an efficient coding method is required to significantly reduce shape information to obtain a higher coding gain than that of a block-unit coding method requiring no transmission of shape information.
In order to solve the problem with the above-mentioned contour predictive coding method, there has been proposed a thresholding operation method that selects the prediction errors to be transmitted, which is disclosed in U.S. patent application Ser. No. 08/478,558, filed in the name of Hyundai Electronics Industries Co., Ltd. The thresholding operation method does not transmit information that has no effect on human eyesight, that is, information that has no effect on the subjective picture quality, so as to make coding at a low transmission rate possible.
A binary image representing region/non-region, or the boundary thereof, may be indicated by a contour. However, a high redundancy is present between shape information of a motion region of the same object in successive images. Because the coding operation is performed unconditionally, with no consideration of coding efficiency regarding time-axis shape information of the image, the compression coding efficiency is degraded.
Recently, ISO/IEC/WG11 has considered a method of coding an object with arbitrary shape information, unlike MPEG-1 and MPEG-2, which perform frame-unit coding.
Here, a given video is divided into a background image and an object image, and a rectangle including the divided background image and object image is defined as a video object plane (referred to hereinafter as VOP). In MPEG-4, in the case where object regions including desired objects or areas are present in images, they are divided into VOPs and the divided VOPs are coded individually.
Such a VOP has the advantage that a natural image or an artificial image can be freely synthesized or decomposed in the unit of object image. As a result, the VOP is a fundamental element in processing an object image in fields such as computer graphics, multimedia, etc.
FIG. 2 is a view illustrating a conventional VOP with shape information, which is partitioned into macro blocks. As shown in this drawing, a horizontal size of the VOP is defined as a VOP width and a vertical size thereof is defined as a VOP height. The left top corner of the VOP is defined as a grid start point, and the VOP is partitioned into M×N macro blocks, each of which includes M pixels on the X-axis and N pixels on the Y-axis. For example, the VOP may be partitioned into 16×16 macro blocks, each of which includes 16 pixels on the X-axis and 16 pixels on the Y-axis.
Notably, in the case where macro blocks at the rightmost and bottom portions of the VOP do not include M pixels on the X-axis and N pixels on the Y-axis, respectively, the VOP is enlarged in size in such a manner that the X and Y-axis pixels of each of the macro blocks can be M and N in number, respectively.
Both M and N are set to an even number so that a texture coder can perform sub block-unit coding, as will be mentioned later.
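The enlargement of the VOP to a whole number of macro blocks may be sketched as follows. This is only an illustration: the function name `padded_vop_size` and the 50×35 example dimensions are hypothetical, and the enlargement is assumed to round each dimension up to the next multiple of the macro block size.

```python
def padded_vop_size(vop_width, vop_height, m=16, n=16):
    """Return (padded_width, padded_height, columns, rows) for M x N macro blocks.

    Assumption: each dimension is rounded up to the next multiple of the
    macro block size so that every macro block holds exactly M x N pixels.
    """
    if m % 2 or n % 2:
        raise ValueError("M and N must both be even")
    columns = -(-vop_width // m)   # ceiling division
    rows = -(-vop_height // n)
    return columns * m, rows * n, columns, rows


# A 50 x 35 VOP with 16 x 16 macro blocks is enlarged to 64 x 48 (4 x 3 blocks).
result = padded_vop_size(50, 35)
```

A VOP whose width and height are already multiples of M and N is returned unchanged in size.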
A redundancy is present between contours of a motion region on the time-axis. Such a redundancy must be removed to make compression coding efficient. Namely, in the case where motion of shape information of the current VOP is negligibly small, shape information of the previous VOP can be directly used. In this case, there is no need to code and transmit the shape information of the current VOP. However, conventionally, shape information of a given VOP is unconditionally coded and transmitted to a decoder. As a result, image coding and compression efficiencies are degraded.
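The idea of reusing the previous shape information when the change is negligibly small may be sketched as follows. The sketch rests on assumptions that are not stated in the text: the shape information is a binary mask held as a list of rows, the difference is measured as the number of differing pixels, and the names `shape_difference`, `should_code_shape` and `reference_value` are hypothetical.

```python
def shape_difference(current, previous):
    """Count the pixels at which two equally sized binary masks differ.

    Assumption: a simple pixel count stands in for the difference measure.
    """
    return sum(c != p
               for cur_row, prev_row in zip(current, previous)
               for c, p in zip(cur_row, prev_row))


def should_code_shape(current, previous, reference_value):
    """Code the current shape only when the difference exceeds the reference value."""
    return shape_difference(current, previous) > reference_value


previous_vop = [[0, 1, 1, 0],
                [0, 1, 1, 0]]
nearly_same = [[0, 1, 1, 0],
               [0, 1, 1, 1]]   # one pixel differs
clearly_moved = [[1, 1, 0, 0],
                 [1, 1, 0, 0]]  # four pixels differ

reuse_case = should_code_shape(nearly_same, previous_vop, reference_value=2)
code_case = should_code_shape(clearly_moved, previous_vop, reference_value=2)
```

With a reference value of 2, the nearly identical mask is not coded and the previous shape information would be reused, whereas the clearly moved mask is coded.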