An image (especially a moving image) contains a very large amount of data. Hence, compression processing that greatly decreases the data amount is indispensable for storing or transmitting an image. For compressing (encoding) a moving image, international standards such as MPEG-1 and MPEG-2 are already defined. These schemes, however, apply only to an image in a rectangular region, such as one sensed with, e.g., a TV camera.
Images have various characteristic features. Many images are each formed from a background and an object to be sensed (object) located in front of the background. Generally, the background exhibits no large motion: it moves as a whole as the sensing camera moves, or various components in it exhibit slight movement. An object, to the contrary, sometimes moves largely. That is, an object and a background have different features.
MPEG-4, which is being standardized as the successor to MPEG-1 and MPEG-2, is designed to allow active interaction with an image and new forms of expression by treating an object and the background separately, by reusing useful objects to increase the productivity of moving image content, and by preparing an environment in which the viewer can manipulate an object.
However, since an object has not a rectangular but an arbitrary shape, the compression techniques used in conventional MPEG-1 or MPEG-2 cannot be applied directly.
In compressing a moving image, the data amount is further reduced by using the correlation between frames. Using inter-frame correlation means that, in encoding data of the current frame of an object, a strongly correlated region in another frame is referred to, and the difference value between the frames is encoded.
When an object has an arbitrary shape, the object in the reference frame also has an arbitrary shape, and no pixel values are present outside the object, so motion vector information cannot be obtained for each block.
In this case, padding processing is executed for the object of interest to extend it to a rectangular region, and then, a motion vector is searched for in units of blocks.
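The block-based motion vector search mentioned above can be illustrated with a minimal exhaustive (full-search) matcher. The function name, the sum-of-absolute-differences matching criterion, the search range, and the frame data below are illustrative assumptions, not the specific search procedure defined by the standards:

```python
def find_motion_vector(ref, cur_block, top, left, search=4):
    """Exhaustive block matching: find the displacement (dy, dx) for which
    the block in the reference frame `ref` best matches `cur_block`,
    i.e. minimizes the sum of absolute differences (SAD)."""
    bh, bw = len(cur_block), len(cur_block[0])
    best, best_sad = None, float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + bh > len(ref) or x + bw > len(ref[0]):
                continue  # candidate block would fall outside the frame
            sad = sum(abs(ref[y + r][x + c] - cur_block[r][c])
                      for r in range(bh) for c in range(bw))
            if sad < best_sad:
                best_sad, best = sad, (dy, dx)
    return best, best_sad

# Hypothetical 8x8 reference frame (a simple ramp) and a 2x2 current
# block taken from position (3, 2) of that frame:
ref = [[8 * r + c for c in range(8)] for r in range(8)]
cur_block = [ref[3][2:4], ref[4][2:4]]
mv, sad = find_motion_vector(ref, cur_block, top=2, left=2)
print(mv, sad)  # → (1, 0) 0
```

Only the difference between the current block and the best-matching reference region (here zero) then needs to be encoded, together with the motion vector itself.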
Padding processing for image data in a two-dimensional range is implemented by sequentially executing one-dimensional padding processing in the horizontal and vertical directions. The one-dimensional padding processing is executed in the following way.
An exterior region (within a row or column) sandwiched between a plurality of objects is replaced with the average value of the pixel data of the object pixels at the two ends of that region. Any other region outside the objects is replaced with the pixel data of the object pixel in contact with that region.
FIGS. 10A to 10C are views showing an example of the padding processing. FIGS. 10A to 10C show the binary shape information (attribute data) and pixel data of one row in a block to explain horizontal (lateral) padding processing. Since pixel data in a region outside an object is replaced with another value, the original values there are insignificant. Hence, pixel data values in regions outside objects are omitted.
FIG. 10A shows the binary shape information, and FIG. 10B shows the pixel data in the object regions. In this example, one row of the block consists of 16 pixels, which include four regions outside the objects. More specifically, these regions are: a region formed from one pixel on the left side of a pixel with the pixel value “78”; a region formed from four pixels between a pixel with the pixel value “74” and a pixel with the pixel value “56”; a region formed from two pixels between a pixel with the pixel value “64” and a pixel with the pixel value “42”; and a region formed from three pixels on the right side of the pixel with the pixel value “42”.
Since the regions at the two ends are replaced with the pixel data of the objects in contact with them, the pixel at the left end is replaced with the pixel value “78”, and the three pixels at the right end are replaced with the pixel value “42”.
The two remaining regions are each sandwiched between object pixels on the left and right. Hence, the four pixels of the left region are replaced with the value “65”, the average of the pixel values “74” and “56”, and the two pixels of the right region are replaced with the value “53”, the average of the pixel values “64” and “42”. Pixel data as shown in FIG. 10C are obtained as a result of the padding processing.
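The one-dimensional padding rule described above can be sketched in Python. The routine name, the placeholder values chosen for interior object pixels (e.g. “70”), and the rounding convention for the average are assumptions for illustration:

```python
def pad_row(shape, pixels):
    """One-dimensional padding of one row (or column) of a block.

    shape  : binary shape information; 1 marks a pixel inside an object
    pixels : pixel data; values outside objects are insignificant
    Returns a copy of `pixels` in which every exterior pixel is filled.
    """
    n = len(shape)
    out = list(pixels)
    if 1 not in shape:
        return out  # no object pixel in this line; nothing to pad against
    i = 0
    while i < n:
        if shape[i]:
            i += 1
            continue
        j = i
        while j < n and not shape[j]:  # extent of this exterior run
            j += 1
        left = out[i - 1] if i > 0 else None   # object pixel touching the run on the left
        right = out[j] if j < n else None      # object pixel touching the run on the right
        if left is not None and right is not None:
            fill = (left + right) // 2         # average of the two boundary pixels
        else:
            fill = left if left is not None else right
        out[i:j] = [fill] * (j - i)
        i = j
    return out

# The row of FIGS. 10A and 10B (interior object values such as 70 are hypothetical):
shape  = [0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0]
pixels = [0, 78, 70, 74, 0, 0, 0, 0, 56, 64, 0, 0, 42, 0, 0, 0]
print(pad_row(shape, pixels))
# → [78, 78, 70, 74, 65, 65, 65, 65, 56, 64, 53, 53, 42, 42, 42, 42]
```

The output reproduces FIG. 10C: the end regions copy their neighboring object pixels (“78”, “42”), and the sandwiched regions take the averages “65” and “53”.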
Block data is formed from a plurality of rows of data. When the horizontal padding processing is executed for each row, the pixel data in the object regions shown in FIG. 11A are extended to those shown in FIG. 11B. Each hatched portion represents an object region or a pixel region filled in by the padding processing.
Vertical padding processing is executed next to the horizontal padding processing. The vertical padding processing method is the same as the horizontal padding processing method except that the processing unit changes from a row to a column. After the vertical padding processing, the entire block is filled with significant pixel data, as shown in FIG. 11C.
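The whole two-dimensional processing can be sketched as a horizontal pass followed by a vertical pass, reusing a one-dimensional routine of the kind described above. One assumption to note: pixels filled by the horizontal pass are treated as significant during the vertical pass, which is how rows containing no object pixel at all end up filled, as in FIG. 11C. The function names and the small example block are illustrative:

```python
def pad_row(shape, pixels):
    """1-D padding of one row or column (exterior runs filled from
    the adjacent object pixels, averaged when sandwiched)."""
    n, out = len(shape), list(pixels)
    if 1 not in shape:
        return out
    i = 0
    while i < n:
        if shape[i]:
            i += 1
            continue
        j = i
        while j < n and not shape[j]:
            j += 1
        left = out[i - 1] if i > 0 else None
        right = out[j] if j < n else None
        fill = ((left + right) // 2 if left is not None and right is not None
                else left if left is not None else right)
        out[i:j] = [fill] * (j - i)
        i = j
    return out

def pad_block(shape, pixels):
    """Pad a whole block: 1-D padding of each row, then of each column.
    A row filled by the horizontal pass counts as significant thereafter."""
    rows, cols = len(shape), len(shape[0])
    s = [list(r) for r in shape]
    p = [list(r) for r in pixels]
    for r in range(rows):                      # horizontal pass
        if 1 in s[r]:
            p[r] = pad_row(s[r], p[r])
            s[r] = [1] * cols                  # entire row now significant
    for c in range(cols):                      # vertical pass
        col = pad_row([s[r][c] for r in range(rows)],
                      [p[r][c] for r in range(rows)])
        for r in range(rows):
            p[r][c] = col[r]
    return p

# A hypothetical 4x4 block with three object pixels:
shape  = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
pixels = [[0, 0, 0, 0], [0, 10, 20, 0], [0, 30, 0, 0], [0, 0, 0, 0]]
for row in pad_block(shape, pixels):
    print(row)
```

After both passes every pixel of the block holds a significant value, so a motion vector can be searched for in units of blocks as in the rectangular-image case.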