This invention relates to an image processing apparatus and method, and an image processing system and, more particularly, to an image processing apparatus and method, and an image processing system, which extract and synthesize objects in a color image.
In recent years, MPEG4 (Moving Picture Experts Group Phase 4) is being standardized as a new international standard for moving image coding.
In a conventional moving image coding scheme represented by MPEG2, coding is done in units of frames or fields. However, in order to re-use or edit contents (person and building, voice, sound, background, and the like) which form the video and audio parts of a moving image or moving picture, MPEG4 is characterized by handling video and audio data as objects. Furthermore, objects contained in video data are independently encoded, and can be independently handled.
According to MPEG4, encoding/decoding is done in units of objects, thus allowing various applications in units of objects, such as improvement of coding efficiency, data distribution depending on transmission paths, re-processing of images, and the like, which cannot be attained by the conventional scheme.
In this manner, with the advent of MPEG4, processes for separating/synthesizing an image in units of objects by exploiting digital techniques have received a lot of attention.
Objects handled by MPEG4 include shape data indicating a shape and α data indicating the transparency of an object, in addition to texture data indicating the pattern itself by luminance (Y) data and color difference (chroma) data. However, if an object does not have a semi-transparent state, α data is omitted. Hence, a description that pertains to α data will be omitted hereinafter.
In general, moving image data has a format called 4:2:0 obtained by subsampling chroma data to 1/2 with respect to Y data in both the horizontal and vertical directions in view of visual characteristics and data amount. FIG. 27 shows this format, i.e., an example of pixel matrices of Y and chroma data in chroma-subsampled moving image data. As can be seen from FIG. 27, one Cr/Cb chroma pixel is sampled per four Y pixels.
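The relation between Y and chroma resolution described above can be sketched as follows. This is a minimal illustration only: the function name `subsample_420` is hypothetical, and averaging each 2×2 block is an assumed filter, since the text states only that chroma data is subsampled to 1/2 in both directions.

```python
def subsample_420(chroma):
    """Reduce a full-resolution chroma plane to half resolution in both
    directions by averaging each 2x2 block (the averaging is an assumption)."""
    height, width = len(chroma), len(chroma[0])
    if height % 2 or width % 2:
        raise ValueError("plane dimensions must be even")
    return [
        [
            (chroma[y][x] + chroma[y][x + 1]
             + chroma[y + 1][x] + chroma[y + 1][x + 1]) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


# A 2x4 chroma plane becomes 1x2: one chroma sample per four Y pixels.
full_plane = [[10, 20, 30, 40],
              [10, 20, 30, 40]]
half_plane = subsample_420(full_plane)
```

Each output sample stands for four pixel positions of the Y plane, which is the one-per-four relation noted for FIG. 27.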
Also, an object having an arbitrary shape in MPEG4 is extracted as a region 1001 called a "bounding box" that circumscribes an object, as shown in FIG. 28. That is, an object in FIG. 28 is a human figure within the bounding box 1001. The bounding box 1001 has a size corresponding to an integer multiple of that of a macroblock 1002, and its absolute position is expressed by the distance from the upper left corner of a frame.
FIG. 29 shows the configuration of the macroblock 1002. That is, the macroblock 1002 is configured by four 8×8 Y data blocks, a pair of 8×8 Cb and Cr data blocks, and 16×16 shape data.
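One way to derive such a bounding box from a binary object mask is sketched below. The function name `bounding_box` and the mask representation are hypothetical. Keeping the upper left corner at the tight extent of the object and growing the box to the right and downward is an assumption; the text requires only that the size be an integer multiple of the macroblock size and that the position be measured from the upper left corner of the frame.

```python
MACROBLOCK = 16  # macroblock size in luminance pixels


def bounding_box(mask, mb=MACROBLOCK):
    """Return (left, top, width, height) of a box that circumscribes the
    nonzero pixels of mask, with width and height rounded up to multiples
    of mb. Returns None if the mask contains no object pixels."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    top, left = rows[0], cols[0]
    height = rows[-1] - top + 1
    width = cols[-1] - left + 1

    def round_up(n):
        return -(-n // mb) * mb  # ceiling to the next multiple of mb

    return left, top, round_up(width), round_up(height)


# A 40x40 frame holding an object 17 pixels wide and 20 pixels high.
frame_mask = [[0] * 40 for _ in range(40)]
for y in range(5, 25):
    for x in range(3, 20):
        frame_mask[y][x] = 1
box = bounding_box(frame_mask)
```

In this sketch the rounded box may extend past the frame edge; a practical implementation would have to shift or pad it.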
As a method of extracting/synthesizing objects from a moving image, a scheme called blue-back chroma keying is well known. In this scheme, a blue background is prepared in advance in a studio set for color TV broadcasting, an arbitrary object is sensed in front of that background, and a switcher replaces the blue background portion of the sensed image with another background image. Hence, when an object is to be extracted from the sensed image, the blue background portion can be processed as a dataless portion instead of being replaced by other background data.
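The keying operation can be illustrated in software as follows. This is a sketch under stated assumptions, not a description of the studio switcher: the function name `chroma_key`, the RGB pixel representation, the key color, and the tolerance value are all hypothetical.

```python
def chroma_key(foreground, background, key=(0, 0, 255), tolerance=60):
    """Replace every foreground pixel close to the key color with the
    pixel at the same position in the background image.

    Images are lists of rows of (R, G, B) tuples of equal size."""
    def is_key(pixel):
        # Sum of absolute channel differences from the key color.
        return sum(abs(a - b) for a, b in zip(pixel, key)) <= tolerance

    return [
        [bg if is_key(fg) else fg for fg, bg in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(foreground, background)
    ]


sensed = [[(0, 0, 255), (200, 150, 120)],
          [(10, 5, 250), (0, 0, 255)]]
new_background = [[(0, 128, 0)] * 2 for _ in range(2)]
composite = chroma_key(sensed, new_background)
```

For extraction rather than synthesis, the same `is_key` decision could mark pixels as dataless instead of substituting a background pixel.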
Upon extracting objects from a still image, a method of extracting an object by detecting an edge portion, a method of extracting an object by setting a threshold value for a signal level, and the like are known.
However, the conventional object extraction/synthesis method suffers from the following problems.
In an object extraction/synthesis process complying with MPEG4, Y data does not pose any problem since it has the same resolution as that of shape data. However, since the horizontal and vertical resolutions of chroma data are half those of the shape data, if an object boundary is defined at the resolution of the shape data, chroma pixels may extend across the boundary of an object, depending on the shape data in a boundary macroblock. In such a case, chroma pixels that extend across the boundary include colors from both inside and outside the object.
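The chroma pixels affected by this problem can be located from the shape data alone, as the following sketch suggests. The function name `classify_chroma` and the label strings are hypothetical, and the sketch assumes that each chroma pixel covers exactly one 2×2 block of shape pixels.

```python
def classify_chroma(shape):
    """Label each chroma pixel by how its 2x2 block of shape pixels
    relates to the object.

    shape: rows of 0/1 values at luminance resolution, even dimensions.
    Returns rows of 'in', 'out', or 'edge' at chroma resolution."""
    labels = []
    for y in range(0, len(shape), 2):
        row = []
        for x in range(0, len(shape[0]), 2):
            covered = (shape[y][x] + shape[y][x + 1]
                       + shape[y + 1][x] + shape[y + 1][x + 1])
            if covered == 0:
                row.append("out")
            elif covered == 4:
                row.append("in")
            else:
                row.append("edge")  # block straddles the object boundary
        labels.append(row)
    return labels


shape_data = [[0, 0, 1, 1],
              [0, 1, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 0, 1]]
chroma_labels = classify_chroma(shape_data)
```

Chroma pixels labeled 'edge' are the ones whose value mixes colors from inside and outside the object.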
This brings about the following two problems.
First, in the extracted object, since chroma data that includes the outside color has a value different from the neighboring chroma data in the object, the coding efficiency of that object decreases.
Second, since the edge portion of the extracted object has a false color (color outside the object), when that object is synthesized with another image, and the synthesized image is displayed, it looks unnatural.
Accordingly, it is an object of the present invention to provide an image processing apparatus and method, and an image processing system, which can extract objects without lowering the coding efficiency of an image.
According to the present invention, the foregoing object is attained by providing an image processing apparatus comprising: input means for inputting data which represents an image including an object image; shape generation means for generating shape data which represents a shape of the object image; first texture generation means for generating first texture data which represents luminance of the object image on the basis of a luminance signal in the data; second texture generation means for generating second texture data which represents color of the object image on the basis of color difference signals in the data; and output means for outputting the shape data, and the first and second texture data, wherein the second texture generation means generates second texture data corresponding to a boundary portion of the object image using a generation method different from a generation method for other portions.
With this apparatus, the coding efficiency of an object extracted from an image can be improved.
It is another object of the present invention to provide an image processing apparatus and method, and an image processing system, which can realize natural color reproduction of the extracted object.
According to the present invention, the foregoing object is attained by providing an image processing apparatus comprising: object input means for inputting image data which represents an object; shape generation means for generating shape data which represents a shape of the object from the image data; texture generation means for generating first texture data which represents luminance of the object and second texture data which represents color difference of the object from the image data; target image input means for inputting target image data with which the object is to be synthesized; and synthesis means for synthesizing the first and second texture data with the target image data on the basis of the shape data, wherein the synthesis means generates second texture data corresponding to a boundary portion of the object.
With this apparatus, upon synthesizing an object with a background image, natural color reproduction can be assured at their boundary.
The invention is particularly advantageous since objects can be extracted without any coding efficiency drop. Also, an object can be synthesized with a background image while assuring natural color reproduction.