In image coding, methods of superimposing different motion picture sequences have been studied. An article entitled "An Image Coding Scheme Using Layered Representation and Multiple Templates" (Technical Report of IEICE, IE94-159, pp. 99-106 (1995)) discloses a method of forming a new sequence by superimposing a motion picture sequence serving as a background and a motion picture sequence of a component motion picture or image serving as a foreground (for example, a video image of a character or a fish cut out by a chromakey technique).
An article "Temporal Scalability Based on Image Content", ISO/IEC/JTC 1/SC29/WG11 MPEG95/211 (1995), discloses a method of forming a new sequence by superimposing a motion picture sequence of component motion images having a high frame rate on a motion picture sequence having a low frame rate.
According to this method, referring to FIG. 27, prediction coding is performed at a low frame rate in a lower layer, and prediction coding is performed at a high frame rate only for a selected area (hatched portion) of an upper layer. A frame coded in the lower layer is not coded again in the upper layer; instead, the decoded image of the lower layer is copied and used as it is. It is assumed that a portion to which a viewer pays attention, such as a figure or a character, is selected as the selected area.
FIG. 26 is a block diagram showing a main portion of a conventional motion picture coding and decoding apparatus. Referring to the left side of FIG. 26, in the coding apparatus of the conventional motion picture coding and decoding apparatus, first and second skipping units 801 and 802 thin out frames of the input motion picture data. The input image data, thus reduced in frame rate, are input to upper layer coding unit 803 and lower layer coding unit 804, respectively. It is assumed that the frame rate for the upper layer is not lower than the frame rate for the lower layer.
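The thinning performed by the skipping units can be illustrated with a minimal sketch. The function name `thin_frames` and the skipping steps 2 and 4 are assumptions chosen for illustration only; the cited article does not specify them. The sketch also applies the rule stated above that a frame coded in the lower layer is not coded in the upper layer.

```python
def thin_frames(frame_indices, step):
    """Keep every `step`-th frame index (hypothetical skipping rule)."""
    return [i for i in frame_indices if i % step == 0]


frames = list(range(9))  # input frames 0..8

# Assumed steps: the upper layer keeps every 2nd frame, the lower layer
# every 4th, so the upper frame rate is not lower than the lower frame rate.
upper_candidates = thin_frames(frames, 2)
lower_frames = thin_frames(frames, 4)

# A frame already coded in the lower layer is not coded in the upper layer.
upper_coded = [i for i in upper_candidates if i not in lower_frames]

print(lower_frames)  # [0, 4, 8]
print(upper_coded)   # [2, 6]
```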
The input motion picture as a whole is coded in lower layer coding unit 804. An internationally standardized method of motion picture coding such as MPEG or H.261 is used as the coding method. A decoded image of the lower layer is formed in lower layer coding unit 804; this image is utilized for prediction coding and, at the same time, is input to a superimposing unit 805.
Only the selected area of the input motion picture is coded in upper layer coding unit 803 of FIG. 26. An internationally standardized method of motion picture coding such as MPEG or H.261 is also used here. However, only the selected area is coded, based on area shape information. A frame which has already been coded in the lower layer is not coded in the upper layer. The area shape information represents the shape of the selected area, such as a figure portion, and is a binary image that takes the value 1 at positions in the selected area and the value 0 at other positions. Only the selected area of the motion picture is thus coded in upper layer coding unit 803, and the result is input to superimposing unit 805.
The area shape is coded utilizing an 8-directional quantizing code in an area shape coding unit 806. FIG. 25 depicts the 8-directional quantizing code. As can be seen from the figure, the 8-directional quantizing code represents the direction from each point to the next point by a numerical value, and is generally used for representing digital figures.
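A minimal sketch of such a direction code is given below. The numbering of the eight directions is an assumption (0 for the positive x direction, increasing counterclockwise in 45-degree steps, with y growing downward as in image coordinates) and may differ from the numbering of FIG. 25. The names `chain_code` and `decode_chain` are hypothetical.

```python
# Assumed numbering of the eight directions; FIG. 25 may use a different one.
DIRECTIONS = {
    (1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
    (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7,
}


def chain_code(points):
    """Encode a sequence of 8-connected contour points as direction codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"points are not 8-connected neighbours: {step}")
        codes.append(DIRECTIONS[step])
    return codes


def decode_chain(start, codes):
    """Rebuild the contour points from a start point and direction codes."""
    inverse = {code: step for step, code in DIRECTIONS.items()}
    points = [start]
    for code in codes:
        dx, dy = inverse[code]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points


# Closed contour of a small square, traced from its top-left corner.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2),
          (1, 2), (0, 2), (0, 1), (0, 0)]
codes = chain_code(square)
print(codes)  # [0, 0, 6, 6, 4, 4, 2, 2]
```

Only the start point and one small integer per contour step need to be stored, which is why a code of this kind suits binary shape information.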
At a frame position where a lower layer frame has been coded, superimposing unit 805 outputs the decoded image of the lower layer. At a frame position where no lower layer frame has been coded, the superimposing unit forms an image by using the two decoded lower layer images preceding and succeeding the frame of interest and one upper layer decoded image of the same time point as the frame of interest, and outputs the formed image. The image formed here is input to upper layer coding unit 803 and utilized for prediction coding. The method of forming the image in superimposing unit 805 is as follows.
First, an interpolated image of two lower layers is formed. A decoded image of a lower layer at a time point t is represented as B(x, y, t). Here, x and y are coordinates representing pixel position in a space. When we represent time points of the two lower layers as t1 and t2 and the time point for the upper layer as t3 (where t1 < t3 < t2), the interpolated image I(x, y, t3) at time point t3 is calculated as follows.

$$I(x, y, t_3) = \frac{(t_2 - t_3)\,B(x, y, t_1) + (t_3 - t_1)\,B(x, y, t_2)}{t_2 - t_1} \tag{1}$$
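Equation (1) is a per-pixel linear interpolation in time. The following is a minimal sketch under the assumption that images are held as nested lists of pixel values; the function name `interpolate` and the sample values are hypothetical.

```python
def interpolate(b_t1, b_t2, t1, t2, t3):
    """Interpolated image I(x, y, t3) of equation (1), with t1 < t3 < t2."""
    if not t1 < t3 < t2:
        raise ValueError("t3 must lie strictly between t1 and t2")
    w1 = (t2 - t3) / (t2 - t1)  # weight of the earlier lower layer image
    w2 = (t3 - t1) / (t2 - t1)  # weight of the later lower layer image
    return [
        [w1 * p1 + w2 * p2 for p1, p2 in zip(row1, row2)]
        for row1, row2 in zip(b_t1, b_t2)
    ]


b_t1 = [[0, 100], [50, 200]]    # lower layer decoded image at t1 = 0
b_t2 = [[100, 100], [150, 0]]   # lower layer decoded image at t2 = 4
i_t3 = interpolate(b_t1, b_t2, t1=0, t2=4, t3=1)
print(i_t3)  # [[25.0, 100.0], [75.0, 150.0]]
```

Because t3 = 1 lies closer to t1 = 0 than to t2 = 4, the earlier image receives weight 3/4 and the later image weight 1/4.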
Thereafter, a decoded image E of the upper layer is superimposed on the interpolated image I calculated as above. For this purpose, weight information W(x, y, t) for superimposing is formed from area shape information M(x, y, t), and a superimposed image S is obtained in accordance with the following equation.

$$S(x, y, t) = [1 - W(x, y, t)]\,I(x, y, t) + E(x, y, t)\,W(x, y, t) \tag{2}$$
The area shape information M(x, y, t) is a binary image which assumes the value 1 in the selected area and the value 0 outside the selected area. Passing this binary image through a low-pass filter a plurality of times yields the weight information W(x, y, t).
More specifically, the weight information W(x, y, t) assumes the value 1 in the selected area, 0 outside the selected area, and a value between 0 and 1 at a boundary of the selected area. The operation of superimposing unit 805 is as described above.
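The formation of the weight information and the superimposition of equation (2) can be sketched as follows. The low-pass filter is not specified in the source; a 3x3 averaging filter with edge replication, applied twice, is assumed here purely for illustration, and the names `box_filter`, `weight_from_mask` and `superimpose` are hypothetical.

```python
def box_filter(img):
    """One pass of a 3x3 averaging filter with edge replication (assumed)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def weight_from_mask(mask, passes=2):
    """Form W(x, y, t) by low-pass filtering the binary mask M several times."""
    weight = [[float(v) for v in row] for row in mask]
    for _ in range(passes):
        weight = box_filter(weight)
    return weight


def superimpose(interp, upper, weight):
    """Equation (2): S = (1 - W) * I + E * W, evaluated per pixel."""
    return [
        [(1.0 - w) * i + w * e for i, e, w in zip(ri, re, rw)]
        for ri, re, rw in zip(interp, upper, weight)
    ]


SIZE = 11
# Binary area shape information: a 5x5 selected area in the centre.
mask = [[1 if 3 <= x <= 7 and 3 <= y <= 7 else 0 for x in range(SIZE)]
        for y in range(SIZE)]
weight = weight_from_mask(mask)

interp = [[10.0] * SIZE for _ in range(SIZE)]   # interpolated image I
upper = [[200.0] * SIZE for _ in range(SIZE)]   # upper layer decoded image E
result = superimpose(interp, upper, weight)

print(weight[5][5], weight[0][0])   # 1.0 0.0
print(0.0 < weight[5][2] < 1.0)     # True
```

With this assumed filter the weight is 1 at the centre of the selected area, 0 far from it, and takes intermediate values near the boundary, so the upper layer image blends gradually into the interpolated image.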
The coded data coded by lower layer coding unit 804, upper layer coding unit 803 and area shape coding unit 806 are integrated by a coded data integrating unit, not shown, and transmitted or stored.
The method of decoding in the conventional apparatus will be described in the following. Referring to the right side of FIG. 26, in the decoding apparatus, the coded data are decomposed by a coded data decomposing unit, not shown, into coded data for the lower layer, coded data for the upper layer and coded data for the area shape. These coded data are decoded by a lower layer decoding unit 808, an upper layer decoding unit 807 and an area shape decoding unit 809, respectively, as shown in FIG. 26. A superimposing unit 810 of the decoding apparatus is similar to superimposing unit 805 of the coding apparatus. Using the lower layer decoded image and the upper layer decoded image, images are superimposed by the same method as described with respect to the coding side. The superimposed motion picture is displayed on a display, and is also input to upper layer decoding unit 807 to be used for prediction of the upper layer.
Though a decoding apparatus for decoding both the lower and upper layers has been described, in a decoding apparatus having only a unit for decoding the lower layer, upper layer decoding unit 807 and superimposing unit 810 are unnecessary. As a result, part of the coded data can be reproduced with a smaller hardware scale.
In the conventional art, as represented by equation (1), when an output image is to be obtained from two lower layer decoded images and one upper layer decoded image, interpolation between the two lower layer images is performed. Accordingly, when the position of the selected area changes with time, considerable distortion arises around the selected area, greatly degrading the image quality.
FIGS. 28A to 28C are illustrations of the problem. Referring to FIG. 28A, images A and C represent two decoded images of the lower layer, and image B is a decoded image of the upper layer, and the time of display is in the order of A, B and C. Here, selected areas are hatched. In the upper layer, only the selected area is coded, and hence areas outside the selected area are represented by dotted lines. As the selected area moves, an interpolated image obtained from images A and C has two selected areas superimposed as shown by the screened portion of FIG. 28B.
When image B is superimposed using the weight information, the output image has three selected areas superimposed, as shown in FIG. 28C. In particular, around (outside) the selected area of the upper layer, the selected areas of the lower layers appear like afterimages, which significantly degrade the image quality. Moreover, this distortion is absent when only the lower layer is displayed and present when the superimposed image of the upper and lower layers is displayed. The distortion therefore appears intermittently, generating flicker-type distortion in the motion picture, which causes extremely severe degradation of image quality.
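The afterimage effect of FIGS. 28A to 28C can be reproduced on a single row of pixels. The sketch below is an illustration under simplifying assumptions: the background has the value 0, the selected area has the value 100, the weight information is taken as the unfiltered binary mask, and image B lies midway in time between images A and C. All names and values are hypothetical.

```python
def interpolate_row(a, c, t1, t2, t3):
    """Equation (1) applied to one row of pixels."""
    w_a = (t2 - t3) / (t2 - t1)
    w_c = (t3 - t1) / (t2 - t1)
    return [w_a * pa + w_c * pc for pa, pc in zip(a, c)]


def superimpose_row(interp, upper, weight):
    """Equation (2) applied to one row of pixels."""
    return [(1 - w) * i + w * e for i, e, w in zip(interp, upper, weight)]


# The selected area (value 100) moves to the right from A to B to C.
image_a = [0, 100, 100, 0, 0, 0, 0, 0]   # lower layer, time 0
image_c = [0, 0, 0, 0, 0, 100, 100, 0]   # lower layer, time 2
image_b = [0, 0, 0, 100, 100, 0, 0, 0]   # upper layer, time 1
weight_b = [0, 0, 0, 1, 1, 0, 0, 0]      # binary mask used as weight

interp = interpolate_row(image_a, image_c, t1=0, t2=2, t3=1)
output = superimpose_row(interp, image_b, weight_b)

print(interp)  # [0.0, 50.0, 50.0, 0.0, 0.0, 50.0, 50.0, 0.0]
print(output)  # [0.0, 50.0, 50.0, 100.0, 100.0, 50.0, 50.0, 0.0]
```

The output row contains the selected area of image B at full intensity, flanked by half-intensity copies of the selected areas of images A and C. These copies correspond to the afterimages described above.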
International standardization of motion picture coding (ISO/IEC MPEG4) proposes coding, decoding and synthesizing of images having a plurality of component parts by a coding apparatus and a decoding apparatus having hierarchical structures such as shown in FIG. 29. Here, a component image refers to an image cut out as a component, such as a character or an object in the motion picture. An ordinary motion picture itself is also treated as one of the component images. Generally, identification numbers of the respective component images are coded in the coded data; on the decoding side, the identification numbers are decoded and, based on the decoded identification numbers, the coded data corresponding to the desired component images are selected.
FIGS. 30A to 30E schematically depict component images and the manner of synthesizing the images. Component image 1 of FIG. 30A is an ordinary motion picture representing a background, and component image 2 of FIG. 30B is a motion picture obtained by cutting out only a figure. Component image 3 of FIG. 30C is a motion picture obtained by cutting out only a car. When only component image 1 is decoded among the coded data, an image of the background only, corresponding to FIG. 30A, is obtained. When component images 1 and 2 are decoded and synthesized, an image such as shown in FIG. 30D is reproduced. When component image 3 is also decoded and the three component images are synthesized, an image such as shown in FIG. 30E is reproduced. Here, such a hierarchical nature is referred to as a hierarchy of component images.
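The selection of component images by identification number and their synthesis can be sketched on a single row of pixels. The data layout is an assumption made for illustration: each component is a list in which `None` marks positions outside the cut-out area, and components are overlaid in ascending order of identification number. The function name `synthesize` and the pixel labels are hypothetical and are not taken from the MPEG4 proposal.

```python
def synthesize(components, wanted_ids):
    """Overlay the selected component images, lowest identification number first."""
    width = len(next(iter(components.values())))
    canvas = [None] * width
    for component_id in sorted(wanted_ids):
        layer = components[component_id]
        # A None pixel is transparent and leaves the canvas unchanged.
        canvas = [old if new is None else new
                  for old, new in zip(canvas, layer)]
    return canvas


components = {
    1: ["bg", "bg", "bg", "bg", "bg", "bg"],      # background (FIG. 30A)
    2: [None, "fig", "fig", None, None, None],    # figure only (FIG. 30B)
    3: [None, None, None, None, "car", None],     # car only (FIG. 30C)
}

print(synthesize(components, {1}))
print(synthesize(components, {1, 2}))
print(synthesize(components, {1, 2, 3}))
```

Decoding only identification number 1 yields the background alone, while adding numbers 2 and 3 yields the compositions corresponding to FIGS. 30D and 30E, respectively.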
The conventional coding and decoding apparatuses having the hierarchical structure described above do not have a function of hierarchically coding and decoding the image quality of each component image. Here, the image quality refers to the spatial resolution of the component image, the number of quantization levels, the frame rate and so on.