In recent years, scalable encoding has attracted attention as a way to cope with increasingly diverse network and terminal environments. In scalable encoding, an image signal is hierarchically decomposed into layers, and encoding is performed for each layer. The following hierarchical decomposition methods are known:
   (i) band splitting with respect to spatial frequency, and
   (ii) band splitting with respect to temporal frequency.
For type (i), wavelet decomposition (see, for example, Non-Patent Document 1) is a representative example, and for type (ii), motion-compensated temporal filtering (MCTF) (see, for example, Non-Patent Document 2) is a representative example.
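As a rough illustration of the two kinds of band splitting, the following Python sketch applies the Haar filter pair, chosen here only because it is the simplest wavelet and not because the cited documents use it. The function names are hypothetical. The temporal split filters co-located pixels of consecutive frames and omits the motion compensation that MCTF performs along motion trajectories.

```python
def haar_split(samples):
    """Split an even-length sequence into half-rate (low, high) bands."""
    if len(samples) % 2:
        raise ValueError("an even number of samples is required")
    evens, odds = samples[0::2], samples[1::2]
    low = [(a + b) / 2 for a, b in zip(evens, odds)]   # pairwise averages
    high = [(a - b) / 2 for a, b in zip(evens, odds)]  # pairwise differences
    return low, high


def haar_merge(low, high):
    """Invert haar_split: rebuild the original sequence from both bands."""
    out = []
    for l, h in zip(low, high):
        out.extend((l + h, l - h))
    return out


def spatial_split_rows(image):
    """Type (i): split each row of a 2-D image along the horizontal axis."""
    bands = [haar_split(row) for row in image]
    return [b[0] for b in bands], [b[1] for b in bands]


def temporal_split(frames):
    """Type (ii): split a frame sequence along time, pixel by pixel.

    Assumption: no motion compensation; co-located pixels are filtered.
    """
    low_frames, high_frames = [], []
    for f0, f1 in zip(frames[0::2], frames[1::2]):
        low_frames.append(
            [[(a + b) / 2 for a, b in zip(r0, r1)] for r0, r1 in zip(f0, f1)])
        high_frames.append(
            [[(a - b) / 2 for a, b in zip(r0, r1)] for r0, r1 in zip(f0, f1)])
    return low_frames, high_frames
```

Applying the split again to the low band yields the next, coarser layer; repeating this gives a hierarchical decomposition of the kind described above.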
Additionally, as techniques relating to the encoding of each hierarchical layer, EBCOT (see, for example, Non-Patent Document 3), 3D-ESCOT (see, for example, Non-Patent Document 4), and the like are known.
Non-Patent Document 1: “A theory for multiresolution signal decomposition: the wavelet representation”, S. G. Mallat, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 11, No. 7, pp. 674-693, July, 1989.
Non-Patent Document 2: “Three-dimensional subband coding with motion compensation”, J. R. Ohm, IEEE Trans. on Image Processing, Vol. 3, No. 5, pp. 559-571, September, 1994.
Non-Patent Document 3: “High performance scalable image compression with EBCOT”, D. Taubman, IEEE Trans. on Image Processing, Vol. 9, No. 7, pp. 1158-1170, July, 2000.
Non-Patent Document 4: “Three-Dimensional Embedded Subband Coding with Optimized Truncation (3-D ESCOT)”, J. Xu, Z. Xiong, S. Li, and Y. Zhang, Applied and Computational Harmonic Analysis, Vol. 10, pp. 290-315, 2001.
Generally, the amount of codes allocated to each layer is determined before encoding, in consideration of the bandwidth of the network and the like.
FIG. 1 shows an example of a layer setting for scalable encoding, in which an image signal in the 4CIF format (i.e., 704×576 pixels, 60 frames/sec) is decomposed into six layers (see “Description of Core Experiments in SVC”, ISO/IEC JTC 1/SC 29/WG 11 N6373, March, 2004).
When the signals belonging to the respective layers are encoded independently under the above condition, the quality of the image may be severely degraded. For an image whose data is concentrated in a lower layer, considerable encoding distortion occurs if a sufficient amount of codes is not allocated to that lower layer. Moreover, because each layer is encoded independently, the degraded lower layer cannot be restored to its former state even by adding the data of a higher layer to the data of the lower layer. Therefore, for such an image, no improvement in the quality of the decoded image can be expected even when the decoded signal of the higher layer is added to that of the lower layer. This problem arises because the amount of codes allocated to each layer is fixed regardless of the type of image, and because each layer is encoded independently.
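The problem can be illustrated with a toy model, given here only as a sketch under stated assumptions: the signal is the sample-wise sum of a lower-layer component and a higher-layer component, and a fixed code allocation is modeled as a fixed uniform quantizer step per layer (a larger step standing for fewer codes). The function names, sample values, and step sizes are all hypothetical.

```python
def quantize(values, step):
    """Uniform quantizer; a larger step models a smaller code allocation."""
    return [step * round(v / step) for v in values]


def reconstruct(low, high, low_step, high_step):
    """Quantize each layer independently, then add the decoded layers."""
    low_hat = quantize(low, low_step)     # lower layer, coded on its own
    high_hat = quantize(high, high_step)  # higher layer never sees low_hat
    return [l + h for l, h in zip(low_hat, high_hat)]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical image whose data is concentrated in the lower layer.
low = [10, 23, 37, 41]
high = [1, -1, 0, 1]
original = [l + h for l, h in zip(low, high)]

# The lower layer is starved of codes (step 16); the higher layer is coded
# finely enough to be exact (step 1).
full = reconstruct(low, high, low_step=16, high_step=1)

print(mse(quantize(low, 16), low))  # distortion of the lower layer alone
print(mse(full, original))          # distortion after adding the higher layer
```

In this model both printed values are identical: the distortion left in the full reconstruction is exactly the quantization error of the lower layer, however finely the higher layer is coded, because the higher layer carries no information about that error.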