1. Field of the Invention
This invention relates to a picture signal encoding method and apparatus, a picture signal decoding method and apparatus and a recording medium, which can be applied with advantage to recording moving picture signals on, for example, a recording medium, such as a magneto-optical disc or a magnetic tape for later reproduction on a display device, transmitting moving picture signals from a transmitting side over a transmission channel for reception and display on the receiving side, as in a teleconferencing system, television telephone system, broadcast equipment or in a multi-media data base retrieval system, or to editing and recording moving picture signals.
2. Description of Related Art
In a system for transmitting moving picture signals to a remote site, such as a teleconferencing system or a television telephone system, the picture signals are compressed by encoding by taking advantage of line-to-line correlation or frame-to-frame correlation.
Among illustrative high efficiency encoding systems for moving picture signals is a so-called MPEG system, which is a system for encoding moving picture signals for storage. This system, discussed by ISO-IEC/JTC1/SC2/WG11 and proposed as a standard draft, employs a hybrid system which is the combination of the motion compensation predictive coding and discrete cosine transform (DCT). In MPEG, a number of profiles and levels are defined for coping with various applications and functions. The most basic is the main profile main level (MP@ML).
Referring to FIG. 21, an illustrative structure of an encoder of MP@ML of the MPEG system is explained.
An input picture signal is first entered to a frame memory 201 from which it is subsequently read out and sent to a downstream side circuitry for encoding in a pre-set sequence.
Specifically, picture signals to be encoded are read out on the macro-block basis from the frame memory 201 so as to be entered to a motion vector detection circuit 202 (ME). The motion vector detection circuit 202 processes the picture data of the respective frames as I-, P- or B-pictures, in accordance with a pre-set sequence. Whether each of the sequentially entered frames is processed as an I-, P- or B-picture is predetermined. For example, the sequentially entered pictures are processed in the sequence of I, B, P, B, P, . . . , B, P.
The motion vector detection circuit 202 refers to a pre-set reference frame to do motion compensation to detect the motion vector. There are three modes for motion compensation (inter-frame prediction), that is forward prediction, backward prediction and bi-directional prediction. The prediction mode for the P-picture is solely the forward prediction, while there are three prediction modes for the B-picture, that is, forward prediction, backward prediction and bi-directional prediction. The motion vector detection circuit 202 selects the prediction mode which minimizes prediction errors and generates the motion vector for the selected mode.
The prediction errors are compared to, for example, the variance values of the macro-block being encoded. If the variance value of the macro-block is smaller, prediction is not executed for the macro-block. Instead, the intra-frame encoding is executed. In this case, the prediction mode is the intra-frame encoding. The information of the motion vector and the prediction mode is entered to a variable length encoding circuit 206 and to a motion compensation circuit 212 (MC circuit).
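The mode decision described above can be sketched as follows. This is a minimal illustration only, not the logic of the circuit 202: the use of a sum of squared errors as the prediction error, the use of an unnormalised variance for the comparison, and all function names are assumptions made for the example.

```python
def sum_sq_error(block, prediction):
    """Sum of squared differences between a block and its prediction."""
    return sum((b - p) ** 2 for b, p in zip(block, prediction))


def sum_sq_deviation(block):
    """Unnormalised variance: squared deviation of each pixel from the block mean."""
    mean = sum(block) / len(block)
    return sum((b - mean) ** 2 for b in block)


def choose_prediction_mode(block, candidates):
    """Pick the inter-frame mode with the smallest error, or fall back to intra.

    `block` is a flat list of pixel values of the macro-block being encoded.
    `candidates` maps a mode name ("forward", "backward", "bidirectional")
    to the predicted block for that mode; a P-picture passes only "forward".
    """
    errors = {mode: sum_sq_error(block, pred) for mode, pred in candidates.items()}
    best_mode = min(errors, key=errors.get)
    # If the variance of the block itself is smaller than the best prediction
    # error, prediction is not executed and the macro-block is intra-coded.
    if sum_sq_deviation(block) < errors[best_mode]:
        return "intra"
    return best_mode
```

For a textured block that the forward reference matches closely, the function returns "forward"; for a flat block that no reference matches, it returns "intra".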
The motion compensation circuit 212 generates prediction reference picture signals, based on the pre-set motion vector, and sends the prediction reference picture signals to an arithmetic unit 203. The arithmetic unit 203 finds a difference between the value of the picture signals for encoding, supplied from the frame memory 201, and the value of the prediction reference picture signals from the motion compensation circuit 212 for each macro-block, and outputs a difference signal to a DCT circuit 204. In case of the intra-macro-block (macro-block encoded by intra-picture coding), the arithmetic unit 203 directly outputs the macro-block of the picture signals for encoding to the DCT circuit 204.
The DCT circuit 204 processes the difference signals from the arithmetic unit 203 or the picture signals per se with DCT for conversion into DCT coefficients. These DCT coefficients are entered to a quantization circuit 205 so as to be quantized with a quantization step in keeping with the stored data volume in a transmission buffer 207 (residual data volume that can be stored in the buffer) and so as to be entered as quantized data to the variable length encoding circuit 206.
The variable length encoding circuit 206 converts the quantized data supplied from the quantization circuit 205, in accordance with the quantization step (quantization scale) supplied from the quantization circuit 205 into, for example, variable length codes, such as Huffman codes, for outputting the encoded data to the transmission buffer 207.
The variable length encoding circuit 206 is also fed with the quantization step (quantization scale) from the quantization circuit 205, the prediction mode (prediction mode indicating which of the intra-picture prediction, forward prediction, backward prediction or bi-directional prediction has been set) from the motion vector detection circuit 202, and with the motion vector, so as to be encoded with VLC.
The transmission buffer 207 transiently stores the input data and outputs to the quantization circuit 205 data corresponding to the stored data volume as quantization control signals by way of performing buffer feedback. That is, when the data volume stored in the transmission buffer 207 (the residual data volume that can be stored therein) is increased to an allowable upper limit value, the buffer 207 causes the quantization scale of the quantization circuit 205 to be increased by the quantization control signal to lower the data volume of the quantized data outputted by the quantization circuit 205. Conversely, should the stored data volume (residual data volume that can be stored) be decreased to an allowable lower limit value, the transmission buffer 207 reduces the quantization scale of the quantization circuit 205 by the quantization control signal to increase the data volume of the quantized data outputted by the quantization circuit 205. This prevents overflow or underflow of the transmission buffer 207 from occurring.
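The buffer feedback described above may be illustrated by the following sketch. It is given under stated assumptions and is not the actual control law of the transmission buffer 207: the step of one per update, the scale range of 1 to 31, the plain uniform quantizer and all names are hypothetical choices made for the example.

```python
def update_quantization_scale(scale, stored, upper_limit, lower_limit,
                              min_scale=1, max_scale=31):
    """Return the next quantization scale given the buffer occupancy.

    A fuller buffer raises the scale (coarser quantization, less data);
    an emptier buffer lowers it (finer quantization, more data).
    """
    if stored >= upper_limit:
        return min(scale + 1, max_scale)
    if stored <= lower_limit:
        return max(scale - 1, min_scale)
    return scale


def quantize(coefficients, scale):
    """Uniform quantizer (an assumption): divide by the scale, truncate toward zero."""
    return [int(c / scale) for c in coefficients]
```

With this quantizer, the coefficients 100, -37 and 8 become 10, -3 and 0 at a scale of 10, and 5, -1 and 0 at a scale of 20, so that a larger scale leaves smaller values to be variable-length encoded.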
The encoded data stored in the transmission buffer 207 is read out at a pre-set timing so as to be outputted as a bitstream on the transmission channel.
The quantized data outputted by the quantization circuit 205 is also entered to an inverse quantization circuit (IQ) 208. This inverse quantization circuit 208 inverse-quantizes the quantized data supplied from the quantization circuit 205 in accordance with the quantization step similarly supplied from the quantization circuit 205. An output signal of the inverse quantization circuit 208 (DCT coefficients obtained on inverse quantization) is entered to an IDCT circuit 209, an inverse-DCTed output signal of which is sent to an arithmetic unit 210. If the output signal of the IDCT circuit 209 is a difference signal, the arithmetic unit 210 sums the difference signal from the IDCT circuit 209 and the picture signals from the motion compensation circuit 212 to restore the picture signals, which are then stored in a frame memory 211. Meanwhile, if the output signal of the IDCT circuit 209 is an intra-macro-block, the output signals of the IDCT circuit 209 (picture signals) are directly outputted. The motion compensation circuit 212 generates prediction reference picture signals using the picture of the frame memory 211, the motion vector and the prediction mode.
Referring to FIG. 22, an illustrative structure of the MP@ML decoder of MPEG is explained.
The encoded picture data (bitstream) transmitted over a transmission channel is received by a reception circuit, not shown, or reproduced by a reproducing device so as to be transiently stored in a reception buffer 221 and subsequently supplied as encoded data to a variable-length decoding circuit 222. The variable-length decoding circuit 222 variable-length decodes the encoded data supplied from the reception buffer 221 to output the resulting motion vector and prediction mode to a motion compensation circuit 227 while outputting the quantization step and the variable-length decoded data (quantized data) to an inverse quantization circuit 223.
The inverse quantization circuit 223 inverse-quantizes data supplied from the variable-length decoding circuit 222 in accordance with the quantization step supplied similarly from the variable-length decoding circuit 222 to output the resulting DCT coefficients to an IDCT circuit 224. The DCT coefficients, outputted by the inverse quantization circuit 223, are inverse DCTed by the IDCT circuit 224, an output signal of which (picture signal or the difference signal) is sent to an arithmetic unit 225.
If the output data from the IDCT circuit 224 is an I-picture signal, the picture signal is directly outputted by the arithmetic unit 225 so as to be sent to and stored in a frame memory 226 for generation of the prediction reference picture for the difference signal subsequently entered to this arithmetic unit 225 (data of the P- or B-picture). The picture signals from this arithmetic unit 225 are directly outputted to outside as a playback picture.
If the input bitstream is a P- or B-picture, the motion compensation circuit 227 generates prediction reference picture signals in accordance with the prediction mode and the motion vector supplied from the variable-length decoding circuit 222 to output the prediction reference picture signals to the arithmetic unit 225. The arithmetic unit 225 sums the difference picture signal supplied from the IDCT circuit 224 and the prediction reference picture signals supplied from the motion compensation circuit 227 to output the resulting sum signal as a playback picture. If the input bitstream is a P-picture, the picture signals from the arithmetic unit 225 are entered to and stored in the frame memory 226 so as to be used as a reference picture for the next-decoded picture signals.
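The reconstruction performed by the arithmetic unit 225 can be sketched as below. The sketch is illustrative only: the clipping to an 8-bit range of 0 to 255, the flat-list representation of a macro-block and the function names are assumptions not stated in the foregoing description.

```python
def clip8(value):
    """Clamp a sample to the 8-bit range 0..255 (assumed sample depth)."""
    return max(0, min(255, value))


def reconstruct_macroblock(idct_output, prediction=None):
    """Rebuild one macro-block from the IDCT output.

    With no prediction (intra-coded data) the IDCT output is the picture
    signal itself; otherwise it is a difference signal that is summed with
    the prediction reference from the motion compensation circuit.
    """
    if prediction is None:
        return list(idct_output)
    return [clip8(d + p) for d, p in zip(idct_output, prediction)]


def is_stored_as_reference(picture_type):
    """I- and P-pictures are kept in the frame memory; B-pictures are not."""
    return picture_type in ("I", "P")
```

For example, a difference signal of 5 and -10 added to a prediction of 100 and 100 restores the samples 105 and 90.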
In MPEG, a variety of profiles and levels are defined in addition to MP@ML and a variety of tools are also readied. One of such tools of MPEG is scalability as now explained.
In MPEG, a scalable encoding system, designed to implement scalability in keeping with the different picture size or the different frame rate, is introduced. If, in case of spatial scalability, only the bitstream of the lower layer is decoded, picture signals of small picture size are decoded. On the other hand, if the bitstreams of the lower layer and the upper layer are decoded, picture signals of large picture size are decoded.
Referring to FIG. 23, a spatial scalability encoder is explained. In the case of the spatial scalability, the lower layer and the upper layer correspond to picture signals of the small picture size and to those of the large picture size, respectively.
The picture signals of the lower layer are first entered to a frame memory 261 so as to be encoded by the downstream side circuitry in the same manner as the above-mentioned MP@ML.
That is, data read out on the macro-block basis from the frame memory 261 are entered to a motion vector detection circuit 262. The motion vector detection circuit 262 processes picture data of the respective frames as I-, P- or B-pictures in accordance with a pre-set sequence.
The motion vector detection circuit 262 refers to a pre-set reference frame (that is, forward original picture, backward original picture or an original picture) to perform motion compensation to detect the motion vector. There are three sorts of the prediction mode, that is the forward prediction, backward prediction and bi-directional prediction. The motion vector detection circuit 262 selects the prediction mode which minimizes the prediction errors and generates the corresponding motion vector. The information on the prediction mode and the motion vector is entered to a variable length encoding circuit 266 and to a motion compensation circuit 272.
The motion compensation circuit 272 generates prediction reference picture signals, based on a pre-set motion vector, to supply the prediction reference picture signals to an arithmetic unit 263. The arithmetic unit 263 finds a difference signal between the value of picture signals for encoding from the frame memory 261 and the values of a macro-block of the prediction reference picture signals from the motion compensation circuit 272 for each macro-block to output the difference signal to a DCT circuit 264. If the macro-block is an intra-macro-block, that is an intra-picture coded macro-block, the arithmetic unit 263 directly outputs the signals of the encoded macro-block to the DCT circuit 264.
The DCT circuit 264 processes the difference signals from the arithmetic unit 263 with DCT for converting the difference signals into DCT coefficients. These DCT coefficients are entered to a quantization circuit 265 for quantization in accordance with the quantization step in keeping with the stored data quantity in the transmission buffer 267 (residual data volume that can be stored in the buffer) so as to be entered as quantized data to the variable length encoding circuit 266.
The variable length encoding circuit 266 converts the quantized data supplied from the quantization circuit 265, in accordance with the quantization step (quantization scale) supplied from the quantization circuit 265, into variable-length codes, such as Huffman codes, for outputting the encoded data to the transmission buffer 267.
The variable length encoding circuit 266 is also fed with the quantization step (quantization scale) from the quantization circuit 265 and with the prediction mode from the motion vector detection circuit 262 (prediction mode indicating which of the intra-picture prediction, forward prediction, backward prediction or bi-directional prediction has been set). These data are similarly encoded by VLC.
The transmission buffer 267 transiently stores the encoded input data and outputs data corresponding to the stored data volume as a quantization control signal to the quantization circuit 265 by way of buffer feedback. This prevents overflow or underflow in the transmission buffer 267 from occurring.
The encoded data stored in the transmission buffer 267 is read out at a pre-set timing so as to be outputted as a bitstream to the transmission channel.
The quantized data outputted from the quantization circuit 265 are also entered to an inverse quantization circuit 268 which then inverse-quantizes the quantized data from the quantization circuit 265 in accordance with the quantization step similarly supplied from the quantization circuit 265. An output signal (DCT coefficients) of the inverse quantization circuit 268 is entered to an IDCT circuit 269 for inverse DCT. An output signal (picture signal or difference signal) of the IDCT circuit 269 is sent to an arithmetic unit 270. If the output signal of the IDCT circuit 269 is the P-picture difference signal, the arithmetic unit 270 sums the difference signal from the IDCT circuit 269 and the picture signal from the motion compensation circuit 272 for restoration of the picture signals. If the output signal of the IDCT circuit 269 is the intra-coded macro-block, the picture signals from the IDCT circuit 269 are outputted directly. These picture signals are stored in a frame memory 271. The motion compensation circuit 272 generates prediction reference picture signals using the picture signals of the frame memory 271, the motion vector and the prediction mode.
It is noted that, in this illustrative structure of the lower layer, an output picture signal of the arithmetic unit 270 is not only supplied to the frame memory 271 so as to be used as a reference picture of the lower layer, but is also enlarged to the same picture size as that of the upper layer by the picture enlarging circuit 243, adapted for enlarging the picture by up-sampling, so as to be used also as a reference picture of the upper layer.
That is, the picture signals from the arithmetic unit 270 are entered to the frame memory 271 and to the picture enlarging circuit 243, as described above. The picture enlarging circuit 243 enlarges the picture signals generated by the arithmetic unit 270 to the same size as the picture size of the upper layer in order to output the enlarged picture signals to a weighting addition circuit 244.
The weighting addition circuit 244 multiplies the output signal from the picture enlarging circuit 243 with a weight (1-W) to output the resulting multiplied signal to an arithmetic unit 258.
On the other hand, the picture signals of the upper layer are first supplied to the frame memory 245. As in the case of the MP@ML, a motion vector detection circuit 246 sets the motion vector and the prediction mode.
In the above-described structure of the upper layer, the motion compensation circuit 256 generates prediction reference picture signals in accordance with the motion vector and the prediction mode set by the motion vector detection circuit 246. These prediction reference picture signals are sent to a weighting addition circuit 257 which then multiplies the prediction reference picture signals with a weight W (weighting coefficients W) to output the resulting product signals to the arithmetic unit 258.
The arithmetic unit 258 sums the picture signals of the weighting addition circuits 244, 257 to output the resulting picture signals as the prediction reference picture signals to the arithmetic unit 247. The picture signals from the arithmetic unit 258 are also supplied to an arithmetic unit 254 so as to be summed to the picture signals from the inverse DCT circuit 253. The resulting sum signal is supplied to a frame memory 255 so as to be used as reference picture signals for the picture signals encoded next time.
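The prediction formed by the picture enlarging circuit 243, the weighting addition circuits 244, 257 and the arithmetic unit 258 amounts to the per-pixel sum (1-W) x (enlarged lower-layer picture) + W x (motion-compensated upper-layer picture). A minimal sketch follows; the factor-of-two nearest-neighbour up-sampling, the representation of a picture as a list of rows and the function names are assumptions made for illustration, the description above stating only that the picture is enlarged by up-sampling.

```python
def enlarge_2x(picture):
    """Nearest-neighbour 2x up-sampling of a picture given as a list of rows."""
    enlarged = []
    for row in picture:
        wide = [pixel for pixel in row for _ in range(2)]  # repeat each pixel
        enlarged.append(wide)
        enlarged.append(list(wide))                        # repeat each row
    return enlarged


def weighted_prediction(lower_enlarged, upper_mc, w):
    """Per-pixel blend: (1 - w) * lower-layer reference + w * upper-layer reference."""
    return [
        [(1 - w) * lo + w * up for lo, up in zip(lo_row, up_row)]
        for lo_row, up_row in zip(lower_enlarged, upper_mc)
    ]
```

With W = 0 the prediction is taken entirely from the enlarged lower-layer picture, and with W = 1 entirely from the upper layer; intermediate values mix the two references.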
The arithmetic unit 247 calculates the difference between the picture signals from the frame memory 245 and the prediction reference picture signals from the arithmetic unit 258 to output the resulting difference signals. If the macro-block is the intra-frame coded macro-block, the arithmetic unit 247 directly outputs the picture signals to the DCT circuit 248.
The DCT circuit 248 processes the output signal of the arithmetic unit 247 with discrete cosine transform (DCT) to generate DCT coefficients which are outputted to a quantization circuit 249. As in the case of the MP@ML, the quantization circuit 249 quantizes the DCT coefficients in accordance with the quantization scale as set on the basis of the data storage volume of the transmission buffer 251 to output the quantized DCT coefficients as quantized data to a variable length encoding circuit 250, which then variable-length encodes the quantized data to output the encoded data via transmission buffer 251 as a bitstream of the upper layer.
The quantized data from the quantization circuit 249 also is inverse-quantized by an inverse quantization circuit 252 using the quantization scale used in the quantization circuit 249. An output data (DCT coefficients) of the inverse quantization circuit 252 is sent to an inverse DCT circuit 253. The inverse DCT circuit 253 processes the DCT coefficients with inverse DCT so that an output signal (picture signal or difference signal) is sent to the arithmetic unit 254. If the output signal of the inverse DCT circuit 253 is the difference signal of a P-picture, the arithmetic unit 254 sums the picture signals from the arithmetic unit 258 and the difference signal from the inverse DCT circuit 253 for restoration of the picture signals. If the output signal of the inverse DCT circuit 253 is the intra-coded macro-block, the picture signals are directly outputted from the inverse DCT circuit 253. These picture signals are recorded in the frame memory 255. A motion compensation circuit 256 generates prediction reference picture signals, using picture signals from the frame memory 255, the motion vector and the prediction mode.
The variable length encoding circuit 250 also is fed with the prediction mode and the motion vector, detected by the motion vector detection circuit 246, the quantization scale used in the quantization circuit 249 and with the weight W used in the weighting addition circuits 244, 257. These are encoded and transmitted.
Referring to FIG. 24, an illustrative decoder of spatial scalability is explained.
The bitstream of the lower layer is entered to a reception buffer 301 so as to be subsequently decoded as in the case of the MP@ML. That is, the encoded data read out from the reception buffer 301 are sent to the variable length decoding circuit 302. The variable length decoding circuit 302 variable-length decodes the encoded data supplied from the reception buffer 301 to output the motion vector and the prediction mode to a motion compensation circuit 307, as well as to output the quantization step and the variable-length decoded data (quantized data) to an inverse quantization circuit 303 on the macro-block basis.
The inverse quantization circuit 303 inverse-quantizes the data (quantized data) supplied from the variable length decoding circuit 302 in accordance with the quantization step similarly supplied from the variable length decoding circuit 302. The resulting inverse-quantized data (DCT coefficients) are outputted to an IDCT circuit 304. The DCT coefficients, outputted by the inverse quantization circuit 303, are inverse DCTed by the IDCT circuit 304 to send an output signal (picture signal or the difference signal) to the arithmetic unit 305.
If the output signal from the IDCT circuit 304 is I-picture data, that is, picture signals, these signals are directly outputted from the arithmetic unit 305 and supplied to the frame memory 306 for storage therein for generating prediction reference picture signals for the difference signals subsequently entered to the arithmetic unit 305. The picture signals are also directly outputted to outside as a playback picture.
On the other hand, if the input bitstream is the P- or B-picture, the motion compensation circuit 307 generates prediction reference picture signals, in accordance with the prediction mode and the motion vector supplied from the variable-length decoding circuit 302, and outputs the prediction reference picture signals to the arithmetic unit 305. The arithmetic unit 305 sums the difference signal entered from the IDCT circuit 304 and the prediction reference picture signals supplied from the motion compensation circuit 307 to output the sum as picture signals. If the input bitstream is a P-picture, the picture signals from the arithmetic unit 305 are entered to and stored in the frame memory 306 so as to be used as prediction reference picture signals for the next-decoded picture signals.
In the configuration of FIG. 24, the picture signals from the arithmetic unit 305 are not only outputted to outside and stored in the frame memory 306 so as to be used as prediction reference picture signals for the next-decoded picture signals, but are also enlarged to the same picture size as the picture signals of the upper layer by a picture enlarging circuit 327 so as to be used also as prediction reference picture signals for the upper layer.
That is, the picture signals from the arithmetic unit 305 are outputted as playback picture signals of the lower layer, as described above, while being outputted to the frame memory 306 and supplied to the picture enlarging circuit 327. The picture enlarging circuit 327 enlarges the picture signals to the same size as the picture size of the upper layer in order to output the enlarged picture signals to a weighting addition circuit 328.
The weighting addition circuit 328 multiplies the picture signal from the picture enlarging circuit 327 with a weight (1-W) to output the resulting multiplied signal to an arithmetic unit 317. The value (1-W) is derived from the decoded weight W.
The bitstream of the upper layer is supplied via reception buffer 309 to variable-length decoding circuit 310 where encoded data are decoded, that is, the quantization scale, motion vector, prediction mode and the weighting coefficients are decoded, as are the quantized data. The quantized data, variable-length decoded by the variable-length decoding circuit 310, are inverse-quantized by an inverse quantization circuit 311, using the similarly decoded quantization scale, so as to be outputted as DCT coefficients to an inverse DCT circuit 312, which then processes the DCT coefficients with IDCT to output an output signal (picture signal or difference signal) to an arithmetic unit 313.
A motion compensation circuit 315 generates prediction reference picture signals in accordance with the decoded motion vector and prediction mode to enter the prediction reference picture signals to a weighting addition circuit 316, which then multiplies the prediction reference picture signals from the motion compensation circuit 315 with the decoded weight W. The resulting product, obtained by this multiplication, is outputted to the arithmetic unit 317.
The arithmetic unit 317 sums the picture signals of the weighting addition circuits 328 and 316 to output the resulting picture signals to the arithmetic unit 313. If the output signal from the IDCT circuit 312 is the difference signal, the arithmetic unit 313 sums the difference signal of the IDCT circuit 312 to the picture signals from the arithmetic unit 317 for restoration of the picture signals of the upper layer. If the output signal of the IDCT circuit 312 is the intra-coded macro-block, the picture signals from the IDCT circuit 312 are directly outputted. These picture signals are stored in the frame memory 314 so as to be used as a prediction reference picture for the subsequently decoded picture signals.
Although the foregoing description is made of processing of luminance signals, processing of chroma signals is similar to that for the luminance signals. It is noted however that, in this case, the motion vector used is that for the luminance signals halved in both the vertical and horizontal directions.
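The derivation of the chroma motion vector can be illustrated by the short sketch below. The representation of a motion vector as a pair (horizontal, vertical) and the function name are assumptions, and the rounding of the halved components, which is not specified in the foregoing description, is deliberately left out.

```python
def chroma_motion_vector(luma_mv):
    """Halve a luminance motion vector in both directions for the chroma signals.

    `luma_mv` is a (horizontal, vertical) pair. The result is returned
    unrounded because the rounding rule is not given in the description.
    """
    mv_x, mv_y = luma_mv
    return (mv_x / 2, mv_y / 2)
```

For instance, a luminance motion vector of (8, -6) yields a chroma motion vector of (4.0, -3.0).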
Although the foregoing description is of the MPEG system, a variety of other high efficiency encoding systems for moving pictures have been standardized. For example, the ITU-T (International Telecommunication Union-Telecommunication Sector) provides the H.261 or H.263 system as the encoding system mainly for communication. Similarly to the MPEG system, the H.261 or H.263 system is basically the combination of motion-compensated predictive coding and DCT transform coding, and uses similar encoders or decoders, despite differences in details, such as the header information.
In MPEG2, the spatial scalability is already standardized. However, its encoding efficiency cannot be said to be optimum. Thus, with the MPEG4 system or other novel encoding systems, it is necessary to improve the encoding efficiency for spatial scalability.
This spatial scalability in the MPEG2 system is explained in some detail. In this scalable encoding system, the lower layer is encoded as in MP@ML for the usual encoding system, that is MPEG2. The upper layer uses a picture of the lower layer at the same time point and a directly previously decoded picture of the same layer. The prediction mode for the lower layer is set completely independently of that for the upper layer. Thus, there are occasions wherein the information has been transmitted in the lower layer but is not used in the upper layer such that encoding is effectuated by prediction from the decoded picture of the upper layer. This is tantamount to independently transmitting the information which can be co-owned by the upper and lower layers.
It is therefore incumbent to reduce such redundancy in information transmission to improve the encoding efficiency.
On the other hand, the MPEG2 system cannot designate the encoding mode except on the macro-block basis. Although not objectionable when handling a picture of a generally uniform picture area, this feature of the MPEG2 system tends to lower the encoding efficiency in case of a sequence exhibiting a complex motion, or in case pictures of different properties, such as a still area and a moving area, are contained in one and the same macro-block.
It is therefore an object of the present invention to provide a picture signal encoding method and apparatus, a picture signal decoding method and apparatus and a recording medium whereby it is possible to improve the prediction and encoding efficiency in the spatial scalable encoding system.
In one aspect, the present invention provides a picture encoder for encoding picture signals of a lower hierarchy representing pre-set picture signals and picture signals of an upper hierarchy similarly representing pre-set picture signals. The picture encoder includes a first encoding unit for encoding the picture signals of the lower hierarchy using reference picture signals for outputting first pre-set encoded data, a second encoding unit for encoding the picture signals of the upper hierarchy using reference picture signals for outputting second pre-set encoded data, a first decoding unit for decoding the first encoded data for generating first reference picture signals, and a second decoding unit for decoding the second encoded data for generating second reference picture signals. The second encoding unit encodes the picture signals using third reference picture signals generated on adaptively switching between the first reference picture signals and the second reference picture signals on the pixel basis.
In another aspect, the present invention provides a picture encoding method for encoding picture signals of a lower hierarchy representing pre-set picture signals and picture signals of an upper hierarchy similarly representing pre-set picture signals. The encoding method includes a first encoding step of encoding the picture signals of the lower hierarchy using reference picture signals for outputting first encoded data, a second encoding step of encoding the picture signals of the upper hierarchy using reference picture signals for outputting second encoded data, a first decoding step of decoding the first encoded data for generating first reference picture signals and a second decoding step of decoding the second encoded data for generating second reference picture signals. The second encoding step encodes the picture signals using third reference picture signals generated on adaptively switching between the first reference picture signals and the second reference picture signals on the pixel basis.
In a further aspect, the present invention provides a picture decoding device for receiving and decoding encoded data composed of encoded picture signals of a lower hierarchy and encoded picture signals of an upper hierarchy, the encoded picture signals of the lower hierarchy and the encoded picture signals of the upper hierarchy being signals encoded using respective reference picture signals. The picture decoding device includes a receiving unit for receiving the encoded data, a first decoding unit for decoding the encoded picture signals of the lower hierarchy using reference picture signals for outputting decoded picture signals of the lower hierarchy, with the decoded picture signals of the lower hierarchy being used as first reference picture signals, and a second decoding unit for decoding the encoded picture signals of the upper hierarchy using reference picture signals for outputting decoded picture signals of the upper hierarchy, with the decoded picture signals of the upper hierarchy being used as second reference picture signals. The second decoding unit decodes the picture signals using third reference picture signals generated on adaptively switching between the first reference picture signals and the second reference picture signals on the pixel basis.
In a further aspect, the present invention provides a picture decoding method for receiving and decoding encoded data composed of encoded picture signals of a lower hierarchy and encoded picture signals of an upper hierarchy, the encoded picture signals of the lower hierarchy and the encoded picture signals of the upper hierarchy being signals encoded using respective reference picture signals. The picture decoding method includes a receiving step of receiving the encoded data, a first decoding step of decoding the encoded picture signals of the lower hierarchy using reference picture signals for outputting decoded picture signals of the lower hierarchy, with the decoded picture signals of the lower hierarchy being used as first reference picture signals, and a second decoding step of decoding the encoded picture signals of the upper hierarchy using reference picture signals for outputting decoded picture signals of the upper hierarchy, with the decoded picture signals of the upper hierarchy being used as second reference picture signals. The second decoding step decodes the picture signals using third reference picture signals generated on adaptively switching between the first reference picture signals and the second reference picture signals on the pixel basis.
In a further aspect, the present invention provides a recording medium decodable by a picture decoding device. The recording medium contains encoded data composed of encoded picture signals of a lower hierarchy and encoded picture signals of an upper hierarchy. The encoded data is data generated by a first encoding step of encoding the picture signals of the lower hierarchy using reference picture signals for outputting first encoded data, a second encoding step of encoding the picture signals of the upper hierarchy using reference picture signals for outputting second encoded data, a first decoding step of decoding the first encoded data for generating first reference picture signals and a second decoding step of decoding the second encoded data for generating second reference picture signals. The second encoding step encodes the picture signals using third reference picture signals generated on adaptively switching between the first reference picture signals and the second reference picture signals on the pixel basis.
According to the present invention, in which picture signals are split into separate hierarchies and picture signals of the respective hierarchies are encoded using prediction reference pictures, signals obtained on encoding picture signals of the respective layers are decoded to generate reference pictures of respective layers, while pixels of the reference pictures are adaptively switched to generate reference pictures. In addition, pre-set units of the reference pictures are adaptively switched to generate pre-set reference pictures to realize a spatially scalable encoding method with improved prediction efficiency and encoding efficiency.
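The generation of a reference picture by switching between two reference pictures on the pixel basis may be pictured with the sketch below. It shows only the switching operation itself: the criterion by which each pixel is selected is not given in this summary, so the per-pixel selection map `use_first`, the assumption that both reference pictures have the same size, and the function name are all hypothetical.

```python
def switch_reference(first_ref, second_ref, use_first):
    """Build a third reference picture by choosing, pixel by pixel, between two.

    `first_ref` and `second_ref` are pictures of equal size given as lists
    of rows. `use_first` is a hypothetical selection map of the same shape:
    True takes the pixel from `first_ref`, False from `second_ref`.
    """
    return [
        [a if take_first else b for a, b, take_first in zip(row_a, row_b, row_sel)]
        for row_a, row_b, row_sel in zip(first_ref, second_ref, use_first)
    ]
```

Unlike a weighted sum applied uniformly to a macro-block, such a map allows a still area and a moving area within one and the same macro-block to be predicted from different reference pictures.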