A Scalable Video Codec (SVC) encodes video into a sequence of pictures at the highest image quality while ensuring that part of the encoded picture sequence (specifically, a partial sequence of frames intermittently selected from the total sequence of frames) can be decoded on its own and used to represent the video at a lower image quality.
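The selection of a partial frame sequence described above can be illustrated with a minimal sketch. The function name `temporal_subsequence` and the integer stand-ins for coded pictures are assumptions made for illustration only. An actual scalable codec must also arrange its prediction references so that the retained frames never depend on dropped ones, which this sketch does not model.

```python
def temporal_subsequence(frames, step):
    """Return every `step`-th frame, starting with the first one.

    Hypothetical helper: it only illustrates intermittent frame
    selection, not the prediction structure of a real codec.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return frames[::step]


# Integers stand in for eight coded pictures of the full sequence.
frames = list(range(8))

half_rate = temporal_subsequence(frames, 2)     # half the frame rate
quarter_rate = temporal_subsequence(frames, 4)  # a quarter of the frame rate
```

Decoding only `half_rate` or `quarter_rate` would represent the same video at a reduced frame rate, which corresponds to the lower-quality representation mentioned above.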
Although low image-quality video can be represented by receiving and processing part of a sequence of pictures encoded according to the scalable scheme, the image quality is still significantly reduced as the bitrate is lowered. One solution to this problem is to provide an auxiliary picture sequence for low bitrates, for example, a sequence of pictures having a small screen size and/or a low frame rate, as at least one layer in the hierarchical structure.
When it is assumed that two sequences are provided, the auxiliary (lower) picture sequence is referred to as a base layer, and the main (upper) picture sequence is referred to as an enhanced or enhancement layer. Video signals of the base and enhanced layers have redundancy since the same video signal source is encoded into two layers. To increase the coding efficiency of the enhanced layer, a video signal of the enhanced layer is coded using coded information (motion information or texture information) of the base layer.
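The use of base layer texture information to code the enhanced layer can be sketched as residual coding. This is a minimal illustration under stated assumptions, not the method of any particular codec: both blocks are assumed to have the same size, the base block is assumed to be already reconstructed, and the names `interlayer_residual` and `reconstruct` are hypothetical.

```python
def interlayer_residual(enh_block, base_block):
    """Subtract the base layer block from the enhanced layer block."""
    return [[e - b for e, b in zip(e_row, b_row)]
            for e_row, b_row in zip(enh_block, base_block)]


def reconstruct(residual, base_block):
    """Add the base layer block back to the residual (decoder side)."""
    return [[r + b for r, b in zip(r_row, b_row)]
            for r_row, b_row in zip(residual, base_block)]


# Illustrative 2x2 sample blocks of the same content in both layers.
enh_block = [[100, 102], [98, 101]]
base_block = [[99, 101], [99, 100]]

residual = interlayer_residual(enh_block, base_block)
restored = reconstruct(residual, base_block)
```

Because the two layers carry the same content, the residual samples are small in magnitude compared with the original samples, which is where the gain in coding efficiency comes from.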
A single video source 1 may be coded into a plurality of layers with different transfer rates, as shown in FIG. 1a. Alternatively, a plurality of video sources 2b in different scanning modes, which contain the same content 2a, may be coded into the respective layers, as shown in FIG. 1b. In this case as well, an encoder which codes the upper layer can increase coding gain by performing interlayer prediction using coded information of the lower layer, since both of the sources 2b provide the same content 2a.
Thus, it is necessary to provide a method for interlayer prediction taking into consideration the scanning modes of video signals when coding different sources into the respective layers. When interlaced video is coded, it may be coded into even and odd fields and may also be coded into pairs of odd and even macroblocks in one frame. Accordingly, the types of pictures for coding an interlaced video signal must also be considered for interlayer prediction.
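The separation of an interlaced frame into fields can be sketched as follows. The helper names and the convention that the even field holds rows 0, 2, 4, ... (counting from zero) are assumptions made for this illustration; the frame is modeled as a plain list of rows.

```python
def split_fields(frame):
    """Split a frame (list of rows) into its even and odd fields.

    Assumed convention: the even field holds rows 0, 2, 4, ...
    and the odd field holds rows 1, 3, 5, ...
    """
    return frame[0::2], frame[1::2]


def interleave_fields(even_field, odd_field):
    """Rebuild a frame by alternating rows of the two fields."""
    frame = []
    for even_row, odd_row in zip(even_field, odd_field):
        frame.append(even_row)
        frame.append(odd_row)
    # A frame with an odd number of rows has one extra even-field row.
    if len(even_field) > len(odd_field):
        frame.append(even_field[-1])
    return frame


# Each row is labeled with its own row index for readability.
frame = [[0, 0], [1, 1], [2, 2], [3, 3]]
even_field, odd_field = split_fields(frame)
rebuilt = interleave_fields(even_field, odd_field)
```

An encoder that codes field pictures would process `even_field` and `odd_field` separately, whereas frame-based coding operates on the interleaved rows of `frame`.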
Generally, the enhanced layer provides pictures with a higher resolution than those of the base layer. Accordingly, if pictures of the layers have different resolutions when different sources are coded into the respective layers, interpolation must also be performed on the base layer pictures to increase their resolution (i.e., picture size). The closer the base layer pictures used for interlayer prediction are to the enhanced layer pictures being predictively coded, the higher the coding efficiency. It is therefore necessary to provide an interpolation method that takes into consideration the scanning modes of the video signals of the layers.
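Why the scanning mode matters for interpolation can be seen in a small sketch that doubles the vertical resolution of an interlaced picture in two ways. Everything here is an illustrative assumption rather than a prescribed method: the linear row interpolation, the repetition of the last row, the function names, and the requirement of an even number of rows. Sample phase alignment between the fields is ignored.

```python
def upsample_rows_2x(rows):
    """Double the row count: keep each row, then insert the average of
    it and the next row (the last row is averaged with itself)."""
    out = []
    for i, row in enumerate(rows):
        nxt = rows[i + 1] if i + 1 < len(rows) else row
        out.append(list(row))
        out.append([(a + b) / 2 for a, b in zip(row, nxt)])
    return out


def upsample_interlaced_2x(frame):
    """Upsample each field separately, then re-interleave the rows.

    Assumes an even number of rows; even field = rows 0, 2, 4, ...
    """
    up_even = upsample_rows_2x(frame[0::2])
    up_odd = upsample_rows_2x(frame[1::2])
    out = []
    for even_row, odd_row in zip(up_even, up_odd):
        out.append(even_row)
        out.append(odd_row)
    return out


# One-column picture: even-field rows are dark, odd-field rows are bright,
# as can happen when the two fields capture different time instants.
frame = [[0], [100], [10], [110]]

frame_wise = upsample_rows_2x(frame)        # ignores the scanning mode
field_wise = upsample_interlaced_2x(frame)  # respects the scanning mode
```

The frame-wise result averages rows that belong to different fields and produces values such as 50 and 55 that appear in neither field, while the field-wise result keeps each field's rows within that field's own range of values.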