Technical Field
The present invention relates to a sampling filter process for scalable video coding. More specifically, the present invention relates to re-sampling using video data obtained from an encoder or decoder process, where the encoder or decoder process can be MPEG-4 Advanced Video Coding (AVC) or High Efficiency Video Coding (HEVC).
Related Art
An example of a scalable video coding system using two layers is shown in FIG. 1. In the system of FIG. 1, one of the two layers is the Base Layer (BL), where a BL video is encoded in an encoder E0, labeled 100, and decoded in a decoder D0, labeled 102, to produce a base layer video output BL out. The BL video is typically at a lower quality than the remaining layers, such as the Full Resolution (FR) layer that receives an input FR (y). The FR layer includes an encoder E1, labeled 104, and a decoder D1, labeled 106. When encoder E1 104 encodes the full resolution video, it uses cross-layer (CL) information from the BL encoder 100 to produce enhancement layer (EL) information. The corresponding EL bitstream of the full resolution layer is then decoded in decoder D1 106 using the CL information from decoder D0 102 of the BL to output full resolution video, FR out. By using CL information in a scalable video coding system, the encoded information can be transmitted more efficiently in the EL than if the FR were encoded independently without the CL information. An example of coding that can use the two layers shown in FIG. 1 is video coding with AVC used for the base layer and the Scalable Video Coding (SVC) extension of AVC used for the enhancement layer. Another example that can use two layer coding is HEVC.
FIG. 1 further shows block 108 with a down-arrow r, which represents a resolution reduction from the FR to the BL and indicates that the BL can be created by downsampling the FR layer data. Although downsampling is shown by the arrow r of block 108 of FIG. 1, the BL can also be created independently, without the downsampling process. Overall, the down arrow of block 108 illustrates that in spatial scalability, the base layer BL is typically at a lower spatial resolution than the full resolution FR layer. For example, when r=2 and the FR resolution is 3840×2160, the corresponding BL resolution is 1920×1080.
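The resolution arithmetic in the example above can be checked with a minimal sketch. The function name base_layer_resolution is hypothetical, and the sketch assumes that r is a positive integer that divides both FR dimensions evenly, which the text does not require in general.

```python
def base_layer_resolution(fr_width, fr_height, r):
    """Return (width, height) of a BL obtained by reducing the FR by a factor r.

    Illustrative assumption: r is a positive integer that divides both
    dimensions evenly. Non-integer ratios are outside this sketch.
    """
    if r <= 0:
        raise ValueError("r must be a positive integer")
    if fr_width % r != 0 or fr_height % r != 0:
        raise ValueError("this sketch assumes r divides both dimensions")
    return fr_width // r, fr_height // r


# The example from the text: r = 2 and a 3840x2160 FR picture.
print(base_layer_resolution(3840, 2160, 2))  # (1920, 1080)
```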
The cross-layer CL information provided from the BL to the FR layer shown in FIG. 1 illustrates that the CL information can be used in the coding of the FR video in the EL. In one example, the CL information includes pixel information derived from the encoding and decoding process of the BL. Examples of BL encoding and decoding are AVC and HEVC. Because the BL pictures are at a different spatial resolution than the FR pictures, a BL picture needs to be upsampled (or re-sampled) back to the FR picture resolution in order to generate a suitable prediction for the FR picture.
FIG. 2 illustrates an upsampling process, performed in block 200, that takes data from the BL to the EL. The components of the upsampling block 200 can be included in either or both of the encoder E1 104 and the decoder D1 106 of the EL of the video coding system of FIG. 1. The BL data at resolution x that is input into upsampling block 200 in FIG. 2 is derived from one or more of the encoding and decoding processes of the BL. A BL picture is upsampled using the up-arrow r process of block 200 to generate the EL resolution output y′, which can be used as a basis for prediction of the original FR input y.
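One common way to use y′ as a basis for prediction of y is to code only the difference between the two. The sketch below illustrates that idea under stated assumptions: the text does not specify residual coding, the helper names are hypothetical, and the sample values are invented for illustration rather than taken from any codec.

```python
def prediction_residual(y, y_prime):
    """Sample-wise difference between the FR input y and its prediction y_prime."""
    if len(y) != len(y_prime):
        raise ValueError("y and y_prime must be at the same resolution")
    return [a - b for a, b in zip(y, y_prime)]


def sum_abs(values):
    """Sum of absolute values, used here as a rough proxy for coding cost."""
    return sum(abs(v) for v in values)


# Invented sample values: one row of an FR picture and its upsampled prediction.
y = [100, 102, 104, 107, 110, 112]
y_prime = [100, 101, 104, 106, 110, 111]

residual = prediction_residual(y, y_prime)
print(residual)                    # [0, 1, 0, 1, 0, 1]
print(sum_abs(y), sum_abs(residual))  # 635 3

# A decoder holding y_prime and the residual can rebuild y exactly.
reconstructed = [p + d for p, d in zip(y_prime, residual)]
print(reconstructed == y)          # True
```

In this toy case the residual has far smaller magnitude than y itself, which is the intuition behind the efficiency gain attributed to CL information.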
The upsampling block 200 works by interpolating from the BL data to recreate the FR data that was removed or modified in producing the BL. For instance, if every other pixel is dropped from the FR in block 108 to create the lower resolution BL data, the dropped pixels can be approximated by interpolation or other techniques in the upsampling block 200 to generate the EL resolution output y′. The data y′ is then used to make encoding and decoding of the EL data more efficient.
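The pixel-dropping example can be sketched in one dimension as follows. This is a minimal illustration, not the filter used by any particular codec: it assumes linear interpolation with the last sample repeated at the edge, the function names are hypothetical, and the sample row is invented.

```python
def downsample_drop(fr_row, r=2):
    """Keep every r-th sample, dropping the rest (the role of block 108)."""
    return fr_row[::r]


def upsample_linear(bl_row, r=2):
    """Linearly interpolate r output samples per BL sample (the role of block 200).

    Assumption: the last BL sample is repeated at the right edge.
    """
    out = []
    n = len(bl_row)
    for i in range(n):
        left = bl_row[i]
        right = bl_row[i + 1] if i + 1 < n else bl_row[i]
        for k in range(r):
            out.append(left + (right - left) * k / r)
    return out


# Invented FR row with a constant slope, so interpolation is easy to follow.
y = [10, 12, 14, 16, 18, 20, 22, 24]
x = downsample_drop(y)
y_prime = upsample_linear(x)

print(x)        # [10, 14, 18, 22]
print(y_prime)  # [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 22.0]
```

Here every dropped sample except the last is recovered exactly because the row is linear; the final sample differs (22.0 versus 24) because of the edge assumption, which shows that y′ is an approximation of y rather than a copy.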