Most existing image and video coding standards, such as JPEG, H.264/AVC, and VC-1, as well as the upcoming next-generation video codec standard HEVC (High Efficiency Video Coding), employ block-based transform coding as a tool to efficiently compress the input image and video signals. The pixel-domain data is transformed to the frequency domain using a transform process on a block-by-block basis. For typical images, most of the energy is concentrated in the low-frequency transform coefficients. Following the transform, a larger quantizer step size can be used for the higher-frequency transform coefficients in order to compact energy more efficiently and attain better compression. Hence, it is desirable to devise the optimal transform for each image block so as to fully de-correlate the transform coefficients. The Karhunen-Loève Transform (KLT) possesses several attractive properties, e.g., in high-resolution quantization of Gaussian signals and full de-correlation of transform coefficients. However, practical use of the KLT is limited by its high computational complexity, and it has been shown in “Discrete cosine transform-algorithms, advantages and applications,” by K. R. Rao and P. Yip (1990), that the Discrete Cosine Transform (DCT) provides an attractive alternative to the KLT, with energy compaction and performance close to those of the KLT. With the advent of intra-prediction, however, this is no longer the case, and the optimal transform should be adaptive to the intra-prediction mode.
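The energy-compaction property described above can be illustrated with a minimal sketch. It assumes an orthonormal DCT-II and a hypothetical smooth row of eight pixel values; the helper names (`dct_matrix`, `transform`, `low_freq_energy_fraction`) are illustrative only and are not taken from any standard or codec implementation.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k holds the k-th basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def transform(matrix, x):
    """Apply a transform matrix to a 1-D signal (matrix-vector product)."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]

def low_freq_energy_fraction(x, keep):
    """Fraction of signal energy held by the first `keep` DCT coefficients."""
    coeffs = transform(dct_matrix(len(x)), x)
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total

# Hypothetical smooth row of eight pixel values (a gentle ramp).
row = [10, 12, 14, 16, 18, 20, 22, 24]
fraction = low_freq_energy_fraction(row, keep=2)
print(f"energy in the first 2 of 8 coefficients: {fraction:.4f}")
```

For a smooth input such as this ramp, nearly all of the energy lands in the lowest-frequency coefficients, which is why the higher-frequency coefficients can tolerate coarser quantization.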
In the ongoing standardization of HEVC, non-conventional transforms, in addition to the standard DCT, are being investigated for intra-prediction residuals (Robert Cohen et al., “Tool Experiment 7: MDDT Simplification”, ITU-T JCTVC-B307, Geneva, Switzerland, July 2010). These transforms can broadly be categorized into two classes: (a) training-based transforms and (b) model-based transforms. Prominent amongst the training-based transforms is the Mode-Dependent Directional Transform (MDDT) (Y. Ye and M. Karczewicz, “Improved Intra coding,” ITU-T Q.6/SG-16 VCEG, VCEG-AG11, Shenzhen, China, October 2007). In MDDT, a large training set of error residuals is collected for each intra-prediction mode, and the optimal transform matrix is then computed from the residual training set. However, MDDT requires a large number of transform matrices, up to eighteen at block sizes N=4 and 8. The other class, model-based transforms, assumes that the video signal can be modeled as a first-order Gauss-Markov process and then derives the optimal transform analytically. These model-based transforms require two transform matrices for each block size.
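As a rough illustration of the model-based view, the sketch below builds the covariance matrix of a first-order Gauss-Markov process, whose entry (i, j) is rho^|i-j|, and measures how close the DCT comes to diagonalizing it. The correlation value rho = 0.95 and the block size N = 4 are assumptions chosen for illustration, and all function names are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k holds the k-th basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def gauss_markov_cov(n, rho):
    """Covariance of a unit-variance first-order Gauss-Markov process."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def diagonal_energy_fraction(n, rho):
    """Share of squared entries of C R C^T that lies on the diagonal.

    A value of 1.0 would mean the transform fully de-correlates the model.
    """
    c = dct_matrix(n)
    t = matmul(matmul(c, gauss_markov_cov(n, rho)), transpose(c))
    total = sum(v * v for r in t for v in r)
    diag = sum(t[i][i] ** 2 for i in range(n))
    return diag / total

# Assumed values: rho = 0.95 (highly correlated pixels), block size N = 4.
print(diagonal_energy_fraction(4, 0.95))
```

This sketch covers only the plain Gauss-Markov model without boundary information; it does not model intra-prediction residuals, for which the analytically derived transform differs from the DCT.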
In J. Han, A. Saxena and K. Rose, “Towards jointly optimal spatial prediction and adaptive transform in video/image coding,” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 2010, pp. 726-729, a Discrete Sine Transform (DST) with frequency and phase components different from those of the conventional DCT was analytically derived for the first-order Gauss-Markov model, for the case in which boundary information is available in one direction, as in intra-prediction in H.264/AVC (T. Wiegand, G. J. Sullivan, G. Bjontegaard and A. Luthra, “Overview of the H.264/AVC video coding standard,” IEEE Transactions on Circuits and Systems for Video Technology, July 2003). The authors also showed that if prediction is not performed along a particular direction, then the DCT performs close to the KLT. The idea was applied to the vertical and horizontal modes of intra-prediction in H.264/AVC, and a combination of the proposed DST and the conventional DCT was used adaptively. Attempts have been made to extend similar ideas experimentally, without a theoretical justification, by applying the combination of DST and DCT to the other seven prediction modes in H.264/AVC; these attempts showed only a minor loss in performance in comparison to MDDT (C. Yeo, Y. H. Tan, Z. Li and S. Rahardja, “Mode-dependent fast separable KLT for block-based intra coding,” ITU-T JCTVC-B024, Geneva, Switzerland, July 2010).
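For orientation, the DST discussed above is commonly written (and often referred to as DST Type-VII) with entries 2/sqrt(2N+1) · sin((2i-1)jπ/(2N+1)) for frequency index i and sample index j, both running from 1 to N. The sketch below builds this matrix under that assumption; the function names are hypothetical, and the formula should be checked against the cited paper before being relied upon.

```python
import math

def dst7_matrix(n):
    """DST with odd-frequency, (2N+1)-period sines, as commonly cited.

    Row k (0-based) is the basis vector with entries
    2/sqrt(2n+1) * sin((2k+1) * (m+1) * pi / (2n+1)) for sample m (0-based).
    """
    scale = 2.0 / math.sqrt(2 * n + 1)
    return [[scale * math.sin((2 * k + 1) * (m + 1) * math.pi / (2 * n + 1))
             for m in range(n)] for k in range(n)]

def max_orthonormality_error(matrix):
    """Largest deviation of M * M^T from the identity matrix."""
    n = len(matrix)
    worst = 0.0
    for a in range(n):
        for b in range(n):
            dot = sum(matrix[a][m] * matrix[b][m] for m in range(n))
            worst = max(worst, abs(dot - (1.0 if a == b else 0.0)))
    return worst

s4 = dst7_matrix(4)
print([round(v, 4) for v in s4[0]])   # lowest-frequency basis vector
print(max_orthonormality_error(s4))   # near zero for an orthonormal matrix
```

The lowest-frequency basis vector starts small at the sample next to the known boundary and grows with distance from it, which matches the intuition that the prediction residual is smallest near the boundary used for prediction; the lowest-frequency DCT basis vector, by contrast, is flat.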
In addition, the DST matrices should be appropriately scaled to take into account the effect of quantization scaling matrices. The prior art does not describe modifying the DST matrix coefficients so that their scaling matches that of the DCT as implemented in HEVC.
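One way to picture the scaling issue is to multiply the real-valued DST entries by the same factor used to form an integer DCT and then round. The sketch below is illustrative only: the scale factor of 128 for a 4-point transform is an assumption (it maps the flat DCT basis entry 1/sqrt(4) to 64), the DST formula is the commonly cited one rather than something stated in this text, and the function names are hypothetical.

```python
import math

def dst7_entry(n, k, m):
    """Real-valued DST entry for frequency k and sample m (both 0-based)."""
    return (2.0 / math.sqrt(2 * n + 1)) * math.sin(
        (2 * k + 1) * (m + 1) * math.pi / (2 * n + 1))

def scaled_integer_dst(n, scale):
    """Round scale * DST to integers so it shares the DCT's overall gain."""
    return [[round(scale * dst7_entry(n, k, m)) for m in range(n)]
            for k in range(n)]

def scaled_dct_dc_entry(n, scale):
    """Integer value of the flat (DC) DCT basis entry at the same scale."""
    return round(scale * math.sqrt(1.0 / n))

N = 4
SCALE = 128  # assumed common scale factor for the 4-point transforms

int_dst = scaled_integer_dst(N, SCALE)
for row in int_dst:
    # Squared row norm should stay close to SCALE**2 after rounding.
    print(row, sum(v * v for v in row))
print("DCT DC entry at the same scale:", scaled_dct_dc_entry(N, SCALE))
```

Because every row of the rounded DST keeps a squared norm close to SCALE², the same quantization scaling designed for the integer DCT can be reused without a separate normalization step; the residual mismatch comes only from rounding.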
Therefore, there is a need in the art for a video codec that improves compression efficiency while utilizing a low-complexity transform.