A popular and effective video coding method, often referred to as a block-based hybrid encoding method, involves using block-based temporal prediction and transform coding. This method is essentially the core of all the international video coding standards.
In a block-based hybrid video encoder, each picture is divided into blocks. Each block in an inter picture is coded using a combination of motion-compensated prediction and transform coding. While the motion-compensated prediction removes the temporal correlation, the transform coding further de-correlates the signals in the spatial domain and compacts the energy into a few coefficients.
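The energy-compaction property described above can be illustrated with a minimal sketch. It assumes a floating-point orthonormal DCT-II and a hypothetical, smoothly varying 8×8 block; the standards themselves use specific (often integer) transforms, so the helper names `dct_1d` and `dct_2d` and the sample data are illustrative only.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence (illustrative, not a standard's integer transform)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        out.append(scale * acc)
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform the rows, then the columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# Hypothetical smooth 8x8 block (a gentle ramp), chosen only for illustration.
block = [[10 + x + 2 * y for x in range(8)] for y in range(8)]
coeffs = dct_2d(block)

spatial_energy = sum(v * v for row in block for v in row)
total_energy = sum(c * c for row in coeffs for c in row)
# Energy held by the four lowest-frequency coefficients (top-left 2x2 corner).
low_energy = sum(coeffs[v][u] ** 2 for v in range(2) for u in range(2))
fraction = low_energy / total_energy
```

Because the transform in this sketch is orthonormal, `total_energy` equals `spatial_energy` up to rounding, and for smooth content nearly all of it falls into the few low-frequency coefficients.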
Different transforms have been developed for various international standards. In the standards prior to the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) Standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 Recommendation (hereinafter the “MPEG-4 AVC Standard”), the transform block size has typically been 8×8. The 8×8 transform size has the advantages of being dyadic and of being large enough to capture trends and periodicities, while being small enough to minimize spreading effects due to local transients over the transform area.
In the MPEG-4 AVC Standard, three additional transforms are also available for use as follows:
A Hadamard transform for the 4×4 array of luma DC coefficients in intra macroblocks predicted in 16×16 mode;
A Hadamard transform for the 2×2 array of chroma DC coefficients in any macroblock; and
A 4×4 DCT-based transform.
This smaller size of 4×4 enables the encoder to better adapt the prediction error coding to the boundaries of moving objects, to match the transform block size with the smallest block size of the motion compensation, and to generally better adapt the transform to the local prediction error signal.
To encode the residual data of the luma component for inter frames, the encoder chooses from 8×8 and 4×4 transforms. In contrast, the transform is fixed for the chroma component as follows: a 4×4 transform cascaded with a 2×2 Hadamard transform for DC coefficients.
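The fixed chroma cascade can be sketched as follows. This is a minimal illustration, assuming 4:2:0 sampling (one 8×8 chroma block per macroblock, which yields the 2×2 array of DC coefficients) and using the well-known 4×4 integer core transform matrix of the MPEG-4 AVC Standard; the scaling and quantization stages are omitted, and the function names are hypothetical.

```python
# 4x4 integer core transform matrix of the MPEG-4 AVC Standard (scaling omitted).
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]
# 2x2 Hadamard matrix applied to the chroma DC coefficients.
H2 = [[1, 1],
      [1, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(r) for r in zip(*m)]

def core_4x4(block):
    """Y = CF * X * CF^T for one 4x4 residual block."""
    return matmul(matmul(CF, block), transpose(CF))

def hadamard_2x2(dc):
    """Y_DC = H2 * D * H2 for the 2x2 array of DC coefficients."""
    return matmul(matmul(H2, dc), H2)

def chroma_cascade(chroma_8x8):
    """Transform four 4x4 sub-blocks, then cascade a 2x2 Hadamard on their DCs."""
    coeffs = [[None, None], [None, None]]
    dc = [[0, 0], [0, 0]]
    for by in range(2):
        for bx in range(2):
            sub = [row[4 * bx:4 * bx + 4] for row in chroma_8x8[4 * by:4 * by + 4]]
            coeffs[by][bx] = core_4x4(sub)
            dc[by][bx] = coeffs[by][bx][0][0]
    return coeffs, hadamard_2x2(dc)

# Hypothetical flat residual block: every chroma sample equals 3.
flat = [[3] * 8 for _ in range(8)]
coeffs, dc_out = chroma_cascade(flat)
print(dc_out)  # [[192, 0], [0, 0]]
```

For the flat block, all of the energy ends up in a single coefficient after the Hadamard stage. The cascade is applied in the same way regardless of the block content.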
Transforms for Luma and Chroma Residue for Inter Pictures in KTA
VCEG “key technical area” (KTA) software has provided a common platform to integrate the new advances in video coding after the finalization of the MPEG-4 AVC Standard. The proposals to use extended block sizes and large transforms were adopted into KTA. In the current KTA software, motion partitions larger than 16×16 pixels are implemented. In particular, macroblocks of sizes 64×64, 64×32, 32×64, 32×32, 32×16, and 16×32 are used in addition to the existing MPEG-4 AVC Standard partitioning sizes. Larger block transforms are also used to better capture the usually smoother content in high-definition video. The larger block transforms include 16×16, 16×8, and 8×16 transforms. Note that all these new transforms are applied only to the luma component. The transform for the chroma component is the same as in the MPEG-4 AVC Standard, namely a 4×4 transform cascaded with a 2×2 Hadamard transform for the DC coefficients. Such a fixed transform does not consider the characteristics of the video content.
Typical Chroma Transform
Turning to FIG. 1, a conventional method for chroma encoding in a video encoder is indicated generally by the reference numeral 100. The method 100 includes a start block 110 that passes control to a loop limit block 120. The loop limit block 120 begins a first loop (Loop (1)), using a variable j having a range from 1, . . . , number (#) of pictures (e.g., in an input video sequence), and passes control to a loop limit block 130. The loop limit block 130 begins a second loop (Loop (2)), using a variable i having a range from 1, . . . , number (#) of blocks (e.g., in a current picture of the input video sequence), and passes control to a function block 140. The function block 140 encodes the luma components for block i in picture j, and passes control to a function block 150. The function block 150 encodes the chroma components for block i in picture j with a fixed transform, and passes control to a loop limit block 160. The loop limit block 160 ends the Loop (2), and passes control to a function block 170. The function block 170 ends the Loop (1), and passes control to an end block 199.
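The control flow of method 100 can be summarized in a short sketch. The function `encode_sequence` and the callables `encode_luma` and `encode_chroma_fixed` are hypothetical placeholders for function blocks 140 and 150; they are not taken from any reference software, and the block data here is a stand-in.

```python
def encode_sequence(pictures, encode_luma, encode_chroma_fixed):
    """Nested loops of method 100: pictures (Loop (1)), then blocks (Loop (2))."""
    output = []
    for j, picture in enumerate(pictures, start=1):       # Loop (1): j = 1..# of pictures
        for i, block in enumerate(picture, start=1):      # Loop (2): i = 1..# of blocks
            luma = encode_luma(block)                     # function block 140
            chroma = encode_chroma_fixed(block)           # function block 150 (fixed transform)
            output.append((j, i, luma, chroma))
    return output

# Stand-in data: two pictures with three "blocks" each, and trivial stub encoders.
pictures = [["a", "b", "c"], ["d", "e", "f"]]
result = encode_sequence(
    pictures,
    encode_luma=lambda blk: "luma(" + blk + ")",
    encode_chroma_fixed=lambda blk: "chroma(" + blk + ")",
)
```

Note that the chroma call takes no transform-selection argument in this sketch: whatever the luma path decides, the chroma path is the same for every block i and every picture j.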
Turning to FIG. 2, a conventional method for chroma decoding in a video decoder is indicated generally by the reference numeral 200. The method 200 includes a start block 210 that passes control to a loop limit block 220. The loop limit block 220 begins a first loop (Loop (1)), using a variable j having a range from 1, . . . , number (#) of pictures (e.g., in an input video sequence), and passes control to a loop limit block 230. The loop limit block 230 begins a second loop (Loop (2)), using a variable i having a range from 1, . . . , number (#) of blocks (e.g., in a current picture of the input video sequence), and passes control to a function block 240. The function block 240 decodes the luma components for block i in picture j, and passes control to a function block 250. The function block 250 decodes the chroma components for block i in picture j with a fixed transform, and passes control to a loop limit block 260. The loop limit block 260 ends the Loop (2), and passes control to a function block 270. The function block 270 ends the Loop (1), and passes control to an end block 299.
Hence, regarding method 100, as noted above, the transform is fixed for each block and each picture for chroma, while the luma component may use adaptive transforms. As part of method 100, the luma component of the block is encoded, possibly with an adaptively chosen transform. However, the chroma component is always encoded and, hence, also decoded (e.g., by method 200), with a fixed transform. Disadvantageously, as noted above, such a fixed transform does not consider the characteristics of the video content.