When encoding data representing a sequence of visual images, the data is often compressed so that only the differences between images are encoded into the encoded image data used by a decoder when reconstructing the images, rather than encoding the data describing the entire image for each image. Thus, encoded data to be used for reconstruction of an image, or a part of an image, will include references to other images in the sequence of images, or to other parts of the currently reconstructed image. Such references could, for example, include instructions on how to spatially translate parts of a previous image to obtain parts of the currently reconstructed image (referred to as Inter Prediction based coding), or instructions on how to alter a known part of the current image to obtain an unknown part of the currently reconstructed image (referred to as Intra Prediction based coding). The image resulting from having followed such instructions will here be referred to as the prediction image, while the instructions on how to obtain the prediction image from already decoded information will be referred to as the prediction parameters (PP). The prediction parameters are provided to a decoder, while the prediction image is not.
Prediction parameters by which an image could be exactly predicted can oftentimes not be efficiently provided. In order to still arrive at an acceptable degree of compression, a prediction image, which is not an exact copy of the original image, is therefore typically accepted. In order to further improve the decoded image, a representation of the prediction error is often included in the encoded image data. A decoder can thus use information on the prediction error to improve a predicted image that has been obtained by use of the prediction parameters.
Typically, a visual image is divided into a number of blocks, where a prediction and information on the prediction error are encoded for each block. Such a block includes a suitable number of samples or pixels, for example 4×4, 8×8, 16×16, 4×8, or any other suitable number of pixels.
The prediction error can for example be represented by a residual block, where the residual block describes differences between the original block and the prediction block in a pixel-wise manner: For a pixel coordinate (i, j), the residual block element RB(i, j) is often defined as the difference between the original block element OB(i, j) and the prediction block element PB(i, j): RB(i, j)=OB(i, j)−PB(i, j). The elements of an original block and a prediction block, and therefore also the elements of a residual block, typically represent the same time instant.
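The pixel-wise definition RB(i, j)=OB(i, j)−PB(i, j) can be illustrated with a minimal sketch. The function names `residual_block` and `reconstruct`, and the 4×4 sample values, are assumptions chosen for illustration and do not come from any standard.

```python
def residual_block(original, prediction):
    """Return RB with RB[i][j] = OB[i][j] - PB[i][j] for every pixel."""
    return [[o - p for o, p in zip(o_row, p_row)]
            for o_row, p_row in zip(original, prediction)]


def reconstruct(prediction, residual):
    """Invert the definition: OB[i][j] = PB[i][j] + RB[i][j]."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual)]


# Hypothetical 4x4 original block (OB) and prediction block (PB).
OB = [[52, 55, 61, 66],
      [63, 59, 55, 90],
      [62, 59, 68, 113],
      [63, 58, 71, 122]]
PB = [[50, 54, 60, 64],
      [60, 60, 56, 88],
      [60, 60, 66, 110],
      [64, 60, 70, 120]]

RB = residual_block(OB, PB)
```

Adding the residual block back to the prediction block recovers the original block exactly, which is what a decoder would do if the residual were conveyed without loss.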
In order to exploit any remaining correlation between different samples in a residual block, spatial transforms are often applied to a residual block as part of the encoding procedure. Such application of a spatial transform will result in a transformed block (TB) comprising transformed coefficients, here referred to as TB coefficients. This transformed block, together with the prediction block and information on previous images, can be used to reconstruct an exact copy of the original image block. However, the representation of the transformed block often requires a large number of bits, and the TB coefficients are therefore typically quantized and entropy coded as part of the encoding procedure. An example of a standard for encoding of audiovisual services which uses a spatial transform of residual blocks is the ITU-T standard H.264 of March 2009, “Advanced video coding for generic audiovisual services”. The H.264 standard, like many other encoding standards, uses for example the Discrete Cosine Transform (DCT).
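The transform and quantization steps described above can be sketched as follows. This is a minimal illustration that assumes a floating-point orthonormal DCT-II on a square block and a uniform quantizer with a single step size; it is not the integer transform or the quantizer specified by H.264, and all function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k holds the k-th basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def forward_transform(rb):
    """TB = C * RB * C^T for a square residual block RB."""
    c = dct_matrix(len(rb))
    return matmul(matmul(c, rb), transpose(c))


def inverse_transform(tb):
    """RB = C^T * TB * C, valid because C is orthonormal."""
    c = dct_matrix(len(tb))
    return matmul(matmul(transpose(c), tb), c)


def quantize(tb, step):
    """Map each TB coefficient to an integer level (uniform quantizer)."""
    return [[round(v / step) for v in row] for row in tb]


def dequantize(levels, step):
    return [[level * step for level in row] for row in levels]


def max_abs_error(a, b):
    return max(abs(x - y) for ra, rb_ in zip(a, b) for x, y in zip(ra, rb_))


# Hypothetical 4x4 residual block.
RB = [[2, 1, 1, 2],
      [3, -1, -1, 2],
      [2, -1, 2, 3],
      [-1, -2, 1, 2]]

TB = forward_transform(RB)
exact = inverse_transform(TB)                     # no quantization
step = 2.0                                        # assumed step size
lossy = inverse_transform(dequantize(quantize(TB, step), step))
```

Without quantization the inverse transform returns the residual block up to floating-point rounding. With quantization the reconstruction is only approximate: a larger step size gives fewer distinct levels to entropy code, at the cost of a larger reconstruction error.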
The compression of media typically involves a trade-off between the degree of compression, the amount of distortion introduced by the compression and the computational resources required to compress and/or reconstruct the media. A high degree of compression will result in more efficient storage of the compressed media, as well as a smaller bandwidth requirement upon transmission of the media from the encoder to a decoder. However, a higher degree of compression often has the drawback of an increased amount of distortion, and/or an increase in the amount of computational resources required upon compression/reconstruction of the media.
Some spatial transforms, like the DCT, will, when applied to a signal wherein the samples are highly correlated, result in transformed blocks that can be more efficiently encoded than when the signal samples are less correlated. Thus, for highly correlated signals, an efficient compression can be obtained with a low amount of distortion. For poorly or negatively correlated signals, however, an accurate representation of the transformed block obtained by such spatial transforms typically requires a large number of bits, thus reducing the compression efficiency (or the representation accuracy) of the encoding scheme. When high performance prediction tools are used, the elements of the residual block are often poorly or negatively correlated. In “Integer Sine Transform for Inter Frame”, ITU-T SG16/Q6, San Diego, Oct. 8-10, 2008, it has therefore been suggested that a transform coding scheme be employed wherein a selection is made between the integer cosine transform (ICT) and an integer sine transform (IST). The IST is more suitable for transformation of poorly correlated signals than the DCT or the ICT.
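The dependence of DCT efficiency on sample correlation can be illustrated numerically. The sketch below assumes a first-order autoregressive signal model in which samples i and j have correlation rho^|i−j|, and uses the classical transform coding gain (the ratio of the arithmetic mean to the geometric mean of the coefficient variances) as the efficiency measure. Both the model and the measure are assumptions made for this illustration and are not taken from the cited contribution; the function names are hypothetical.

```python
import math


def dct_basis(n):
    """Orthonormal DCT-II basis vectors, one per row."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def coefficient_variances(rho, n=8):
    """Variance of each DCT coefficient for unit-variance samples whose
    correlation is rho**|i - j| (assumed AR(1) model)."""
    variances = []
    for row in dct_basis(n):
        variances.append(sum(row[i] * row[j] * rho ** abs(i - j)
                             for i in range(n) for j in range(n)))
    return variances


def coding_gain(rho, n=8):
    """Arithmetic mean over geometric mean of the coefficient variances.
    A value of 1 means the transform brings no benefit."""
    var = coefficient_variances(rho, n)
    arithmetic = sum(var) / n
    geometric = math.exp(sum(math.log(v) for v in var) / n)
    return arithmetic / geometric


gain_high = coding_gain(0.95)   # highly correlated samples
gain_low = coding_gain(0.1)     # poorly correlated samples
gain_none = coding_gain(0.0)    # uncorrelated samples
```

Under these assumptions the gain is noticeably larger for highly correlated samples and approaches 1 as the correlation vanishes, which is consistent with the observation that residual blocks left by high performance prediction benefit less from the DCT.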
Although a transform coding scheme wherein one of the ICT and IST is selected depending on the correlation of the elements in a residual block may in some circumstances improve the degree of compression with maintained representation accuracy, such a scheme would require that both transforms be defined and implemented at both encoder and decoder. This typically increases the hardware requirements on the encoder and the decoder, making hardware optimization difficult. Typically, the required hardware will become both larger and more expensive.