The technology described herein relates to the processing of video data, and in particular to methods of and apparatus for encoding and decoding video image data.
Video image data (e.g. RGB or YUV values) is generally encoded and then decoded, e.g. after transmission in the form of an encoded bitstream, according to a predetermined video encoding format, such as VP9. Video encoding formats such as VP9 can enable a significant reduction in the file size of video image data without a significant visible loss of image quality.
In encoded video data, generally each video frame is divided into a plurality of blocks (typically rectangles) of pixels of the frame (in VP9 encoding the blocks may be different sizes within a given frame) and each block is encoded and then decoded individually. In “differential” video coding standards such as VP9, each block of pixels within the video image data is usually encoded with respect to other encoded data, e.g. a reference block from a reference frame (such as a corresponding encoded block of pixels in a reference frame). Each encoded data block would therefore usually comprise a vector value (the so-called “motion vector”) pointing to the data for the reference frame and data (the “residual”) describing the differences between the data encoded in the current data block and the reference encoded data. (This thereby allows the video data for the block of the (current) frame to be constructed from the encoded video data pointed to by the motion vector and the difference data describing the differences between that block and the block of the current video frame.)
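By way of illustration only, the reconstruction of a block from a motion vector and a residual may be sketched as follows. The function name `reconstruct_block`, the whole-pixel motion vector, the 8-bit sample range and the absence of frame-boundary handling are all assumptions made for this sketch; practical codecs typically also support sub-pixel motion vectors with interpolation, which is omitted here.

```python
def reconstruct_block(reference, top, left, motion_vector, residual):
    """Rebuild one block from a reference frame, a motion vector and a residual.

    reference      -- 2-D list of sample values for the reference frame
    top, left      -- position of the block in the current frame
    motion_vector  -- (dy, dx) whole-pixel offset into the reference frame
    residual       -- 2-D list of differences, same size as the block
    """
    dy, dx = motion_vector
    height, width = len(residual), len(residual[0])
    block = []
    for y in range(height):
        row = []
        for x in range(width):
            # Prediction: the sample the motion vector points to.
            predicted = reference[top + dy + y][left + dx + x]
            # Add the difference data and clamp to the assumed 8-bit range.
            row.append(max(0, min(255, predicted + residual[y][x])))
        block.append(row)
    return block


# Toy 4x4 reference frame whose sample at (r, c) is 10 * r + c.
reference = [[10 * r + c for c in range(4)] for r in range(4)]
residual = [[1, -1], [0, 2]]
block = reconstruct_block(reference, top=0, left=0,
                          motion_vector=(1, 1), residual=residual)
print(block)  # [[12, 11], [21, 24]]
```

In this toy case the block at the top-left of the current frame is predicted from the 2x2 region one sample down and one sample to the right in the reference frame, and the residual corrects the prediction.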
The data block may then be encoded by, for example, transforming the residuals into a set of coefficients (e.g. using an approximate Discrete Cosine Transform (DCT)) which are then quantised. Within a bitstream of multiple frames and even within a frame of multiple blocks, the encoding may be performed in a number of different ways or according to a set of variable encoding parameters, e.g. depending on the video image data in each block or relative to the reference encoded data.
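The transform and quantisation step described above may, purely as an illustrative sketch, look like the following. It uses a floating-point orthonormal DCT-II and a single uniform quantisation step, both of which are assumptions; formats such as VP9 use integer approximations of the transform, and the names `dct_1d`, `dct_2d` and `quantise` are hypothetical.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a 1-D list of values."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, v in enumerate(values))
        out.append(scale * total)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    # Transpose back so that the result is indexed [row][column].
    return [list(row) for row in zip(*cols)]


def quantise(coefficients, step):
    """Uniform quantisation: divide by the step size and round to an integer."""
    return [[round(c / step) for c in row] for row in coefficients]


# A flat residual has all of its energy in the DC coefficient.
flat_residual = [[4, 4, 4, 4] for _ in range(4)]
levels = quantise(dct_2d(flat_residual), step=2)
print(levels[0])  # [8, 0, 0, 0]
```

A larger quantisation step maps more coefficients to zero, which is one way in which a variable encoding parameter can trade image quality against encoded size.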
In order that the encoded data is decoded correctly, an encoding indicator (such as the so-called “segment ID” in VP9 encoding) may be associated with each block when it is encoded that indicates how the block was encoded, i.e. so that the decoder knows how to decode the encoded data. The encoding indicator (e.g. the segment ID having an integer value from 0 to 7 in VP9 encoding) typically provides a reference to a predefined set of parameter values (e.g. including a quantisation parameter, a loop filter strength, a skip indication, etc.) that was used when encoding the block in question. Thus, when a frame of video image data is encoded, a set of encoding indicators (e.g. segment IDs), e.g. one for each block, can be produced, e.g. in the form of an encoding indicator map (a “segment map”), so they may be retrieved and used when decoding the encoded frame.
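As a non-authoritative sketch of how an encoding indicator can refer to a predefined set of parameter values, the following looks up the parameters for a block from a segment map. The class `SegmentParams`, the table contents and the function `params_for_block` are hypothetical and chosen only for illustration; only the 0 to 7 range of the segment ID is taken from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentParams:
    """One predefined set of parameter values (illustrative fields only)."""
    quantiser: int
    loop_filter_level: int
    skip: bool


# Hypothetical table: segment ID -> parameter values used for that segment.
SEGMENT_TABLE = {
    0: SegmentParams(quantiser=40, loop_filter_level=10, skip=False),
    1: SegmentParams(quantiser=60, loop_filter_level=16, skip=False),
    7: SegmentParams(quantiser=0, loop_filter_level=0, skip=True),
}

# Segment map: one segment ID per block, laid out in block rows and columns.
segment_map = [
    [0, 0, 1],
    [0, 7, 1],
]


def params_for_block(segment_map, table, row, col):
    """Return the parameter values that were used for the block at (row, col)."""
    segment_id = segment_map[row][col]
    if not 0 <= segment_id <= 7:
        raise ValueError(f"segment ID {segment_id} is outside the range 0 to 7")
    return table[segment_id]


print(params_for_block(segment_map, SEGMENT_TABLE, 0, 2).quantiser)  # 60
print(params_for_block(segment_map, SEGMENT_TABLE, 1, 1).skip)       # True
```

Storing a small integer per block rather than the full parameter set keeps the map compact, since many blocks in a frame typically share the same parameters.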
In video coding standards such as VP9, the encoding information for a given frame in a sequence of frames being decoded (or encoded), such as the motion vectors and the encoding indicators, may be used not only to decode (or encode) the current frame (the frame to which it relates), but also when determining the encoding information for the next frame in the sequence of video frames. Thus, this information for the current frame may need to be available when decoding (or encoding) the next frame in the sequence.
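The dependency between successive frames may be illustrated, under simplifying assumptions, by the following sketch, in which the motion vector of each block is derived by adding a per-block delta to the motion vector of the co-located block in the previous frame. This prediction rule and the name `decode_motion_vectors` are hypothetical and are not the scheme defined by VP9; the sketch is intended only to show why the information for the current frame has to be retained until the next frame has been processed.

```python
def decode_motion_vectors(deltas, previous_mvs):
    """Derive this frame's motion vectors from the previous frame's vectors.

    deltas        -- list of (dy, dx) corrections, one per block
    previous_mvs  -- list of (dy, dx) vectors from the previous frame,
                     or None when there is no previous frame
    """
    if previous_mvs is None:
        # First frame of the sequence: predict from zero vectors.
        previous_mvs = [(0, 0)] * len(deltas)
    return [(py + dy, px + dx)
            for (py, px), (dy, dx) in zip(previous_mvs, deltas)]


# Two frames, two blocks per frame.
frame_deltas = [
    [(1, 0), (0, 2)],
    [(0, 1), (-1, 0)],
]

previous_mvs = None
history = []
for deltas in frame_deltas:
    current_mvs = decode_motion_vectors(deltas, previous_mvs)
    history.append(current_mvs)
    # The current frame's vectors must remain available for the next frame.
    previous_mvs = current_mvs

print(history[1])  # [(1, 1), (-1, 2)]
```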
Furthermore, in VP9 encoding, for example, for any given frame, one of a number of encoding or decoding modes can be applied, e.g. on a per frame basis, that determine whether or not a new set of the encoding indicators (e.g. a segment map) is generated (when encoding) or provided (when decoding) for the frame in question, or, e.g., whether the encoding indicators for a previous frame are to be used for the frame in question, or whether no encoding indicators are to be generated/provided for the frame in question (e.g. they may be disabled for the frame). Therefore a usable set of encoding indicators may not always be encoded with each frame (e.g. no encoding indicators or a default set of encoding indicators may be provided instead).
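The per-frame choice between a new set of encoding indicators, the set for a previous frame, and no set at all may be sketched as follows. The enumeration `SegmentMapMode`, the function `segment_map_for_frame` and the use of an all-zero map as the default set are assumptions made for illustration only and are not taken from the VP9 format.

```python
from enum import Enum


class SegmentMapMode(Enum):
    """Hypothetical per-frame modes for the encoding indicators."""
    NEW = "new"                        # a new segment map accompanies the frame
    REUSE_PREVIOUS = "reuse_previous"  # use the previous frame's segment map
    DISABLED = "disabled"              # no encoding indicators for this frame


def segment_map_for_frame(mode, new_map, previous_map, num_blocks):
    """Select the segment map to use for a frame according to its mode."""
    if mode is SegmentMapMode.NEW:
        if new_map is None:
            raise ValueError("mode NEW requires a segment map for the frame")
        return list(new_map)
    if mode is SegmentMapMode.REUSE_PREVIOUS and previous_map is not None:
        return list(previous_map)
    # Disabled, or nothing to reuse: fall back to an assumed default set.
    return [0] * num_blocks


first = segment_map_for_frame(SegmentMapMode.NEW, [0, 3, 3, 1], None, 4)
second = segment_map_for_frame(SegmentMapMode.REUSE_PREVIOUS, None, first, 4)
third = segment_map_for_frame(SegmentMapMode.DISABLED, None, second, 4)
print(first, second, third)  # [0, 3, 3, 1] [0, 3, 3, 1] [0, 0, 0, 0]
```

In this sketch the third frame yields only the default set, so a later frame that is to reuse a previous map would have no usable map from the third frame to draw on, which corresponds to the situation noted above.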
The Applicants believe that there remains scope for improvements to methods of and apparatus for encoding and decoding video image data.