The technology described herein relates to the processing of video data, and in particular to methods of and apparatus for encoding video image data.
So that video images may be generated by one device (e.g. a video camera) and then transmitted (streamed) over a link (e.g. via the internet) to be viewed on another device (e.g. a computer), video image data (e.g. RGB or YUV values) is generally encoded for transmission as an encoded bitstream, according to a predetermined video encoding format, such as HEVC, H264, VP8 or VP9. Video encoding formats can enable a significant reduction in the size of the video image data (which thus aids the efficient streaming of the video image data) without a significant visible loss of image quality when the video images are viewed.
Video image data is typically generated as a sequence of frames and these frames may be encoded in a number of different ways. To enable efficient encoding of the video image data, in “differential” video coding standards such as HEVC, H264, VP8 and VP9, the frames may be split up into smaller blocks which are then encoded relative to each other, e.g. to take into account differences between the blocks. For a frame of video image data being encoded, such “source” blocks may be encoded relative to corresponding “reference” blocks in other (e.g. previously encoded) frames in the sequence of frames (using an “inter-frame” encoding mode) or to other (e.g. previously encoded) blocks in the same frame (using an “intra-frame” encoding mode). The technology described herein relates to the latter, “intra-frame”, mode of encoding.
In an “intra-frame” mode of encoding, there are a number of different ways in which source blocks in a frame may be encoded relative to other (e.g. previously encoded) reference blocks in the same frame. This may be done by encoding the video image data associated with pixels in a block of a frame relative to the (e.g. already encoded) pixels at the edge of a neighbouring block, e.g. by determining the differences (“residuals”) between the data values of the pixels in the block being encoded and those of the pixels at the edge of the neighbouring block. Different neighbouring edges (e.g. to the left of or above the block being encoded), where available, may be used as a basis for performing the encoding. The pixels of these neighbouring edges may be projected into the block being encoded in different directions (e.g. vertically, horizontally or diagonally, where possible).
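By way of illustration only, the projection of neighbouring edge pixels and the resulting residuals may be sketched as follows. This is a minimal sketch and not the encoding defined by any particular standard: the function names `predict_block` and `residuals`, the restriction to purely vertical and horizontal projection, and the 2x2 example values are all assumptions made for the example.

```python
def predict_block(top, left, size, direction):
    """Project neighbouring edge pixels into a size x size block.

    top  -- pixels of the row immediately above the block (hypothetical input)
    left -- pixels of the column immediately to the left of the block
    """
    if direction == "vertical":
        # Every row of the prediction repeats the row of pixels above the block.
        return [[top[x] for x in range(size)] for _ in range(size)]
    if direction == "horizontal":
        # Every column of the prediction repeats the column of pixels to the left.
        return [[left[y]] * size for y in range(size)]
    raise ValueError("unsupported direction: " + direction)


def residuals(block, prediction):
    """Per-pixel differences between the source block and its prediction."""
    return [
        [s - p for s, p in zip(src_row, pred_row)]
        for src_row, pred_row in zip(block, prediction)
    ]


# Illustrative 2x2 source block and its neighbouring edges (made-up values).
block = [[10, 20],
         [11, 21]]
top = [10, 20]
left = [12, 12]

vertical_residuals = residuals(block, predict_block(top, left, 2, "vertical"))
horizontal_residuals = residuals(block, predict_block(top, left, 2, "horizontal"))
```

In this example the vertical projection leaves small residuals because each column of the block closely follows the pixel above it, whereas the horizontal projection leaves larger ones.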
To determine the most cost-effective way in which to encode a frame of video image data (e.g. the way that uses the least amount of data), a number of different ways of encoding the frame may be tested, and their respective “costs” determined. The different encoding options may comprise, for example, projecting the pixels from a neighbouring block in different directions into the block to be encoded in order to calculate the residuals in the block to be encoded, and/or using blocks of different sizes when determining the residuals. Once the costs of the different encoding options have been calculated, the encoding option that, e.g., minimises the overall cost of encoding that part or all of the frame of video image data may be chosen and (that part of) the frame encoded accordingly.
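The selection among candidate encoding options may be sketched as follows. This is only a sketch under stated assumptions: the cost measure used here is the sum of absolute differences (SAD) between the source block and each candidate prediction, chosen purely as a simple stand-in for the amount of data needed to encode the residuals, and the names `sad_cost` and `choose_option` and the example values are hypothetical.

```python
def sad_cost(block, prediction):
    """Sum of absolute differences, used here as an assumed proxy for cost."""
    return sum(
        abs(s - p)
        for src_row, pred_row in zip(block, prediction)
        for s, p in zip(src_row, pred_row)
    )


def choose_option(block, candidates):
    """Return the name of the lowest-cost candidate and all computed costs.

    candidates -- mapping from an option name to its predicted block
    """
    costs = {name: sad_cost(block, pred) for name, pred in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs


# Illustrative 2x2 source block and two candidate predictions (made-up values).
block = [[10, 20],
         [11, 21]]
candidates = {
    "vertical": [[10, 20], [10, 20]],
    "horizontal": [[12, 12], [12, 12]],
}

best, costs = choose_option(block, candidates)
```

Each candidate requires its own pass over the block, so the work grows with the number of options tested.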
As will be appreciated, it can be computationally expensive to calculate multiple cost predictions for multiple different encoding options, such as different block sizes and different projection directions, in order to determine which is the most cost-effective intra-frame encoding scheme to use. The Applicants believe that there remains scope for improved methods of and apparatus for estimating intra-frame encoding costs in video encoding.