Video coding is used to convert an initial video sequence (a set of video images, also called pictures or frames) into a corresponding encoded bitstream (a set of compressed binary video data), and to convert the binary data produced by a video codec system back into a reconstructed video sequence (a decoded set of video images, or reconstructed frames). Most video coding standards aim to provide the highest coding efficiency, that is, the ability to encode a video sequence at the lowest bit rate while maintaining a certain level of video quality.
Most video sequences contain a significant amount of statistical and subjective redundancy within and between pictures, which data compression techniques can reduce to make the encoded sequence smaller. First, the pictures in the video sequence are divided into blocks. The latest standard, High Efficiency Video Coding (HEVC), uses blocks of up to 64×64 pixels and can sub-partition the picture into variable-sized structures. HEVC initially divides a picture into coding tree units (CTUs), which are then divided into coding tree blocks (CTBs) for each luma/chroma component. The CTUs are further divided into coding units (CUs), which are then divided into prediction units (PUs) of either intra-picture or inter-picture prediction type. All modern video standards, including HEVC, use a hybrid approach to video coding that combines inter-/intra-picture prediction with 2D transform coding.
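The partitioning described above can be sketched as follows. This is a minimal illustration, not the HEVC partitioning procedure itself: the function names, the `should_split` callback, and the `min_size` default of 8 are assumptions introduced here, and a real encoder would decide each split from a cost criterion rather than from a caller-supplied rule.

```python
def ctu_origins(width, height, ctu_size=64):
    """Return the top-left (y, x) corner of every CTU covering the picture.

    CTUs on the right and bottom edges may extend past the picture
    when the dimensions are not multiples of ctu_size.
    """
    return [(y, x)
            for y in range(0, height, ctu_size)
            for x in range(0, width, ctu_size)]


def split_quadtree(y, x, size, should_split, min_size=8):
    """Recursively split a square block into four quadrants.

    should_split(y, x, size) is a hypothetical decision callback.
    Returns the leaf blocks as (y, x, size) tuples, which play the
    role of coding units in this sketch.
    """
    if size > min_size and should_split(y, x, size):
        half = size // 2
        leaves = []
        for oy in (0, half):
            for ox in (0, half):
                leaves += split_quadtree(y + oy, x + ox, half,
                                         should_split, min_size)
        return leaves
    return [(y, x, size)]


# Usage: keep splitting only the block that touches the CTU origin.
leaves = split_quadtree(0, 0, 64, lambda y, x, size: y == 0 and x == 0)
```

Whatever split rule is used, the leaves always tile the CTU exactly, so their areas sum to the CTU area.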
Intra-coding treats each picture individually, without reference to any other picture. HEVC specifies 33 directional modes for intra prediction, which use data from previously decoded neighboring prediction blocks. The prediction residual is subjected to a Discrete Cosine Transform (DCT) and transform coefficient quantization.
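The transform and quantization step applied to the prediction residual can be illustrated with a small sketch. It assumes a floating-point orthonormal DCT-II and a single uniform quantization step; practical codecs use integer approximations of the transform and more elaborate quantizers, so the functions and the `step` parameter below are illustrative only.

```python
import math


def dct_1d(v):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform the rows, then the columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def quantize(coeffs, step):
    """Uniform quantization with a hypothetical scalar step size."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


# Usage: a flat residual concentrates its energy in the DC coefficient.
residual = [[4] * 4 for _ in range(4)]
levels = quantize(dct_2d(residual), step=8)
```

For smooth residuals most quantized coefficients become zero, which is what makes the subsequent entropy coding effective.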
Inter-coding exploits redundancy between successive pictures by using motion compensation (MC), which gives a higher compression factor than intra-coding. In the known MC technique, successive pictures are compared, and the shift of an area from one picture to the next is measured to produce motion vectors. Each block has its own motion vector, which applies to the whole block. The vector from the previous picture is coded and vector differences are sent. Any remaining discrepancies are identified by comparing the motion-compensated prediction with the actual picture. The encoder sends the motion vectors and the discrepancies. The decoder performs the inverse process, shifting the previous picture by the vectors and adding the discrepancies to produce the next picture. The quality of a reconstructed video sequence is measured as the total deviation of its pixels from the initial video sequence.
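The motion compensation loop described above can be sketched as a full-search block matcher. This is a simplified illustration under stated assumptions: frames are plain lists of integer rows, the search is restricted to integer-pixel offsets inside the frame, SAD is used as the matching cost, and all function names are introduced here for the example.

```python
def get_block(frame, y, x, size):
    """Extract a size x size block whose top-left corner is (y, x)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def motion_search(prev, cur, y, x, size, search_range):
    """Full search: return the (dy, dx) offset into prev with minimum SAD."""
    h, w = len(prev), len(prev[0])
    target = get_block(cur, y, x, size)
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ry, rx = y + dy, x + dx
            # Skip candidates that fall outside the reference frame.
            if ry < 0 or rx < 0 or ry + size > h or rx + size > w:
                continue
            cost = sad(target, get_block(prev, ry, rx, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2]


def residual(prev, cur, y, x, size, mv):
    """Encoder side: discrepancy between the block and its prediction."""
    ref = get_block(prev, y + mv[0], x + mv[1], size)
    tgt = get_block(cur, y, x, size)
    return [[t - r for t, r in zip(rt, rr)] for rt, rr in zip(tgt, ref)]


def reconstruct(prev, y, x, size, mv, res):
    """Decoder side: shift the previous picture and add the discrepancy."""
    ref = get_block(prev, y + mv[0], x + mv[1], size)
    return [[r + d for r, d in zip(rr, rd)] for rr, rd in zip(ref, res)]
```

With a lossless residual, `reconstruct` returns exactly the current block; in a real codec the residual is transformed and quantized first, so the reconstruction deviates from the original and that deviation is what the quality measure captures.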
In common video coding standards such as H.264 and HEVC (High Efficiency Video Coding), intra predictions for texture blocks include angular (directional) intra predictions and non-angular intra predictions (usually the DC and Planar prediction modes). Angular intra prediction modes propagate the pixel data of the neighboring blocks into the block interior at a certain angle. Because the number of possible intra prediction angles is large (e.g., 33 in the HEVC specification), choosing the optimal intra prediction may become very complex: the simplest way to select the intra prediction mode is to calculate all possible intra predictions and choose the best one by the SAD (Sum of Absolute Differences), Hadamard SAD, or RD (Rate-Distortion) optimization criterion.
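The exhaustive selection just described can be sketched with a reduced mode set. The example assumes only three simplified predictors (DC, vertical, and horizontal) and SAD as the criterion; it omits the Planar mode, the remaining angular modes, and any reference-sample filtering, and the names `MODES` and `select_mode_exhaustive` are hypothetical.

```python
def predict_dc(top, left, size):
    """Fill the block with the rounded mean of the neighboring pixels."""
    dc = (sum(top) + sum(left) + size) // (2 * size)
    return [[dc] * size for _ in range(size)]


def predict_vertical(top, left, size):
    """Propagate the row of pixels above the block straight down."""
    return [list(top) for _ in range(size)]


def predict_horizontal(top, left, size):
    """Propagate the column of pixels left of the block to the right."""
    return [[left[y]] * size for y in range(size)]


# Hypothetical reduced mode table; a real encoder has many more entries.
MODES = {
    "DC": predict_dc,
    "VERTICAL": predict_vertical,
    "HORIZONTAL": predict_horizontal,
}


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def select_mode_exhaustive(block, top, left):
    """Evaluate every mode and return the cheapest one with all costs."""
    size = len(block)
    costs = {name: sad(block, fn(top, left, size))
             for name, fn in MODES.items()}
    best = min(costs, key=costs.get)
    return best, costs


# Usage: a block of vertical stripes that continues the row above it.
top = [10, 20, 30, 40]
left = [10, 10, 10, 10]
block = [list(top) for _ in range(4)]
best_mode, costs = select_mode_exhaustive(block, top, left)
```

The cost of this approach is one full prediction and one cost evaluation per mode, which is why it scales poorly as the mode table grows.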
However, the computational complexity of this exhaustive search grows with the number of possible prediction angles. An efficient intra prediction selection procedure that avoids an exhaustive search is therefore important in video encoding algorithms. Moreover, modern block-based video coding standards admit a large variety of coding methods and parameters for the formation and coding of each texture block, so the encoder must select an optimal coding mode and optimal encoding parameters from among them.
The HEVC coding standard also increases the complexity of motion estimation: the large target resolution requires high memory bandwidth; large blocks (up to 64×64) require a large local memory; the 8-tap interpolation filter makes the sub-pixel search highly complex; and the ½ and ¾ non-square block subdivisions require complex mode selection.