Many applications for video coding currently exist, including applications for transmission and storage of video data. Many video coding standards have also been developed and others are currently in development. Recent developments in video coding standardisation have led to the formation of a group called the “Joint Collaborative Team on Video Coding” (JCT-VC). The Joint Collaborative Team on Video Coding (JCT-VC) includes members of Study Group 16, Question 6 (SG16/Q6) of the Telecommunication Standardisation Sector (ITU-T) of the International Telecommunication Union (ITU), known as the Video Coding Experts Group (VCEG), and members of the International Organisation for Standardisation/International Electrotechnical Commission Joint Technical Committee 1/Subcommittee 29/Working Group 11 (ISO/IEC JTC1/SC29/WG11), also known as the Moving Picture Experts Group (MPEG).
The Joint Collaborative Team on Video Coding (JCT-VC) has the goal of producing a new video coding standard to significantly outperform a presently existing video coding standard, known as “H.264/MPEG-4 AVC”. The H.264/MPEG-4 AVC standard is itself a large improvement on previous video coding standards, such as MPEG-4 and ITU-T H.263. The new video coding standard under development has been named “high efficiency video coding (HEVC)”. The Joint Collaborative Team on Video Coding (JCT-VC) is also considering implementation challenges arising from technology proposed for high efficiency video coding (HEVC) that create difficulties when scaling implementations of the standard to operate at high resolutions or high frame rates.
One area of the H.264/MPEG-4 AVC video coding standard that presents difficulties for achieving high compression efficiency is the coding of residual coefficients used to represent video data. Video data is formed by a sequence of frames, with each frame having a two-dimensional array of samples. Typically, frames include one luminance and two chrominance channels. Each frame is decomposed into one or more slices. Each slice contains one or more largest coding units (LCUs). The largest coding units (LCUs) have a fixed size, with edge dimensions being a power of two and having equal width and height, such as 64 luma samples. One feature of the high efficiency video coding (HEVC) standard under development is “fine granularity slices”. When the fine granularity slices feature is enabled, slice boundaries are not restricted to the largest coding unit (LCU) boundaries. Fine granularity slices may be enabled at a bitstream level.
A coding tree enables the subdivision of each largest coding unit (LCU) into four equally-sized regions, each having half the width and height of a parent largest coding unit (LCU). Each of the regions may be further subdivided into four equally-sized regions. Where a region is not further sub-divided, a coding unit exists, occupying the entirety of the region. Such a subdivision process may be applied recursively until a smallest coding unit (SCU) size is reached, at which point a coding unit (CU) the size of the smallest coding unit (SCU) is inferred. The recursive subdivision of a largest coding unit into a hierarchy of coding units has a quadtree structure and is referred to as the coding tree. Coding units (CUs) or regions have a property known as their ‘depth’, which refers to their position in the coding tree in terms of the level in the hierarchy of subdivisions. This subdivision process is encoded in the bitstream as a sequence of arithmetically coded flags. When the fine granularity slices feature is enabled, a threshold is specified which determines the smallest size of coding unit at which a slice boundary may exist.
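The recursive subdivision described above can be illustrated with a minimal sketch. It assumes the split flags have already been arithmetically decoded into a plain sequence of 0/1 values, and that the four regions are visited in top-left, top-right, bottom-left, bottom-right order. The function name `parse_coding_tree` and the tuple layout are hypothetical, chosen only for illustration.

```python
from typing import Iterator, List, Tuple

# (x, y, size, depth) of a coding unit; a hypothetical layout for illustration
CodingUnit = Tuple[int, int, int, int]


def parse_coding_tree(flags: Iterator[int], x: int, y: int, size: int,
                      scu_size: int, depth: int = 0) -> List[CodingUnit]:
    """Return the leaf coding units of one region of the coding tree."""
    # At the SCU size no flag is consumed: the coding unit is inferred.
    if size > scu_size and next(flags) == 1:
        half = size // 2
        units: List[CodingUnit] = []
        # Four equally-sized regions, each half the width and height of the parent.
        for dy in (0, half):
            for dx in (0, half):
                units += parse_coding_tree(flags, x + dx, y + dy,
                                           half, scu_size, depth + 1)
        return units
    # Region not further subdivided: a coding unit occupies the whole region.
    return [(x, y, size, depth)]


def demo_coding_tree() -> List[CodingUnit]:
    # Split the 64x64 LCU once, then split only its top-right 32x32 region.
    flags = iter([1, 0, 1, 0, 0, 0, 0, 0, 0])
    return parse_coding_tree(flags, 0, 0, 64, 8)
```

In this example the largest coding unit yields seven coding units: three of size 32 at depth 1 and four of size 16 at depth 2, which together cover the full 64x64 area.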
A set of coding units exists in the coding tree that are not further sub-divided, being those coding units that occupy the leaf nodes of the coding tree. Transform trees exist at these coding units. A transform tree may further decompose a coding unit using a quadtree structure as used for the coding tree. At the leaf nodes of the transform tree, residual data is encoded using transform units (TUs). In contrast to the coding tree, the transform tree may subdivide coding units into transform units having a non-square shape. Further, the transform tree structure does not require that transform units (TUs) occupy all of the area provided by the parent coding unit.
Each coding unit at the leaf nodes of the coding tree is subdivided into one or more arrays of predicted data samples, each known as a prediction unit (PU). Each prediction unit (PU) contains a prediction of a portion of the input frame data, derived by applying an intra-prediction process or an inter-prediction process. Several methods may be used for coding prediction units (PUs) within a coding unit (CU). A single prediction unit (PU) may occupy an entire area of the coding unit (CU), or the coding unit (CU) may be split into two equal-sized rectangular prediction units (PUs), either horizontally or vertically. Additionally, the coding unit (CU) may be split into four equal-sized square prediction units (PUs).
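The prediction unit arrangements listed above can be sketched as a function that returns the rectangles covering a square coding unit. The mode labels "2Nx2N", "2NxN", "Nx2N" and "NxN" are the names conventionally used for these partitions and do not appear in the text above; the function name and the `(x, y, width, height)` tuple layout are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height), relative to the CU


def prediction_units(size: int, mode: str) -> List[Rect]:
    """Return the PU rectangles for a square CU of the given size."""
    half = size // 2
    if mode == "2Nx2N":   # a single PU occupying the entire CU
        return [(0, 0, size, size)]
    if mode == "2NxN":    # two full-width PUs, one above the other
        return [(0, 0, size, half), (0, half, size, half)]
    if mode == "Nx2N":    # two full-height PUs, side by side
        return [(0, 0, half, size), (half, 0, half, size)]
    if mode == "NxN":     # four equal-sized square PUs
        return [(x, y, half, half) for y in (0, half) for x in (0, half)]
    raise ValueError(f"unknown partition mode: {mode}")
```

For each mode the returned rectangles tile the coding unit exactly, so the areas always sum to `size * size`.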
A video encoder compresses the video data into a bitstream by converting the video data into a sequence of syntax elements. A context adaptive binary arithmetic coding (CABAC) scheme is defined within the high efficiency video coding (HEVC) standard under development, using an arithmetic coding scheme identical to that defined in the H.264/MPEG-4 AVC video compression standard. In the high efficiency video coding (HEVC) standard under development, when context adaptive binary arithmetic coding (CABAC) is in use, each syntax element is expressed as a sequence of bins, where the bins are selected from a set of available bins. The set of available bins is obtained from a context model, with one context per bin. Each context holds a likely bin value (the ‘valMPS’) and a probability state for the arithmetic encoding or arithmetic decoding operation. Note that bins may also be bypass coded, where there is no association with a context. Bypass coded bins consume one bit in the bitstream and therefore are suited to bins with equal probability of being one-valued or zero-valued. Creating such a sequence of bins from a syntax element is known as “binarising” the syntax element.
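As a rough illustration of binarisation and of the difference between context coded and bypass coded bins, the sketch below uses a unary binarisation and a toy context. It is not the CABAC state machine: the floating-point probability estimate and the `rate` parameter are assumptions standing in for the table-driven probability state, and the ideal cost of -log2(p) bits is only an approximation of what an arithmetic coder spends per bin. All names are hypothetical.

```python
import math
from typing import List


def unary_binarise(value: int) -> List[int]:
    """Unary binarisation: `value` one-valued bins, then a terminating zero."""
    return [1] * value + [0]


def bypass_cost() -> float:
    """A bypass coded bin consumes one bit, with no context involved."""
    return 1.0


class Context:
    """Toy context: a likely bin value (valMPS) and a probability estimate."""

    def __init__(self) -> None:
        self.val_mps = 0   # most probable bin value
        self.p_mps = 0.5   # estimated probability of val_mps (assumption)

    def cost_and_update(self, bin_value: int, rate: float = 0.05) -> float:
        """Return the ideal cost in bits of this bin, then adapt the context."""
        is_mps = bin_value == self.val_mps
        p = self.p_mps if is_mps else 1.0 - self.p_mps
        bits = -math.log2(p)
        if is_mps:
            # Move towards certainty, capped so the other value stays possible.
            self.p_mps = min(self.p_mps + rate * (1.0 - self.p_mps), 0.99)
        else:
            self.p_mps -= rate * self.p_mps
            if self.p_mps < 0.5:
                # The other bin value has become the more likely one.
                self.val_mps = 1 - self.val_mps
                self.p_mps = 1.0 - self.p_mps
        return bits
```

With a freshly initialised context the first bin costs one bit, the same as a bypass coded bin. When the same bin value repeats, the estimated cost falls below one bit, which is why context coding suits bins with skewed statistics and bypass coding suits bins that are equally likely to be zero or one.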
In a video encoder or video decoder, as separate context information is available for each bin, context selection for bins provides a means to improve coding efficiency. In particular, coding efficiency may be improved by selecting a particular bin such that statistical properties from previous instances of the bin, where the associated context information was used, correlate with statistical properties of a current instance of the bin. Such context selection frequently utilises spatially local information to determine the optimal context.
In the high efficiency video coding (HEVC) standard under development and in H.264/MPEG-4 AVC, a prediction for a current block is derived, based on reference sample data either from other frames, or from neighbouring regions within the current frame that have been previously decoded. The difference between the prediction and the desired sample data is known as the residual. A frequency domain representation of the residual is a two-dimensional array of residual coefficients. By convention, the upper-left corner of the two-dimensional array contains residual coefficients representing low-frequency information.
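The relationship between prediction, residual and residual coefficients can be sketched as follows. The floating-point two-dimensional DCT-II used here is an assumption chosen for illustration; the text above does not name a particular transform, and practical codecs use integer approximations. The function names are hypothetical.

```python
import math
from typing import List

Block = List[List[float]]


def residual(original: Block, prediction: Block) -> Block:
    """Sample-wise difference between the desired data and the prediction."""
    return [[o - p for o, p in zip(o_row, p_row)]
            for o_row, p_row in zip(original, prediction)]


def dct_2d(block: Block) -> Block:
    """Orthonormal 2D DCT-II of a square block; result is indexed [v][u]."""
    n = len(block)

    def scale(k: int) -> float:
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    def coeff(u: int, v: int) -> float:
        total = 0.0
        for y in range(n):
            for x in range(n):
                total += (block[y][x]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
        return scale(u) * scale(v) * total

    # Row index v is vertical frequency, column index u is horizontal frequency.
    return [[coeff(u, v) for u in range(n)] for v in range(n)]


def demo_residual_coefficients() -> Block:
    original = [[10.0] * 4 for _ in range(4)]
    prediction = [[8.0] * 4 for _ in range(4)]
    return dct_2d(residual(original, prediction))
```

In this example the residual is a constant 2 across the 4x4 block, so all of its energy lands in the upper-left coefficient and the remaining coefficients are zero to within floating-point rounding.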
One aspect of throughput of the high efficiency video coding (HEVC) standard under development relates to the ability to encode or decode video data at high bit-rates. The context adaptive binary arithmetic coding (CABAC) scheme employed in the high efficiency video coding (HEVC) standard under development supports an ‘equal probability’ mode of operation referred to as ‘bypass coding’. In this mode, the bin is not associated with a context from the context model, and so there is no context model update step. In this mode, it is possible to read multiple adjacent bins from the bitstream in parallel, provided each bin is bypass coded, which increases throughput. For example, hardware implementations may write/read groups of adjacent bypass coded data in parallel to increase the throughput of encoding/decoding the bitstream.
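The throughput benefit of grouping adjacent bypass coded bins can be illustrated with a simplified model in which each bypass bin is treated as one raw bit in the bitstream. This is an assumption made for clarity: an actual arithmetic decoder also maintains its own range and offset state, which is not modelled here. The class and function names are hypothetical.

```python
from typing import List


class BitReader:
    """Minimal most-significant-bit-first reader over a byte string."""

    def __init__(self, data: bytes) -> None:
        self.value = int.from_bytes(data, "big")
        self.remaining = len(data) * 8

    def read_bits(self, n: int) -> int:
        if n > self.remaining:
            raise EOFError("not enough bits left in the bitstream")
        self.remaining -= n
        return (self.value >> self.remaining) & ((1 << n) - 1)


def read_bypass_serial(reader: BitReader, n: int) -> List[int]:
    """Read n bypass bins one at a time: n separate read operations."""
    return [reader.read_bits(1) for _ in range(n)]


def read_bypass_group(reader: BitReader, n: int) -> List[int]:
    """Read n adjacent bypass bins with one read, then unpack them."""
    word = reader.read_bits(n)
    return [(word >> (n - 1 - i)) & 1 for i in range(n)]
```

Both functions return the same bins for the same input, for example `[1, 0, 1, 1, 0, 0, 1, 0]` for the single byte `0b10110010`. The grouped version is possible only because no context update separates one bypass bin from the next.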