ITU-T VCEG (Q6/16) and ISO/IEC MPEG (JTC 1/SC 29/WG 11) published the H.265/HEVC (High Efficiency Video Coding) standard in 2013 (version 1), 2014 (version 2), 2015 (version 3), and 2016 (version 4). Since then, they have been studying the potential need for standardization of future video coding technology with a compression capability that significantly exceeds that of the HEVC standard (including its extensions). The groups are working together on this exploration activity in a joint collaboration effort known as the Joint Video Exploration Team (JVET) to evaluate compression technology designs proposed by their experts in this area. JVET has developed a Joint Exploration Model (JEM) to explore video coding technologies beyond the capability of HEVC, and the current version of JEM is JEM-7.0. Because the JEM software has shown significant improvement over the HEVC reference software HM, a joint Call for Proposals on video compression with capability beyond HEVC was issued in October 2017. A new generation of video coding standard is under development.
In HEVC, a coding tree unit (CTU) is split into coding units (CUs) by using a quadtree structure, denoted as the coding tree, to adapt to various local characteristics. The decision whether to code a picture area using inter-picture (temporal) or intra-picture (spatial) prediction is made at the CU level. Each CU can be further split into one, two or four prediction units (PUs) according to the PU splitting type. Inside one PU, the same prediction process is applied and the relevant information is transmitted to the decoder on a PU basis. After obtaining the residual block by applying the prediction process based on the PU splitting type, a CU can be partitioned into transform units (TUs) according to another quadtree structure similar to the coding tree for the CU. One key feature of the HEVC structure is that it has multiple partition concepts, including CU, PU, and TU. In HEVC, a CU or a TU can only be square, while a PU may be square or rectangular for an inter predicted block. In a later stage of HEVC development, some contributions proposed allowing rectangular PUs for intra prediction and transform. These proposals were not adopted into HEVC but were extended for use in JEM.
At a picture boundary, HEVC imposes an implicit quad-tree split, so that a block keeps being quad-tree split until its size fits within the picture boundary.
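The implicit boundary split can be sketched as a short recursion. This is a minimal illustration, not the HEVC reference implementation: the function name `boundary_leaf_blocks`, the `min_size` parameter, and the choice to drop blocks lying entirely outside the picture are assumptions made for the example. It only models the forced splits, not any further splits an encoder might choose.

```python
def boundary_leaf_blocks(x, y, size, pic_w, pic_h, min_size=8):
    """Return (x, y, size) blocks left after forced quad-tree splitting.

    Hypothetical helper: a square block at (x, y) is split into four
    quadrants until it fits inside a pic_w x pic_h picture.
    """
    if x >= pic_w or y >= pic_h:
        return []  # block lies entirely outside the picture
    fits = x + size <= pic_w and y + size <= pic_h
    if fits or size <= min_size:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += boundary_leaf_blocks(x + dx, y + dy, half,
                                           pic_w, pic_h, min_size)
    return leaves


# A 64x64 CTU at x=64 in a 96x64 picture is forced down to two 32x32 blocks.
print(boundary_leaf_blocks(64, 0, 64, pic_w=96, pic_h=64))
```

With a picture width of 72 instead of 96, the same CTU would be forced down to 8×8 blocks, which shows how quickly forced quad-tree splitting fragments a block near the boundary.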
Inspired by previous work, a Quad-tree-Binary-tree (QTBT) structure was developed to unify the concepts of the CU, PU and TU and to support more flexibility for CU partition shapes. In the QTBT block structure, a CU can have either a square or rectangular shape. As shown in FIG. 1, a coding tree unit (CTU) is first partitioned by a quadtree structure. The quadtree leaf nodes are further partitioned by a binary tree structure. There are two splitting types in the binary tree splitting: symmetric horizontal splitting and symmetric vertical splitting. The binary tree leaf nodes are called coding units (CUs), and that segmentation is used for prediction and transform processing without any further partitioning. This means that the CU, PU and TU have the same block size in the QTBT coding block structure. In JEM, a CU sometimes consists of coding blocks (CBs) of different colour components, e.g., one CU contains one luma CB and two chroma CBs in the case of P and B slices of the 4:2:0 chroma format, and sometimes consists of a CB of a single component, e.g., one CU contains only one luma CB or just two chroma CBs in the case of I slices.
The following parameters are defined for the QTBT partitioning scheme.
CTU size: the root node size of a quadtree, the same concept as in HEVC
MaxQTDepth: the maximum allowed quad-tree depth
MinQTSize: the minimum allowed quadtree leaf node size
MaxBTSize: the maximum allowed binary tree root node size
MaxBTDepth: the maximum allowed binary tree depth
MinBTSize: the minimum allowed binary tree leaf node size
In one example of the QTBT partitioning structure, the CTU size is set as 128×128 luma samples with two corresponding 64×64 blocks of chroma samples, the MinQTSize is set as 16×16, the MaxBTSize is set as 64×64, the MinBTSize (for both width and height) is set as 4, and the MaxBTDepth is set as 4. The quadtree partitioning is applied to the CTU first to generate quadtree leaf nodes. The quadtree leaf nodes may have a size from 16×16 (i.e., the MinQTSize) to 128×128 (i.e., the CTU size). If the leaf quadtree node is 128×128, it will not be further split by the binary tree since the size exceeds the MaxBTSize (i.e., 64×64). Otherwise, the leaf quadtree node could be further partitioned by the binary tree. Therefore, the quadtree leaf node is also the root node for the binary tree, and its binary tree depth is 0. When the binary tree depth reaches MaxBTDepth (i.e., 4), no further splitting is considered. When the binary tree node has width equal to MinBTSize (i.e., 4), no further horizontal splitting is considered. Similarly, when the binary tree node has height equal to MinBTSize, no further vertical splitting is considered. The leaf nodes of the binary tree are further processed by prediction and transform processing without any further partitioning. In the JEM, the maximum CTU size is 256×256 luma samples.
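The constraints in this example can be collected into one small function that lists which splits remain available for a node. This is a sketch under stated assumptions, not JEM code: the names `QTBTParams` and `allowed_splits` are hypothetical, MaxQTDepth is omitted because the example gives no value for it, and the binary splits are named after the dimension they halve rather than as horizontal or vertical. The rule that a node already inside the binary tree cannot be quadtree split is taken from the structure described above, where the binary tree is rooted at quadtree leaves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QTBTParams:
    """Parameter values taken from the example in the text."""
    ctu_size: int = 128
    min_qt_size: int = 16
    max_bt_size: int = 64
    max_bt_depth: int = 4
    min_bt_size: int = 4


def allowed_splits(width, height, bt_depth, in_binary_tree, p=QTBTParams()):
    """Return the set of splits still permitted for a width x height node."""
    splits = set()
    # Quadtree splitting applies only to square nodes above MinQTSize that
    # have not yet entered the binary tree.
    if not in_binary_tree and width == height and width > p.min_qt_size:
        splits.add("QT")
    # Binary splitting needs a node no larger than MaxBTSize and a depth
    # below MaxBTDepth.
    if max(width, height) <= p.max_bt_size and bt_depth < p.max_bt_depth:
        if width > p.min_bt_size:
            splits.add("BT_HALVE_WIDTH")
        if height > p.min_bt_size:
            splits.add("BT_HALVE_HEIGHT")
    return splits


print(allowed_splits(128, 128, 0, in_binary_tree=False))  # quadtree only
print(allowed_splits(4, 8, 2, in_binary_tree=True))       # height can still be halved
```

Returning a set of permitted splits, instead of performing a split, mirrors how an encoder search would use such a check: it enumerates the candidates and evaluates each one.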
FIG. 1 (left) illustrates an example of block partitioning by using QTBT, and FIG. 1 (right) illustrates the corresponding tree representation. The solid lines indicate quadtree splitting and dotted lines indicate binary tree splitting. In each splitting (i.e., non-leaf) node of the binary tree, one flag is signalled to indicate which splitting type (i.e., horizontal or vertical) is used, where 0 indicates horizontal splitting and 1 indicates vertical splitting. For the quadtree splitting, there is no need to indicate the splitting type since quadtree splitting always splits a block both horizontally and vertically to produce 4 sub-blocks with an equal size.
In addition, the QTBT scheme supports the ability for the luma and chroma to have a separate QTBT structure. Currently, for P and B slices, the luma and chroma CTBs in one CTU share the same QTBT structure. However, for I slices, the luma CTB is partitioned into CUs by a QTBT structure, and the chroma CTBs are partitioned into chroma CUs by another QTBT structure. This means that a CU in an I slice consists of a coding block of the luma component or coding blocks of two chroma components, and a CU in a P or B slice consists of coding blocks of all three colour components.
In HEVC, inter prediction for small blocks is restricted to reduce the memory access of motion compensation, such that bi-prediction is not supported for 4×8 and 8×4 blocks, and inter prediction is not supported for 4×4 blocks. In the QTBT of the JEM, these restrictions are removed.
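The small-block restriction can be written as a lookup. The function names and the `"uni"`/`"bi"` labels below are hypothetical and chosen only for illustration; the rules themselves are the ones stated in the paragraph above.

```python
def hevc_allowed_inter_prediction(width, height):
    """Inter prediction types HEVC permits for a block of this size."""
    if (width, height) == (4, 4):
        return set()            # no inter prediction for 4x4 blocks
    if (width, height) in {(4, 8), (8, 4)}:
        return {"uni"}          # bi-prediction not supported
    return {"uni", "bi"}


def jem_qtbt_allowed_inter_prediction(width, height):
    """In the QTBT of the JEM the size restrictions are removed."""
    return {"uni", "bi"}


print(hevc_allowed_inter_prediction(8, 4))
print(jem_qtbt_allowed_inter_prediction(4, 4))
```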
The multi-type-tree (MTT) structure is a more flexible tree structure than QTBT. In MTT, tree types other than quad-tree (QT) and binary-tree (BT) are supported. Vertical and horizontal center-side ternary trees (TT) are introduced, as shown in FIG. 2(d) and FIG. 2(e), respectively.
FIG. 2(a) illustrates an example of a quad-tree partitioning. FIG. 2(b) illustrates an example of a vertical binary-tree partitioning. FIG. 2(c) illustrates an example of a horizontal binary-tree partitioning. FIG. 2(d) illustrates an example of a vertical center-side ternary tree partitioning. FIG. 2(e) illustrates an example of a horizontal center-side ternary tree partitioning.
There are two levels of trees: the region tree (quad-tree) and the prediction tree (binary-tree or ternary tree). A CTU is first partitioned by the region tree (RT). An RT leaf may be further split with the prediction tree (PT). A PT node may also be further split with PT until the maximum PT depth is reached. After entering PT, RT (quad-tree) cannot be used anymore. A PT leaf is the basic coding unit. It is still called a CU for convenience. A CU cannot be further split. Prediction and transform are both applied to the CU in the same way as in JEM-3 or QTBT.
Benefits of ternary tree partitioning may include that, as a complement to quad-tree and binary-tree partitioning, ternary tree partitioning can capture objects located at the block center, whereas quad-tree and binary-tree partitioning always split along the block center. Also, the width and height of the partitions of the proposed ternary trees are always powers of 2, so that no additional transforms are needed.
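A short sketch makes the power-of-2 property concrete. It assumes the usual center-side ratio of 1/4, 1/2, 1/4, which the text does not state explicitly, and it assumes that a "vertical" split divides the width (vertical split lines) while a "horizontal" split divides the height. The function names are hypothetical.

```python
def ternary_split(width, height, direction):
    """Return (width, height) of the three center-side ternary partitions."""
    if direction == "vertical":      # assumed: vertical lines divide the width
        q = width // 4
        return [(q, height), (2 * q, height), (q, height)]
    if direction == "horizontal":    # assumed: horizontal lines divide the height
        q = height // 4
        return [(width, q), (width, 2 * q), (width, q)]
    raise ValueError("direction must be 'vertical' or 'horizontal'")


def is_power_of_two(n):
    return n > 0 and n & (n - 1) == 0


parts = ternary_split(32, 16, "vertical")
print(parts)
print(all(is_power_of_two(w) and is_power_of_two(h) for w, h in parts))
```

Because the side partitions are a quarter and the center partition is a half of a power-of-2 dimension, every resulting dimension is again a power of 2, and the center partition covers the middle of the block without a split line through it.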
The design of the two-level tree is mainly motivated by complexity reduction. Theoretically, the complexity of traversing a tree is T^D, where T denotes the number of split types and D is the depth of the tree. By using a two-level tree and restricting the first level to quad-tree only (which reduces T at certain levels), the complexity is reduced considerably while reasonable performance is maintained.
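The effect of restricting the split types at some levels can be seen with a crude count under the same model (T to the power D), where the number of split types available at each level is multiplied together. The numbers are illustrative assumptions only: five split types (QT, two BT, two TT) for an unrestricted tree of depth 4, against a two-level tree whose first two levels allow only QT and whose last two levels allow the four PT types.

```python
from math import prod


def traversal_cost(split_types_per_level):
    """Product of the number of split types allowed at each tree level."""
    return prod(split_types_per_level)


unrestricted = traversal_cost([5, 5, 5, 5])   # T = 5 at every level, D = 4
two_level = traversal_cost([1, 1, 4, 4])      # QT only, then BT/TT only

print(unrestricted, two_level)
```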
To further improve the coding efficiency on top of QTBT, an asymmetric binary tree (ABT) is proposed. As shown in FIG. 3, a coding unit with size S is divided into two sub-CUs with sizes S/4 and 3×S/4, either in the horizontal or in the vertical direction. In practice, the added available CU sizes are 12 and 24. In a further extended version of the tool, CU sizes 6 and 48 may be allowed.
One major issue with this method is that it is inconvenient when the width or height of a block is not a power of 2. For example, transforms with sizes such as 12 and 24 need to be supported. Special handling may also be needed when splitting a block whose width or height is not a power of 2.
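The arithmetic behind the new sizes is easy to check. The sketch below uses a hypothetical helper `abt_split` and assumes the parent size S is a multiple of 4; it shows which parent sizes produce the sizes 12, 24, 6 and 48 mentioned for ABT.

```python
def abt_split(size):
    """Split a dimension S asymmetrically into (S/4, 3*S/4)."""
    if size % 4:
        raise ValueError("size must be a multiple of 4")
    return size // 4, 3 * size // 4


def is_power_of_two(n):
    return n > 0 and n & (n - 1) == 0


for parent in (8, 16, 32, 64):
    small, large = abt_split(parent)
    # The 3*S/4 part is the one that is not a power of 2.
    print(parent, small, large, is_power_of_two(large))
```

In every case the S/4 part stays a power of 2 while the 3×S/4 part does not, which is why the additional transform sizes are needed.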
Using a SplitToSquare tree type, a block is split into the largest same-size square sub-blocks. That is, if the input block is a rectangular block with a size of 2^M×2^N (M≠N), SplitToSquare yields 2^(M+N−2×min(M,N)) sub-blocks, each of size 2^min(M,N)×2^min(M,N). If the input block is a square block, SplitToSquare produces four same-size square sub-blocks, which is the same as the quad-tree split. SplitToSquare may therefore be used to replace the quad-tree split, as it covers more cases.
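The sub-block count can be verified with a minimal sketch. The function `split_to_square` is a hypothetical illustration that assumes power-of-2 block dimensions; it returns the top-left position and side length of each square sub-block.

```python
def split_to_square(width, height):
    """Split a block into the largest same-size square sub-blocks.

    Returns a list of (x, y, side) tuples. A square input is split into
    four quadrants, matching the quad-tree split.
    """
    side = min(width, height)
    if width == height:
        side //= 2
    return [(x, y, side)
            for y in range(0, height, side)
            for x in range(0, width, side)]


# 2^5 x 2^3 block: expect 2^(5+3-2*3) = 4 sub-blocks of size 8x8.
blocks = split_to_square(32, 8)
print(len(blocks), blocks[0][2])
```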