ITU-T VCEG (Q6/16) and ISO/IEC MPEG (JTC 1/SC 29/WG 11) published the H.265/HEVC (High Efficiency Video Coding) standard in 2013 (version 1), 2014 (version 2), 2015 (version 3), and 2016 (version 4). Since then, they have been studying the potential need for standardization of future video coding technology with a compression capability that significantly exceeds that of the HEVC standard (including its extensions). The groups are working together on this exploration activity in a joint collaboration effort known as the Joint Video Exploration Team (JVET) to evaluate compression technology designs proposed by their experts in this area. JVET has developed a Joint Exploration Model (JEM) to explore video coding technologies beyond the capability of HEVC, and the latest version of JEM is JEM-7.0.
In HEVC, a coding tree unit (CTU) is split into coding units (CUs) by using a quadtree structure, denoted as the coding tree, to adapt to various local characteristics. The decision whether to code a picture area using inter-picture (temporal) or intra-picture (spatial) prediction is made at the CU level. Each CU can be further split into one, two, or four prediction units (PUs) according to the PU splitting type. Inside one PU, the same prediction process is applied and the relevant information is transmitted to the decoder on a PU basis. After the residual block is obtained by applying the prediction process based on the PU splitting type, a CU can be partitioned into transform units (TUs) according to another quadtree structure similar to the coding tree for the CU. One of the key features of the HEVC structure is that it has multiple partition concepts, including the CU, PU, and TU. In HEVC, a CU or a TU can only be square, while a PU may be square or rectangular for an inter-predicted block. In later stages of HEVC, some contributions proposed allowing rectangular PUs for intra prediction and transform. These proposals were not adopted into HEVC but were extended for use in JEM. At the picture boundary, HEVC imposes an implicit quadtree split, so that a block keeps being split by the quadtree until its size fits the picture boundary.
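The implicit split at the picture boundary can be sketched as a short recursion. This is a minimal illustration, not the normative HEVC process: the function name `implicit_boundary_split`, the `min_size` parameter, and its default of 8 are assumptions made for the example. It only shows the forced splits; an encoder may still split the resulting blocks further.

```python
def implicit_boundary_split(x, y, size, pic_w, pic_h, min_size=8):
    """Return (x, y, size) squares covering the part of a block inside the picture.

    Hypothetical helper: a block that crosses the picture boundary is split
    into four quadrants, recursively, until every kept block fits.
    """
    if x >= pic_w or y >= pic_h:
        return []  # quadrant lies entirely outside the picture: nothing to code
    fits = x + size <= pic_w and y + size <= pic_h
    if fits or size <= min_size:
        return [(x, y, size)]  # no forced split needed (or minimum size reached)
    half = size // 2
    blocks = []
    for dy in (0, half):
        for dx in (0, half):
            blocks += implicit_boundary_split(
                x + dx, y + dy, half, pic_w, pic_h, min_size
            )
    return blocks


# A 64x64 block in the last CTU row of a 1920x1080 picture: only 56 of its
# 64 sample rows lie inside the picture, so it is split implicitly.
blocks = implicit_boundary_split(0, 1024, 64, 1920, 1080)
covered = sum(s * s for _, _, s in blocks)
print(len(blocks), covered == 64 * 56)
```

The recursion terminates here because the picture dimensions in the example are multiples of the assumed minimum block size.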
Inspired by previous works, a Quad-tree-Binary-tree (QTBT) structure was developed that unifies the concepts of the CU, PU, and TU and supports more flexible CU partition shapes. In the QTBT block structure, a CU can have either a square or rectangular shape. A coding tree unit (CTU) is first partitioned by a quadtree structure. The quadtree leaf nodes are further partitioned by a binary tree structure. There are two splitting types in the binary tree splitting: symmetric horizontal splitting and symmetric vertical splitting. The binary tree leaf nodes are called coding units (CUs), and that segmentation is used for prediction and transform processing without any further partitioning. This means that the CU, PU, and TU have the same block size in the QTBT coding block structure. In JEM, a CU sometimes consists of coding blocks (CBs) of different colour components, e.g., one CU contains one luma CB and two chroma CBs in the case of P and B slices of the 4:2:0 chroma format, and sometimes consists of a CB of a single component, e.g., one CU contains only one luma CB or just two chroma CBs in the case of I slices.
The following parameters are defined for the QTBT partitioning scheme:
- CTU size: the root node size of a quadtree, the same concept as in HEVC
- MaxQTDepth: the maximum allowed quadtree depth
- MinQTSize: the minimum allowed quadtree leaf node size
- MaxBTSize: the maximum allowed binary tree root node size
- MaxBTDepth: the maximum allowed binary tree depth
- MinBTSize: the minimum allowed binary tree leaf node size
In one example of the QTBT partitioning structure, the CTU size is set to 128×128 luma samples with two corresponding 64×64 blocks of chroma samples, the MinQTSize is set to 16×16, the MaxBTSize is set to 64×64, the MinBTSize (for both width and height) is set to 4×4, and the MaxBTDepth is set to 4. The quadtree partitioning is applied to the CTU first to generate quadtree leaf nodes. The quadtree leaf nodes may have a size from 16×16 (i.e., the MinQTSize) to 128×128 (i.e., the CTU size). If the leaf quadtree node is 128×128, it will not be further split by the binary tree, since its size exceeds the MaxBTSize (i.e., 64×64). Otherwise, the leaf quadtree node can be further partitioned by the binary tree. The quadtree leaf node is therefore also the root node for the binary tree, and its binary tree depth is 0. When the binary tree depth reaches MaxBTDepth (i.e., 4), no further splitting is considered. When the binary tree node has width equal to MinBTSize (i.e., 4), no further horizontal splitting is considered. Similarly, when the binary tree node has height equal to MinBTSize, no further vertical splitting is considered. The leaf nodes of the binary tree are further processed by prediction and transform processing without any further partitioning. In the JEM, the maximum CTU size is 256×256 luma samples.
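The constraints in this example can be collected into a small function that reports which splits remain available for a node. This is a sketch under stated assumptions rather than JEM code: `QtbtParams` and `allowed_splits` are hypothetical names, MaxQTDepth is left out because the example gives no value for it, and the binary splits are named by the dimension they halve to sidestep the horizontal/vertical naming convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QtbtParams:
    """Example parameter values taken from the text (luma samples)."""
    ctu_size: int = 128
    min_qt_size: int = 16
    max_bt_size: int = 64
    max_bt_depth: int = 4
    min_bt_size: int = 4


def allowed_splits(width, height, bt_depth, params=QtbtParams()):
    """Return the set of splits still permitted for a node (hypothetical helper).

    bt_depth == 0 means the node is a quadtree node (or a binary tree root);
    once a binary split has been made, quadtree splitting is no longer used.
    """
    splits = set()
    if bt_depth == 0 and width == height and width > params.min_qt_size:
        splits.add("QT")
    bt_allowed = (
        max(width, height) <= params.max_bt_size
        and bt_depth < params.max_bt_depth
    )
    if bt_allowed:
        if width > params.min_bt_size:
            splits.add("BT_HALVE_WIDTH")
        if height > params.min_bt_size:
            splits.add("BT_HALVE_HEIGHT")
    return splits


print(allowed_splits(128, 128, 0))  # exceeds MaxBTSize: quadtree split only
print(allowed_splits(4, 16, 1))     # width at MinBTSize: only the height can be halved
```

A full partitioner would call such a check at every node and let the encoder's rate-distortion search choose among the permitted splits.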
In addition, the QTBT scheme supports the ability for the luma and chroma to have a separate QTBT structure. Currently, for P and B slices, the luma and chroma CTBs in one CTU share the same QTBT structure. However, for I slices, the luma CTB is partitioned into CUs by a QTBT structure, and the chroma CTBs are partitioned into chroma CUs by another QTBT structure. This means that a CU in an I slice consists of a coding block of the luma component or coding blocks of two chroma components, and a CU in a P or B slice consists of coding blocks of all three colour components.
In HEVC, inter prediction for small blocks is restricted to reduce the memory access of motion compensation, such that bi-prediction is not supported for 4×8 and 8×4 blocks, and inter prediction is not supported for 4×4 blocks. In the QTBT of the JEM, these restrictions are removed.
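The small-block restrictions described above amount to a simple lookup. The sketch below is illustrative only; the function name `allowed_inter_prediction` and the `"uni"`/`"bi"` labels are assumptions introduced for the example.

```python
def allowed_inter_prediction(width, height, hevc_restrictions=True):
    """Return the inter prediction types permitted for a block size.

    Hypothetical helper: with hevc_restrictions=False it models the QTBT
    of the JEM, where the HEVC small-block restrictions are removed.
    """
    if not hevc_restrictions:
        return {"uni", "bi"}
    if (width, height) == (4, 4):
        return set()        # no inter prediction for 4x4 blocks
    if (width, height) in {(4, 8), (8, 4)}:
        return {"uni"}      # bi-prediction is not supported
    return {"uni", "bi"}


print(allowed_inter_prediction(4, 8))
print(allowed_inter_prediction(4, 4, hevc_restrictions=False))
```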
The multi-type-tree (MTT) structure is a more flexible tree structure than QTBT. In MTT, tree types other than quad-tree and binary-tree are supported. For example, horizontal and vertical center-side triple-trees are introduced. MTT thus supports (a) quad-tree partitioning, (b) vertical binary-tree partitioning, (c) horizontal binary-tree partitioning, (d) vertical center-side triple-tree partitioning, and (e) horizontal center-side triple-tree partitioning, among other types.
There are two levels of trees: the region tree (quad-tree) and the prediction tree (binary-tree or triple-tree). A CTU is first partitioned by the region tree (RT). An RT leaf may be further split with the prediction tree (PT). A PT node may also be further split with PT until a maximum PT depth is reached. After entering PT, RT (quad-tree) splitting can no longer be used. A PT leaf is the basic coding unit and is still called a CU for convenience. A CU cannot be further split. Prediction and transform are both applied on the CU in the same way as in JEM-3 or QTBT.
The key benefit of triple-tree partitioning is that it complements quad-tree and binary-tree partitioning: triple-tree partitioning is able to capture objects located in the center of a block, whereas quad-tree and binary-tree partitioning always split along the block center. Further, the width and height of the partitions produced by the proposed triple trees are always powers of two, so no additional transforms are needed.
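The five MTT partition types and the power-of-two property can be illustrated with a short sketch. Two things here are assumptions not stated in the text: the center-side triple-tree is taken to divide a side in the ratio 1/4 : 1/2 : 1/4, and a "vertical" split is taken to mean a split along vertical lines (dividing the width). The names `split_block` and `is_power_of_two` are hypothetical.

```python
def split_block(width, height, split):
    """Return the (width, height) of each partition for one MTT split type."""
    if split == "QT":
        return [(width // 2, height // 2)] * 4
    if split == "BT_VER":   # assumed: divides the width in half
        return [(width // 2, height)] * 2
    if split == "BT_HOR":   # assumed: divides the height in half
        return [(width, height // 2)] * 2
    if split == "TT_VER":   # assumed 1/4 : 1/2 : 1/4 ratio along the width
        return [(width // 4, height), (width // 2, height), (width // 4, height)]
    if split == "TT_HOR":   # assumed 1/4 : 1/2 : 1/4 ratio along the height
        return [(width, height // 4), (width, height // 2), (width, height // 4)]
    raise ValueError(f"unknown split type: {split}")


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


# The middle partition of a triple-tree split covers the block center, and
# every resulting dimension stays a power of two.
parts = split_block(32, 32, "TT_VER")
print(parts)
print(all(is_power_of_two(w) and is_power_of_two(h) for w, h in parts))
```

Under the assumed ratio, a power-of-two parent side of at least 4 always yields power-of-two partition sides, which is why existing transform sizes suffice.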
The design of the two-level tree is mainly motivated by complexity reduction. Theoretically, the complexity of traversing a tree is T^D, where T denotes the number of split types and D is the depth of the tree. With the design of a two-level tree, and by restricting the first level to quad-tree only (e.g., reducing the number of split types T at certain levels), the complexity is significantly reduced while maintaining reasonable performance.
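A rough numeric illustration of this argument follows. The depths and split-type counts are hypothetical values chosen only to show the effect; the helper `split_sequences` is likewise an assumption and simply multiplies the number of split types available at each level.

```python
def split_sequences(split_types_per_level):
    """Multiply the number of split types available at each tree level."""
    total = 1
    for types in split_types_per_level:
        total *= types
    return total


# Single-level design: each of D = 4 levels may use any of T = 5 split types.
flat = split_sequences([5, 5, 5, 5])       # 5**4
# Two-level design: two region-tree levels (quad-tree only, T = 1) followed
# by two prediction-tree levels (2 binary + 2 triple types, T = 4).
two_level = split_sequences([1, 1, 4, 4])  # 1 * 1 * 4 * 4
print(flat, two_level)  # prints "625 16"
```

This count ignores the "no split" option and the branching into child nodes, so it is only an order-of-magnitude model of the T^D estimate.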
To further improve the coding efficiency on top of QTBT, an asymmetric binary tree is proposed. For example, a coding unit with size S is divided into two sub-CUs with sizes S/4 and 3S/4, either in the horizontal or in the vertical direction. In practice, the added available CU sizes are 12 and 24. In a further extended version of the tool, CU sizes 6 and 48 may be allowed.
One major issue with this method is that it is inconvenient if the width or height of a block is not a power of two. For example, transforms with sizes such as 12 and 24 need to be supported. Special handling may also be needed when splitting a block whose width or height is not a power of two.
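The arithmetic behind the asymmetric sizes can be checked with a few lines. This is a sketch assuming the S/4 and 3S/4 split described above; the function name `asymmetric_split_sizes` and the choice of parent sizes (16 and 32 for the basic tool, 8 and 64 for the extended one) are inferred from the listed sizes, not stated in the text.

```python
def asymmetric_split_sizes(size):
    """Return the two sub-CU sizes (S/4, 3S/4) along the split direction."""
    return size // 4, 3 * size // 4


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


for parent in (16, 32, 8, 64):
    small, large = asymmetric_split_sizes(parent)
    # The larger part is never a power of two, which is why new transform
    # sizes such as 12 and 24 would be required.
    print(parent, small, large, is_power_of_two(large))
```

For power-of-two parents, the 3S/4 part is three times a power of two, so it can never itself be a power of two.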