To efficiently transmit or record video, a video coding device (image coding device) that generates coded data by coding video, and a video decoding device (image decoding device) that generates a decoded image by decoding the coded data are used.
Specific examples of video coding schemes include H.264/MPEG-4 AVC; the scheme adopted in the KTA software, a codec jointly developed within VCEG (Video Coding Experts Group); the scheme adopted in the TMuC (Test Model under Consideration) software; and the scheme proposed for HEVC (High-Efficiency Video Coding), which is the successor to the foregoing codecs (NPL 1).
In these video coding schemes, images (pictures) that form video are managed using a hierarchical structure, which is made up of slices obtained by dividing an image, coding units obtained by dividing a slice, and blocks and partitions obtained by dividing a coding unit, and are generally coded/decoded in units of blocks.
In these coding schemes, in usual cases, a prediction image is generated on the basis of a locally decoded image obtained by coding and decoding an input image, transform coefficients are obtained by performing frequency transform, such as DCT (Discrete Cosine Transform), on a difference image (also referred to as a “residual image” or “prediction residual”) representing a difference between the prediction image and the input image in units of blocks, and the transform coefficients are coded.
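The residual and transform steps described above can be illustrated with a minimal sketch. It assumes a square block held as a list of lists and uses an orthonormal floating-point 2-D DCT-II; the function names `residual` and `dct_2d` are hypothetical, and practical codecs typically use integer approximations of the transform rather than this direct form.

```python
import math

def residual(input_block, prediction_block):
    """Difference image: input minus prediction, sample by sample."""
    return [[i - p for i, p in zip(row_in, row_pred)]
            for row_in, row_pred in zip(input_block, prediction_block)]

def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block (direct O(n^4) form)."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):          # vertical frequency
        for v in range(n):      # horizontal frequency
            acc = 0.0
            for y in range(n):
                for x in range(n):
                    acc += (block[y][x]
                            * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                            * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = scale(u) * scale(v) * acc
    return coeffs

# Illustrative data only: a flat input block and a flat prediction.
input_block = [[104] * 4 for _ in range(4)]
prediction_block = [[100] * 4 for _ in range(4)]
coeffs = dct_2d(residual(input_block, prediction_block))
```

With a flat residual, all of the energy falls into the single lowest-frequency (DC) coefficient, which is why transform coding tends to produce many zero-valued coefficients for well-predicted blocks.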
As specific examples of a scheme of coding transform coefficients, context-based adaptive variable length coding (CAVLC) and context-based adaptive binary arithmetic coding (CABAC) are known.
In CAVLC, individual transform coefficients are sequentially scanned to generate a one-dimensional vector, and then syntax elements representing the values of the individual transform coefficients, a syntax element representing the length of consecutive zeros (also referred to as a “run length”), and so forth are coded.
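The scan and run-length idea can be sketched as follows. This is a simplified illustration only: it assumes a zigzag scan of a square block and emits plain (run, level) pairs, which is not the actual syntax of CAVLC in H.264/MPEG-4 AVC. The names `zigzag_order` and `run_level_pairs` are hypothetical.

```python
def zigzag_order(n):
    """Return (y, x) positions of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):              # s = x + y indexes an anti-diagonal
        diag = [(y, s - y) for y in range(n) if 0 <= s - y < n]
        if s % 2 == 0:
            diag.reverse()                   # alternate the traversal direction
        order.extend(diag)
    return order

def run_level_pairs(block):
    """Scan the block to a 1-D vector and pair each non-zero level with
    the number of zeros that precede it (its run length)."""
    scanned = [block[y][x] for y, x in zigzag_order(len(block))]
    pairs, run = [], 0
    for coeff in scanned:
        if coeff == 0:
            run += 1
        else:
            pairs.append((run, coeff))
            run = 0
    return scanned, pairs

# Illustrative coefficients only.
block = [[7, 3, 0, 0],
         [2, 0, 0, 0],
         [0, -1, 0, 0],
         [0, 0, 0, 0]]
scanned, pairs = run_level_pairs(block)
```

Trailing zeros after the last non-zero coefficient produce no pair in this sketch, which reflects why run-length representations are compact for blocks whose energy is concentrated at low frequencies.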
In CABAC, a binarization process is performed on various syntax elements representing transform coefficients, and binary data obtained through the binarization process is arithmetically coded. Here, the various syntax elements include a flag indicating whether or not a transform coefficient is equal to 0, that is, a flag significant_coeff_flag indicating the presence/absence of a non-zero transform coefficient (also referred to as a transform coefficient presence/absence flag), and syntax elements last_significant_coeff_x and last_significant_coeff_y indicating the position of the last non-zero transform coefficient in processing order.
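As an illustration of the syntax elements named above, the following sketch derives them from a block of transform coefficients. It assumes that the processing order is supplied as a list of (x, y) positions and, as a simplification, lists a flag for every position up to and including the last non-zero coefficient; the function name `significance_syntax` is hypothetical.

```python
def significance_syntax(block, scan):
    """Derive last_significant_coeff_x/y and the significant_coeff_flag
    values for positions up to the last non-zero coefficient.

    scan: list of (x, y) positions in processing order (an assumption;
    the order actually used by a given codec may differ).
    """
    last_idx = -1
    for i, (x, y) in enumerate(scan):
        if block[y][x] != 0:
            last_idx = i
    if last_idx < 0:
        return None                          # block has no non-zero coefficient
    last_x, last_y = scan[last_idx]
    flags = [int(block[y][x] != 0) for x, y in scan[:last_idx + 1]]
    return {
        "last_significant_coeff_x": last_x,
        "last_significant_coeff_y": last_y,
        "significant_coeff_flag": flags,
    }

# Illustrative data with a raster scan, chosen only for readability.
block = [[5, 0, 1, 0],
         [0, 0, 0, 0],
         [0, 2, 0, 0],
         [0, 0, 0, 0]]
raster = [(x, y) for y in range(4) for x in range(4)]
syntax = significance_syntax(block, raster)
```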
In CABAC, in the case of coding one symbol (1 bit of binary data, also referred to as Bin), a context index assigned to a target frequency component to be processed is referred to, and arithmetic coding is performed in accordance with the probability of occurrence indicated by a probability state index included in the context variable designated by the context index. Also, the probability of occurrence designated by the probability state index is updated every time a symbol is coded.
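The interplay between a context index, a context variable, and the probability update can be sketched as follows. This is not the CABAC state machine: the table-driven probability state index is replaced here by a simple exponentially decaying probability estimate, and the ideal code length in bits stands in for the arithmetic coder itself. The class `ContextVariable`, its `rate` parameter, and the function `code_bins` are hypothetical.

```python
import math

class ContextVariable:
    """Adaptive probability model for one context (simplified stand-in)."""

    def __init__(self, p_one=0.5, rate=1.0 / 16.0):
        self.p_one = p_one      # current estimate of P(bin == 1)
        self.rate = rate        # adaptation speed (assumed value)

    def cost_bits(self, bin_value):
        """Ideal code length of the bin under the current estimate."""
        p = self.p_one if bin_value == 1 else 1.0 - self.p_one
        return -math.log2(p)

    def update(self, bin_value):
        """Move the estimate toward the bin that was just coded."""
        self.p_one += self.rate * (bin_value - self.p_one)

def code_bins(bins, ctx_indices, contexts):
    """Code each bin with the context variable selected by its context
    index, updating that variable after every bin."""
    total_bits = 0.0
    for bin_value, ctx_idx in zip(bins, ctx_indices):
        ctx = contexts[ctx_idx]
        total_bits += ctx.cost_bits(bin_value)
        ctx.update(bin_value)
    return total_bits

contexts = [ContextVariable(), ContextVariable()]
total_bits = code_bins([1, 1, 1, 1], [0, 0, 0, 0], contexts)
```

Because the estimate for context 0 drifts toward 1 as ones are coded, the four bins cost less than four bits in total, while context 1, which was never selected, keeps its initial estimate.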
NPL 1 describes, for example, a technique of (1) dividing a frequency region related to a target block to be processed into a plurality of partial regions, (2) assigning, to frequency components included in a partial region on a low-frequency side, context indices (also referred to as position contexts) that are determined in accordance with the positions of the frequency components in the frequency region, and (3) assigning, to frequency components included in a partial region on a high-frequency side, context indices (also referred to as neighbor reference contexts) that are determined in accordance with the number of non-zero transform coefficients in frequency components around each of the frequency components.
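A two-region assignment of this kind can be sketched as follows. The sketch is illustrative and is not taken from NPL 1: the shape of the low-frequency region (a square of side `low_size`), the neighbor template `NEIGHBOR_OFFSETS`, the cap `max_count`, and the index numbering are all assumptions.

```python
# Hypothetical template: positions to the right of and below the target,
# assumed to have been processed before it.
NEIGHBOR_OFFSETS = ((1, 0), (2, 0), (0, 1), (0, 2), (1, 1))

def count_nonzero_neighbors(x, y, significant):
    """Count non-zero coefficients around (x, y) inside the block."""
    height, width = len(significant), len(significant[0])
    return sum(significant[y + dy][x + dx]
               for dx, dy in NEIGHBOR_OFFSETS
               if x + dx < width and y + dy < height)

def context_index(x, y, significant, low_size=2, max_count=4):
    """Position context on the low-frequency side, neighbor reference
    context on the high-frequency side."""
    if x < low_size and y < low_size:
        return y * low_size + x                      # position context
    base = low_size * low_size                       # first neighbor context
    return base + min(count_nonzero_neighbors(x, y, significant), max_count)

# Illustrative map of non-zero coefficients (1 = non-zero).
significant = [[1, 1, 0, 0],
               [1, 0, 1, 0],
               [0, 1, 0, 0],
               [0, 0, 0, 0]]
```

In this numbering, indices 0 to 3 are position contexts and indices 4 to 8 are neighbor reference contexts, so the two kinds never share a context variable.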
NPLs 2 and 3 suggest reduction of the number of context indices.
NPL 4 suggests an improvement to the scan order of various syntax elements.
NPL 5 suggests division of a frequency region related to a target block to be processed into a plurality of sub-blocks, and decoding of a flag indicating whether or not each sub-block includes a non-zero transform coefficient.
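The sub-block flag can be illustrated with a short sketch. It assumes square sub-blocks of side `sub_size` (4 here, an assumed value) and shows the derivation of the flags from known coefficients, as would be done on the coding side; the function name `sub_block_flags` is hypothetical.

```python
def sub_block_flags(block, sub_size=4):
    """Return a 2-D list with one flag per sub-block: 1 if the sub-block
    contains at least one non-zero transform coefficient, else 0."""
    height, width = len(block), len(block[0])
    flags = []
    for sy in range(0, height, sub_size):
        row = []
        for sx in range(0, width, sub_size):
            has_nonzero = any(
                block[y][x] != 0
                for y in range(sy, min(sy + sub_size, height))
                for x in range(sx, min(sx + sub_size, width)))
            row.append(int(has_nonzero))
        flags.append(row)
    return flags

# Illustrative 8x8 block with two non-zero coefficients.
block = [[0] * 8 for _ in range(8)]
block[0][0] = 9
block[5][2] = 3
flags = sub_block_flags(block)
```

A decoder that reads a flag of 0 for a sub-block can skip every coefficient-level syntax element of that sub-block.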
NPL 6 describes, for example, a technique of performing the following steps (1) to (5) to derive the context indices that are referred to when a transform coefficient presence/absence flag (significant_coeff_flag) is decoded (coded), where steps (1) to (4) are performed in a case where the size of a target block to be processed is a certain size or larger.
(1) Divide a frequency region of a target block to be processed into a plurality of partial regions. Also, perform the following steps (2) to (4) in accordance with whether each of the plurality of partial regions obtained through division is on the low-frequency side, in the intermediate-frequency region, or on the high-frequency side.
(2) For frequency components included in a partial region on the low-frequency side, derive context indices (also referred to as position contexts) that are determined in accordance with the positions of the frequency components in the frequency region.
(3) For frequency components included in a partial region in an intermediate-frequency region, derive context indices (also referred to as neighbor reference contexts) that are determined in accordance with the number of non-zero coefficients in frequency components around each of the frequency components.
(4) For frequency components included in a partial region on the high-frequency side, derive fixed context indices.
(5) In a case where the size of a target block to be processed is a certain size or smaller, derive context indices (also referred to as position contexts) that are determined in accordance with the positions of the frequency components in the frequency region.
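Steps (1) to (5) can be sketched as the following dispatch. The sketch is not taken from NPL 6: the size threshold, the use of x + y to delimit the partial regions, the limits `low_limit` and `high_limit`, and the cap on the neighbor count are all assumptions. Because the numbering of context indices is not given here, the hypothetical function `derive_context` returns a (kind, local index) pair instead of a single global index, and it assumes that blocks below `size_threshold` fall under step (5).

```python
def derive_context(x, y, block_size, neighbor_count,
                   size_threshold=8, low_limit=2, high_limit=None,
                   max_count=4):
    """Select the kind of context used for significant_coeff_flag at (x, y).

    neighbor_count: number of non-zero coefficients around (x, y),
    computed by the caller.
    """
    if block_size < size_threshold:
        # Step (5): small block, position context over the whole block.
        return ("position", y * block_size + x)

    if high_limit is None:
        high_limit = block_size

    # Step (1): classify the position into a partial region.
    s = x + y
    if s < low_limit:
        # Step (2): low-frequency side, position context.
        return ("position", y * low_limit + x)
    if s < high_limit:
        # Step (3): intermediate region, neighbor reference context.
        return ("neighbor", min(neighbor_count, max_count))
    # Step (4): high-frequency side, one fixed context.
    return ("fixed", 0)
```

For example, with the assumed defaults, `derive_context(0, 1, 8, 3)` selects a position context, `derive_context(3, 2, 8, 7)` selects a neighbor reference context, and `derive_context(7, 6, 8, 1)` selects the fixed context.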