As the resolutions of televisions and displays advance to ultra high definition (4K) and extra ultra high definition (8K), and as a new generation of cloud computing and information processing modes and platforms, typified by the remote desktop, is developed and popularized, video image data compression needs to be applied to higher-resolution composite images that include both computer screen images and images shot by a camera. A data compression technology for video images that offers an ultra high compression rate at extremely high quality has therefore become indispensable.
Performing ultra high efficiency compression on a video image by fully exploiting the characteristics of 4K/8K images (or pictures) and computer screen images (or pictures) is also a main objective of the latest international video compression standard, High Efficiency Video Coding (HEVC), which is under formulation, and of a number of other international, national and industrial standards.
A natural form of a digital video signal is a sequence of images (or pictures). An image is usually a rectangular region formed by a plurality of pixels. A digital video signal, sometimes called a video sequence or simply a sequence, is formed by dozens to tens of thousands of frames of images (or pictures). Coding a digital video signal means coding each of its images. At any time, the image being coded is called the current coding image. Similarly, decoding a video bitstream (sometimes called a bitstream or simply a stream) obtained by compressing the digital video signal means decoding the bitstream of each image. At any time, the image being decoded is called the current decoding image. The current coding image and the current decoding image may be collectively called the current image.
In almost all international standards for video image coding, such as Moving Picture Experts Group (MPEG-1/2/4), H.264/Advanced Video Coding (AVC) and HEVC, when an image is coded (and correspondingly decoded), the image may be partitioned into a plurality of sub-images of M×M pixels, called coding blocks (decoding blocks from the decoding point of view; collectively, coding and decoding blocks) or “Coding Units (CUs)”, and the blocks of the image are coded one by one with a CU as the basic coding unit. M is usually 4, 8, 16, 32 or 64. Coding a video sequence therefore means sequentially coding the CUs of its images one by one. At any time, the CU being coded is called the current coding CU. Similarly, decoding a bitstream of a video image sequence means sequentially decoding the CUs of its images to finally reconstruct the whole video sequence. At any time, the CU being decoded is called the current decoding CU. The current coding CU and the current decoding CU may be collectively called the current CU.
In order to adapt to the differing contents and properties of different parts of an image and to code each part in the most effective, targeted way, the CUs in an image may have different sizes; for example, some CUs may have a size of 8×8 while others have a size of 64×64. In order to seamlessly splice CUs of different sizes, an image is usually first partitioned into “Largest Coding Units (LCUs)” of identical size, e.g., N×N pixels, and each LCU may then be further partitioned into multiple tree-structured CUs whose sizes need not be the same. For this reason, LCUs may also be called “Coding Tree Units (CTUs)”. For example, an image may first be partitioned into LCUs of identical size, e.g., 64×64 pixels (N=64). Among these LCUs, a certain LCU may be formed by three CUs of 32×32 pixels and four CUs of 16×16 pixels, so that these seven tree-structured CUs form a complete CTU. Another LCU may be formed by two CUs of 32×32 pixels, three CUs of 16×16 pixels and twenty CUs of 8×8 pixels, so that these 25 tree-structured CUs form a complete CTU. Coding an image means sequentially coding the CUs in its CTUs. In the international standard HEVC, LCU and CTU are synonyms. A CU whose size is equal to that of a CTU is called a CU with depth 0. The four CUs obtained by equally partitioning a CU with depth 0 into quarters, namely its upper-left, upper-right, lower-left and lower-right parts, are called CUs with depth 1. The CUs obtained by equally partitioning a CU with depth 1 into quarters in the same way are called CUs with depth 2, and the CUs obtained by equally partitioning a CU with depth 2 into quarters in the same way are called CUs with depth 3.
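The relationship between CU depth, CU size and CTU coverage described above can be illustrated with a minimal sketch. The function names and the default value N=64 are assumptions made for illustration only; they are not taken from any standard or codec implementation. Note that the area check is only a necessary condition: it confirms that a set of CUs has the same pixel count as the CTU, not that the CUs form a valid quadtree.

```python
CTU_SIZE = 64  # N in the text; assumed value for this illustration


def cu_size_at_depth(depth, ctu_size=CTU_SIZE):
    """Side length of a CU at a given quadtree depth (depth 0 is the CTU)."""
    return ctu_size >> depth


def split_into_quarters(x, y, size):
    """Return the four equal quadrants of a square CU as (x, y, size)."""
    half = size // 2
    return [
        (x, y, half),                # upper-left
        (x + half, y, half),         # upper-right
        (x, y + half, half),         # lower-left
        (x + half, y + half, half),  # lower-right
    ]


def covers_ctu_area(cu_counts, ctu_size=CTU_SIZE):
    """Check that CUs given as {size: count} have exactly the CTU's area."""
    area = sum(size * size * count for size, count in cu_counts.items())
    return area == ctu_size * ctu_size


# The two example LCUs from the text (7 CUs and 25 CUs respectively).
first_lcu = {32: 3, 16: 4}
second_lcu = {32: 2, 16: 3, 8: 20}
```

With N=64, depths 0 to 3 correspond to CU sizes 64, 32, 16 and 8, and both example LCUs account for exactly 64×64 = 4096 pixels.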
A CU may be further partitioned into a plurality of sub-regions. The sub-regions may include, but are not limited to, one or more Prediction Units (PUs), one or more Transform Units (TUs) and one or more Asymmetric Motion Partitioning (AMP) regions.
Pixel representation formats may include the following formats.
1) A colour pixel usually consists of three components. The two most common pixel colour formats are the Green, Blue and Red (GBR) colour format, consisting of a green component, a blue component and a red component, and the YUV colour format, consisting of a luma component and two chroma components. The colour formats collectively called YUV colour formats actually include multiple colour formats, such as the YCbCr colour format. When a CU is coded, it may therefore be partitioned into three component planes (a G plane, a B plane and an R plane, or a Y plane, a U plane and a V plane), and the three component planes may be coded separately; alternatively, the three components of each pixel may be bundled into a triple, and the CU formed by these triples may be coded in its entirety. The former arrangement of pixels and components is called the planar format of an image (and its CUs), while the latter is called the packed format of the image (and its CUs). The GBR colour format and the YUV colour format are both three-component representation formats of a pixel.
2) Besides the three-component representation format, another common representation format of a pixel is the palette index representation format, in which the numerical value of a pixel is represented by an index into a palette. The numerical values, or approximate numerical values, of the three components of the pixel to be represented are stored in a palette space, and the address of an entry in the palette is called the index of the pixel stored at that address. One index may represent one component of a pixel, or alternatively all three components of a pixel. There may be one or more palettes. When there are multiple palettes, a complete index may be formed by two parts, i.e. a palette number and an index into the palette with that palette number. The index representation format of a pixel represents the pixel with an index. It is also called the indexed colour or pseudo colour representation format of the pixel, and a pixel so represented is often simply called an indexed pixel, a pseudo pixel, a pixel index or an index. An index is sometimes also called an index number. Representing a pixel in the index representation format may also be called indexing or indexation.
3) Other common pixel representation formats include a CMYK representation format and a grayscale representation format.
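The planar and packed arrangements and the palette index representation described in the items above can be sketched as follows. This is a simplified illustration under stated assumptions: pixels are three-component tuples, a single palette is used, one index represents all three components, and palette entries are assigned in order of first appearance. All function names are hypothetical.

```python
def to_planar(packed):
    """Split packed (c0, c1, c2) triples into three component planes."""
    return [list(plane) for plane in zip(*packed)]


def to_packed(planes):
    """Bundle three component planes back into per-pixel triples."""
    return list(zip(*planes))


def build_palette(pixels):
    """Represent pixels by palette indices.

    Assumption: each distinct pixel value gets the next free palette
    address the first time it appears.
    """
    palette = []   # palette[address] holds the three-component value
    index_of = {}  # reverse lookup: pixel value -> palette address
    indices = []
    for pixel in pixels:
        if pixel not in index_of:
            index_of[pixel] = len(palette)
            palette.append(pixel)
        indices.append(index_of[pixel])
    return palette, indices


def from_indices(palette, indices):
    """Recover the pixels from their index representation."""
    return [palette[i] for i in indices]


# Five packed pixels using only two distinct colours.
pixels = [(255, 0, 0), (0, 0, 0), (255, 0, 0), (0, 0, 0), (0, 0, 0)]
planes = to_planar(pixels)
palette, indices = build_palette(pixels)
```

Here the five pixels reduce to a two-entry palette plus one small index per pixel, which is why the index representation suits screen content with few distinct colours.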
Depending on whether the chroma components are down-sampled, the YUV colour format may be further subdivided into a plurality of sub-formats, for example: a YUV4:4:4 pixel colour format, in which one pixel is formed by one Y component, one U component and one V component; a YUV4:2:2 pixel colour format, in which two horizontally adjacent pixels are formed by two Y components, one U component and one V component; and a YUV4:2:0 pixel colour format, in which four horizontally and vertically adjacent pixels arranged in 2×2 spatial positions are formed by four Y components, one U component and one V component. One component is usually represented by a number of 8 to 16 bits. The YUV4:2:2 and YUV4:2:0 pixel colour formats are both obtained by down-sampling the chroma components of the YUV4:4:4 pixel colour format. One pixel component may also be called one pixel sample, or simply one sample.
The most basic element in coding or decoding may be one pixel, one pixel component, or one pixel index (i.e. indexed pixel). Whichever of these is adopted as the most basic element for coding or decoding may be called one pixel sample, and sometimes also one pixel value or simply one sample.
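The component counts of the three YUV sub-formats can be worked out with a short sketch. The table of subsampling factors follows directly from the definitions above; the function names, and the assumption that the image dimensions are divisible by the subsampling factors, are for illustration only.

```python
# (horizontal, vertical) chroma down-sampling factors per sub-format
CHROMA_SUBSAMPLING = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
}


def sample_counts(width, height, fmt):
    """Return the number of (Y, U, V) samples in a width x height image.

    Assumption: width and height are divisible by the chroma factors.
    """
    sx, sy = CHROMA_SUBSAMPLING[fmt]
    luma = width * height
    chroma = (width // sx) * (height // sy)
    return luma, chroma, chroma


def bits_per_pixel(fmt, bit_depth=8):
    """Average number of bits per pixel before any compression."""
    sx, sy = CHROMA_SUBSAMPLING[fmt]
    return bit_depth * (1 + 2 / (sx * sy))
```

At 8 bits per component this gives 24, 16 and 12 bits per pixel for YUV4:4:4, YUV4:2:2 and YUV4:2:0 respectively.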
An outstanding characteristic of a computer screen image is that the same image usually contains many similar, or even identical, pixel patterns. For example, the Chinese or foreign characters that frequently appear in a computer screen image are formed from a few basic strokes, so many similar or identical strokes can be found in the same image. Common menus, icons and the like in a computer screen image also contain many similar or identical patterns. The coding techniques usually adopted for image and video compression may therefore include the following copying techniques.
1) One copying technique is intraframe string copying, also called intraframe string matching, string matching, string copying or pixel string copying. In pixel string copying, a current coding block or current decoding block (called a current block) may be partitioned into multiple pixel sample strings of variable length. Here, a string refers to pixel samples in a two-dimensional region of any shape arranged into a string whose length is far larger than its width (for example, a string with a width of one pixel sample and a length of 37 pixel samples, or a string with a width of two pixel samples and a length of 111 pixel samples, usually under, but not limited to, the condition that the length is an independent coding or decoding parameter while the width is predetermined or derived from another coding or decoding parameter). The basic operation of string copying coding or decoding is, for each coding string or decoding string (called a current string for short) in the current block, to copy a reference string from a reconstructed reference pixel sample set and assign the numerical values of the reference string to the current string. The copying parameters of the string copying technique may include: a displacement vector of the current string, which indicates the relative position between the reference string and the current string; and a copying length, i.e. copying size, of the current string, which indicates the length, i.e. the number of pixel samples, of the current string. The length of the current string is equal to the length of the reference string. Each current string has one displacement vector and one copying length, so the number of displacement vectors and the number of copying lengths are both equal to the number of strings into which the current block is partitioned.
2) Another copying technique is palette index copying, also called palette copying or index copying. In palette coding and the corresponding decoding process, a palette is first constructed or acquired, then some or all of the pixels of a current coding block or current decoding block (called a current block for short) are represented by indices into the palette, and then the indices are coded and decoded. The indices may be coded or decoded in, but not limited to, the following manner. The indices of a current block may be partitioned into multiple variable-length index strings for index string copying coding and decoding. The basic operation of index string copying coding and decoding is, for each index coding string or index decoding string (called a current index string for short) in the current block, to copy a reference index string from an indexed reconstructed reference pixel sample set and assign the index values of the reference index string to the current index string. The copying parameters of the index string copying technique may include: a displacement vector of the current index string, which indicates the relative position between the reference index string and the current index string; and a copying length, i.e. copying size, of the current index string, which indicates the length, i.e. the number of corresponding pixel samples, of the current index string. The length of the current index string is equal to the length of the reference index string. Each current index string has one displacement vector and one copying length, so the number of displacement vectors and the number of copying lengths are both equal to the number of index strings into which the current block is partitioned.
3) Still another copying technique is a mixed copying technique that combines pixel string copying and index copying. When a current coding block or current decoding block (called a current block for short) is coded or decoded, the pixel string copying technique may be adopted for some or all of its pixels, and the index copying technique may be adopted for some or all of its pixels.
4) Other copying techniques further include a block copying technique, a micro-block copying technique, a strip copying technique, a rectangular copying technique, a mixed copying technique mixing a plurality of copying techniques, and the like.
Here, a block in the block copying technique, a micro-block in the micro-block copying technique, a strip in the strip copying technique, a string in the string copying technique, a rectangle in the rectangular copying technique and a pixel index string in the palette index technique may be collectively called pixel sample segments, or sample segments for short. The basic element of a sample segment may be a pixel, a pixel component or a pixel index. Each sample segment has one copying parameter that represents the relationship between the current pixel sample segment and its reference pixel sample segment. One copying parameter may include a plurality of copying parameter components. The copying parameter components may include at least: a displacement vector horizontal component, a displacement vector vertical component, a copying length, a copying width, a copying height, a rectangle width, a rectangle length and an unmatched pixel (also called a reference-free pixel, i.e. a non-copying pixel that is not copied from another place).
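The basic operation of string copying at the decoder can be illustrated with a minimal sketch. It makes several simplifying assumptions that the text above does not mandate: strings have a width of one sample, samples are stored in a flat list in horizontal raster order, the reference string is traversed in the same raster order as the current string, and unmatched pixels are not handled. The names `CopyParam`, `copy_string` and `decode_strings` are hypothetical.

```python
from typing import NamedTuple


class CopyParam(NamedTuple):
    dx: int      # displacement vector horizontal component
    dy: int      # displacement vector vertical component
    length: int  # copying length, in pixel samples


def copy_string(recon, width, start, param):
    """Copy one reference string onto the current string, in place.

    recon is a flat list of samples in raster order for an image that is
    `width` samples wide; start is the position of the first sample of
    the current string. Returns the position just after the string.
    """
    offset = param.dy * width + param.dx
    if offset >= 0 or start + offset < 0:
        raise ValueError("reference must lie in already reconstructed samples")
    # Copying sample by sample lets the reference overlap the current string.
    for i in range(param.length):
        recon[start + i] = recon[start + i + offset]
    return start + param.length


def decode_strings(recon, width, start, params):
    """Apply one copying parameter per string, string after string."""
    pos = start
    for param in params:
        pos = copy_string(recon, width, pos, param)
    return pos


# A 4x2 image: the first row is reconstructed, the second row is current.
recon = [1, 2, 3, 4, 0, 0, 0, 0]
end = decode_strings(recon, 4, 4, [CopyParam(0, -1, 2), CopyParam(-1, 0, 2)])
```

In this example the first string copies two samples from the row above, and the second string repeats the sample to its left twice, so the second row becomes 1, 2, 2, 2. One displacement vector and one copying length are consumed per string, matching the one-to-one relationship stated in the text. The same loop applies to index string copying if the list holds palette indices instead of pixel values.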
FIG. 1 shows an exemplary scanning manner. At present, scanning proceeds one complete row (or column) at a time: only after a complete row (or column) has been scanned is the next row (or column) scanned. In the related coding/decoding technology, an image can therefore be scanned only in this fixed manner, which may considerably limit image coding compression efficiency and image decoding decompression efficiency.
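The fixed scanning manner described above can be sketched as a function that lists sample positions in scan order. This is an illustration only; the function name and the (x, y) coordinate convention are assumptions, and FIG. 1 itself is not reproduced here.

```python
def fixed_scan(width, height, by_rows=True):
    """List (x, y) positions in a fixed row-by-row or column-by-column scan.

    Each complete row (or column) is scanned before the next one starts,
    always in the same direction.
    """
    if by_rows:
        return [(x, y) for y in range(height) for x in range(width)]
    return [(x, y) for x in range(width) for y in range(height)]


row_order = fixed_scan(3, 2)
column_order = fixed_scan(3, 2, by_rows=False)
```

For a 3×2 block, the row scan visits (0,0), (1,0), (2,0) and then (0,1), (1,1), (2,1). Consecutive positions at a row boundary, such as (2,0) and (0,1), are not spatial neighbours, which is one way to see why a single fixed order may not suit every image region.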