Multiple video compression methods may be used to compress video data in order to minimize the bandwidth required for transmitting the video data. The video compression methods include intra-frame compression and inter-frame compression. At present, an inter-frame compression method based on motion estimation is often used. Specifically, a process in which a coding end of an image uses the inter-frame compression method to compress and code the image includes: splitting, by the coding end, a to-be-coded image block into several image sub-blocks of a same size; for each image sub-block, searching a reference image for the image block that best matches the current image sub-block, and using it as a prediction block; subtracting each pixel value of the prediction block from the corresponding pixel value of the image sub-block to obtain a residual; performing entropy coding on the values obtained after the residual is transformed and quantized; and finally sending, to a decoding end, the bit stream obtained from the entropy coding together with motion vector information, where the motion vector information indicates a location difference between the current image sub-block and the prediction block. After obtaining the bit stream, the decoding end of the image first performs entropy decoding to obtain the corresponding residual and the corresponding motion vector information; obtains the corresponding matched image block (that is, the prediction block) from the reference image according to the motion vector information; and then adds the value of each pixel point in the matched image block to the value of the corresponding pixel point in the residual to obtain the value of each pixel point in the current image sub-block.
Intra-frame prediction uses information inside a current image to predict an image block and thereby obtain a prediction block.
A coding end obtains the pixels of the prediction block according to a prediction mode, a prediction direction, and pixel values around the image block, and subtracts each pixel of the prediction block from the corresponding pixel of the image block to obtain a residual, where the residual is written into a code stream after undergoing transform, quantization, and entropy coding. A decoding end parses the code stream; obtains a residual block after performing entropy decoding, de-quantization, and inverse transform on the code stream; obtains the prediction block according to the prediction mode, the prediction direction, and the pixel values around the image block; and adds each pixel of the residual block to the corresponding pixel of the prediction block to obtain a reconstructed image block.
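The two prediction paths described above share the same residual and reconstruction steps, which can be sketched as follows. This is a minimal illustration, not the procedure of any particular standard: the full-search strategy, the sum of absolute differences (SAD) matching cost, the horizontal intra direction, and all function names are assumptions chosen for the example, and transform, quantization, and entropy coding are omitted.

```python
# Illustrative sketch only: blocks are lists of rows of integer pixel values.

def get_block(image, top, left, size):
    """Cut a size x size block out of image at row top, column left."""
    return [row[left:left + size] for row in image[top:top + size]]

def sad(a, b):
    """Sum of absolute differences (assumed matching cost)."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def motion_search(reference, block, top, left, search_range):
    """Full search around (top, left); returns (motion_vector, prediction_block)."""
    size = len(block)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= height - size and 0 <= x <= width - size:
                candidate = get_block(reference, y, x, size)
                cost = sad(block, candidate)
                if best is None or cost < best[0]:
                    best = (cost, (dy, dx), candidate)
    return best[1], best[2]

def horizontal_intra_prediction(left_column, size):
    """Intra prediction, horizontal direction: each row repeats its left neighbour."""
    return [[left_column[r]] * size for r in range(size)]

def subtract(block, prediction):
    """Coding end: residual = block - prediction."""
    return [[b - p for b, p in zip(rb, rp)] for rb, rp in zip(block, prediction)]

def add(prediction, residual):
    """Decoding end: reconstructed block = prediction + residual."""
    return [[p + r for p, r in zip(rp, rr)] for rp, rr in zip(prediction, residual)]

# Inter-frame example: the current sub-block sits at (4, 4) but its content
# comes from (5, 6) in the reference image, with one pixel changed.
reference = [[3 * x + 7 * y for x in range(16)] for y in range(16)]
current_block = get_block(reference, 5, 6, 4)
current_block[0][0] += 1
motion_vector, inter_prediction = motion_search(reference, current_block, 4, 4, 2)
inter_residual = subtract(current_block, inter_prediction)
inter_reconstructed = add(inter_prediction, inter_residual)

# Intra-frame example: predict from the column of pixels left of the block.
intra_block = [[10, 11, 10, 10], [20, 20, 21, 20], [30, 30, 30, 29], [40, 41, 40, 40]]
intra_prediction = horizontal_intra_prediction([10, 20, 30, 40], 4)
intra_residual = subtract(intra_block, intra_prediction)
intra_reconstructed = add(intra_prediction, intra_residual)
```

In both cases the decoding end can rebuild the block exactly only because no lossy step is applied to the residual here; once the residual is transformed and quantized, the reconstruction becomes approximate.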
The concepts of a coding unit, a prediction unit, and a transform unit exist in an existing video coding and decoding standard. The coding unit is an image block operated on during coding at the coding end or decoding at the decoding end. The prediction unit is an image block that has an independent prediction mode in the coding unit. A prediction block is an image block operated on during prediction of the coding unit, and one prediction unit may contain multiple prediction blocks. The transform unit is an image block operated on during transform of the coding unit, where in this case, the image block may also be called a transform block. Considering that difference signals inside a prediction block are strongly correlated, large-block transform brings higher energy concentration performance than small-block transform. In a broader sense, one image block may contain one or more prediction blocks, and prediction is performed by using a prediction block as a unit at the coding and decoding ends; and one image block contains one or more transform blocks, and transform is performed by using a transform block as a unit at the coding and decoding ends.
In an existing video coding and decoding standard, such as a Moving Picture Experts Group (MPEG) standard or H.264/AVC (Advanced Video Coding), one image block, called a macroblock or image block, a super-macroblock or super image block, or the like, is split into several image sub-blocks. Sizes of these image sub-blocks may be 64×64, 64×32, 32×64, 32×32, 32×16, 16×32, 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, 4×4, and the like. The preceding motion estimation and motion compensation are performed for the image sub-blocks by using these sizes. A coding end of an image needs to send a code word that identifies a splitting manner of the image block to a decoding end of the image, so that the decoding end learns the splitting manner used at the coding end and determines, according to the splitting manner and motion vector information, a corresponding prediction block. In the existing video coding and decoding standard, each of these image sub-blocks is an N×M rectangular block (both N and M are integers greater than 0), and one of N and M is an integer multiple of the other.
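The relationship between a splitting manner and the resulting image sub-blocks can be sketched as follows. The code-word table is hypothetical and does not reproduce the syntax of any standard; sizes are given as (height, width) pairs for this example only.

```python
def split_block(height, width, sub_height, sub_width):
    """Return (top, left, sub_height, sub_width) for each sub-block, in raster order."""
    if height % sub_height or width % sub_width:
        raise ValueError("sub-block size must divide the block size")
    return [(top, left, sub_height, sub_width)
            for top in range(0, height, sub_height)
            for left in range(0, width, sub_width)]

# Hypothetical code words for splitting manners of a 16x16 block,
# mapped to (sub_height, sub_width). Not taken from any standard.
SPLIT_MODES = {0: (16, 16), 1: (16, 8), 2: (8, 16), 3: (8, 8)}

def decode_split(code_word, height=16, width=16):
    """Decoding end: recover the sub-block layout from the received code word."""
    sub_height, sub_width = SPLIT_MODES[code_word]
    return split_block(height, width, sub_height, sub_width)

two_halves = decode_split(2)   # two sub-blocks, each 8 rows by 16 columns
quarters = decode_split(3)     # four 8x8 sub-blocks
```

Because the decoding end derives the layout from the code word alone, the coding end only has to transmit the code word and one motion vector per sub-block.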
In an existing video coding and decoding technology, a transform matrix may be used to remove correlation in the residual of an image block, that is, to remove redundant information of the image block so as to improve coding efficiency. Generally, a two-dimensional transform is used to transform a data block in an image block. That is, the coding end multiplies the residual information of the data block by one N×M transform matrix and by the transpose matrix of the N×M transform matrix to obtain a transform coefficient. The preceding step may be described by using the following formula:
f = T′ × C × T
where C represents the residual information of a data block, T and T′ represent a transform matrix and the transpose matrix of the transform matrix, and f represents the transform coefficient matrix obtained after the residual information of the data block is transformed. The transform matrix may be a discrete cosine transform (DCT) matrix, an integer transform matrix, a Karhunen-Loève transform (KLT) matrix, or the like. The KLT can better account for the texture information of an image block or of an image block residual, and therefore using the KLT may achieve a better effect.
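The forward transform f = T′ × C × T can be illustrated with a small sketch. It assumes a square 4×4 block and an orthonormal DCT-II matrix whose columns hold the basis vectors, a convention chosen here so that the formula above applies as written; the function names and the sample residual are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix T; column k holds the k-th basis vector."""
    return [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for k in range(n)]
            for i in range(n)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def forward_transform(c, t):
    """f = T' x C x T"""
    return matmul(matmul(transpose(t), c), t)

def energy(m):
    return sum(v * v for row in m for v in row)

t = dct_matrix(4)
# A smooth residual block (slow gradient), typical of well-predicted content.
residual = [[8 + i + j for j in range(4)] for i in range(4)]
f = forward_transform(residual, t)

# Fraction of the total energy carried by the lowest-frequency coefficient.
dc_share = f[0][0] ** 2 / energy(f)
```

Because T is orthogonal, the total energy of f equals that of the residual; for a smooth block such as this one, most of that energy ends up in the lowest-frequency coefficient f[0][0].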
Performing the preceding processing on the residual information of an image block is equivalent to transforming the residual information of the image block from the space domain to the frequency domain, and the energy of the transform coefficient matrix f obtained after the transform is concentrated in a low-frequency area. After transforming the residual information of the image block, the coding end performs processing such as quantization and entropy coding on the resulting transform coefficient matrix, and sends, to the decoding end, the bit stream obtained from the entropy coding. To make the decoding end learn the type and the size of the transform matrix used at the coding end, the coding end generally sends, to the decoding end, indication information that indicates the transform matrix used by the current image block.
Subsequently, the decoding end determines, according to the indication information, the transform matrix used at the coding end; decodes, according to a characteristic of the transform matrix (such as its orthogonality), the bit stream sent by the coding end to obtain the transform coefficient matrix; and multiplies the transform coefficient matrix by the transform matrix and the transpose matrix of the transform matrix to restore residual information of the data block that is approximately consistent with that at the coding end. The preceding step may be described by using the following formula:
C = T × f × T′
where C represents residual information of a data block, T and T′ represent a transform matrix and a transpose matrix of the transform matrix, and f represents a transform coefficient matrix obtained by the decoding end.
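The reason the restored residual is only approximately consistent with the original can be seen in a round-trip sketch. It assumes an orthonormal 4×4 DCT-II matrix with basis vectors in its columns and a uniform scalar quantizer with a hypothetical step size; entropy coding is omitted, and all names are illustrative.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix T; column k holds the k-th basis vector."""
    return [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for k in range(n)]
            for i in range(n)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def encode(c, t, step):
    """Coding end: f = T' x C x T, then uniform quantization (assumed quantizer)."""
    f = matmul(matmul(transpose(t), c), t)
    return [[round(v / step) for v in row] for row in f]

def decode(levels, t, step):
    """Decoding end: de-quantization, then C = T x f x T'."""
    f = [[v * step for v in row] for row in levels]
    return matmul(matmul(t, f), transpose(t))

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

t = dct_matrix(4)
residual = [[5, -3, 2, 0], [4, -2, 1, 0], [3, -1, 0, 1], [2, 0, -1, 1]]
step = 2.0  # hypothetical quantization step

restored = decode(encode(residual, t, step), t, step)
max_error = max_abs_diff(restored, residual)

# Without quantization the inverse formula undoes the forward one exactly
# (up to floating-point rounding), because T is orthogonal.
exact = matmul(matmul(t, matmul(matmul(transpose(t), residual), t)), transpose(t))
```

The inverse transform itself is lossless; the difference between `restored` and `residual` comes entirely from rounding the coefficients to multiples of the quantization step.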
Because the residual of an image block may follow different distribution patterns, a good transform effect often cannot be achieved by using a transform matrix of one specific size. Therefore, in the prior art, transform matrices (also called transform blocks) of different sizes are used for the residual of the image block. For example, for a 2N×2N image block, a transform matrix whose size is 2N×2N may be used, or transform matrices whose sizes are N×N or 0.5N×0.5N may be used.
Currently, however, only transform matrices of a square size are used. For striped textures, which appear frequently, a transform matrix of a square size cannot effectively remove redundant information of an image block, which lowers image compression efficiency.