To reduce the bandwidth required for transferring video data as much as possible, various video compression methods may be adopted to compress the video data, including intra-frame compression and inter-frame compression. Currently, an inter-frame compression method based on motion estimation is most commonly adopted. Specifically, a procedure in which an encoding end of a picture adopts the inter-frame compression method to compress and encode the picture includes the following steps: the encoding end divides a picture block to be encoded into several sub-picture blocks of equal size; for each sub-picture block, the encoding end searches a reference picture for the picture block that best matches the current sub-picture block and uses it as a prediction block; the pixel values of the prediction block are subtracted from the corresponding pixel values of the sub-picture block to obtain a residual; the residual is transformed and quantized, and entropy encoding is performed on the resulting values; finally, the bit stream obtained through the entropy encoding is sent, together with motion vector information, to a decoding end, where the motion vector information indicates the position difference between the current sub-picture block and the prediction block. The decoding end of the picture first obtains the entropy-encoded bit stream, and then performs entropy decoding on the bit stream to obtain the corresponding residual and the corresponding motion vector information; then, according to the motion vector information, the corresponding matched picture block (that is, the foregoing prediction block) is obtained in the reference picture, and the value of each pixel point in the matched picture block is added to the value of the corresponding pixel point in the residual to obtain the value of each pixel point in the current sub-picture block.
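The encoding and decoding steps described above can be illustrated with a minimal sketch. It assumes pictures stored as lists of rows of integer pixel values, an exhaustive (full) search, and the sum of absolute differences as the matching criterion; the function names and the matching criterion are illustrative assumptions and are not taken from any standard. Transform, quantization, and entropy encoding are omitted.

```python
def sad(cur, ref, bx, by, rx, ry, size):
    """Sum of absolute differences between a block in cur and a block in ref."""
    return sum(
        abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
        for j in range(size)
        for i in range(size)
    )


def motion_search(cur, ref, bx, by, size, search_range):
    """Encoder side: full search for the size x size sub-picture block at (bx, by).

    Returns the motion vector (dx, dy) of the best match and the residual.
    """
    height, width = len(ref), len(ref[0])
    best_cost, best_mv = None, (0, 0)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    dx, dy = best_mv
    # Residual = current sub-picture block minus prediction block.
    residual = [
        [cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i] for i in range(size)]
        for j in range(size)
    ]
    return best_mv, residual


def reconstruct(ref, bx, by, mv, residual):
    """Decoder side: prediction block plus residual gives the sub-picture block."""
    dx, dy = mv
    size = len(residual)
    return [
        [ref[by + dy + j][bx + dx + i] + residual[j][i] for i in range(size)]
        for j in range(size)
    ]
```

Because the residual here is neither transformed nor quantized, `reconstruct` returns the original sub-picture block exactly; in a real codec the reconstruction is only approximate.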
In existing video encoding and decoding standards, such as those of the Moving Picture Experts Group (MPEG) and H.264/AVC (Advanced Video Coding), a picture block, also referred to as a macroblock or super-macroblock, is divided into several sub-picture blocks. The sizes of these sub-picture blocks are 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, and 4×4. The foregoing motion estimation and motion compensation are performed on sub-picture blocks of these sizes, and the encoding end of the picture is required to send a code word identifying the picture block division manner to the decoding end of the picture, so that the decoding end knows the division manner used by the encoding end and determines the corresponding prediction block according to the division manner and the motion vector information. In the existing video encoding and decoding standards, these sub-picture blocks are all rectangular blocks of N×M (N and M are both integers greater than 0), and N and M have a multiple relationship.
Common manners for dividing a picture block into sub-picture blocks include: a 2N×2N division manner, in which a picture block includes only one sub-picture block, that is, the picture block is not divided into smaller sub-picture blocks, as shown in FIG. 1a; a 2N×N division manner, in which a picture block is divided into an upper sub-picture block and a lower sub-picture block of equal size, as shown in FIG. 1b; an N×2N division manner, in which a picture block is divided into a left sub-picture block and a right sub-picture block of equal size, as shown in FIG. 1c; and an N×N division manner, in which a picture block is divided into four sub-picture blocks of equal size, as shown in FIG. 1d. N is any positive integer.
Moreover, a picture block may adopt an asymmetrical division manner, as shown in FIG. 2a to FIG. 2d. In each of these manners, the picture division line is shifted by n relative to a bisector of the picture block, where n=x*N and x is greater than or equal to 0 and less than or equal to 1; in the figures, n=0.5N. In the division manners shown in FIG. 2a and FIG. 2b, a picture block is divided into an upper rectangular sub-picture block and a lower rectangular sub-picture block of unequal sizes. In the 2N×nU manner shown in FIG. 2a, U indicates that the picture division line is shifted up by n relative to the horizontal bisector of the picture block; the side lengths of the upper sub-picture block are 2N and 0.5N, and the side lengths of the lower sub-picture block are 2N and 1.5N. In the 2N×nD manner shown in FIG. 2b, D indicates that the picture division line is shifted down by n relative to the horizontal bisector of the picture block; the side lengths of the upper sub-picture block are 2N and 1.5N, and the side lengths of the lower sub-picture block are 2N and 0.5N. In the division manners shown in FIG. 2c and FIG. 2d, a picture block is divided into a left rectangular sub-picture block and a right rectangular sub-picture block of unequal sizes. In the nL×2N manner shown in FIG. 2c, L indicates that the picture division line is shifted left by n relative to the vertical bisector of the picture block; the side lengths of the left sub-picture block are 0.5N and 2N, and the side lengths of the right sub-picture block are 1.5N and 2N. In the nR×2N manner shown in FIG. 2d, R indicates that the picture division line is shifted right by n relative to the vertical bisector of the picture block; the side lengths of the left sub-picture block are 1.5N and 2N, and the side lengths of the right sub-picture block are 0.5N and 2N.
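The symmetrical and asymmetrical division manners can be summarized by computing the sub-picture block dimensions directly from N and the shift n=x*N. The following sketch is illustrative only; the mode strings and the function name `partition_sizes` are assumptions introduced here. Sizes follow from the geometry: shifting the division line up or left shrinks the upper or left block to N−n, and shifting it down or right shrinks the lower or right block to N−n.

```python
def partition_sizes(mode, N, x=0.5):
    """Return (width, height) of each sub-picture block of a 2N x 2N picture block.

    Blocks are listed top-to-bottom or left-to-right. x is the shift fraction,
    so the division line is moved by n = x * N from the bisector.
    """
    side = 2 * N
    n = x * N
    table = {
        "2Nx2N": [(side, side)],
        "2NxN": [(side, N), (side, N)],
        "Nx2N": [(N, side), (N, side)],
        "NxN": [(N, N)] * 4,
        "2NxnU": [(side, N - n), (side, N + n)],  # line shifted up
        "2NxnD": [(side, N + n), (side, N - n)],  # line shifted down
        "nLx2N": [(N - n, side), (N + n, side)],  # line shifted left
        "nRx2N": [(N + n, side), (N - n, side)],  # line shifted right
    }
    if mode not in table:
        raise ValueError(f"unknown division manner: {mode}")
    return table[mode]
```

For example, with N=8 and the default x=0.5, `partition_sizes("2NxnD", 8)` yields an upper block of 16×12 and a lower block of 16×4, that is, 2N×1.5N and 2N×0.5N.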
In existing video encoding and decoding technologies, a transform matrix may be used to remove the correlation between residuals of a picture block, that is, to remove redundant information of the picture block, so as to improve the encoding efficiency. A two-dimensional transform is generally adopted to transform a data block in the picture block; that is, at an encoding end, residual information of the data block is separately multiplied by an N×M transform matrix and a transposed matrix thereof, to obtain a transform coefficient after multiplication. The foregoing step may be described by using the following formula:
f = T′ × C × T
where C indicates the residual information of a data block, T and T′ indicate a transform matrix and the transposed matrix of the transform matrix, and f indicates the transform coefficient matrix obtained after the residual information of the data block is transformed. The transform matrix may be a discrete cosine transform (DCT) matrix, an integer transform matrix, or a Karhunen-Loève transform (KLT) matrix. The KLT can better take into account the texture information of a picture block or of a residual of the picture block, and therefore the use of the KLT may achieve a good effect.
Performing the foregoing processing on the residual information of the picture block is equivalent to converting the residual information of the picture block from the spatial domain into the frequency domain, and the energy of the transform coefficient matrix f obtained after the processing is concentrated in the low-frequency region. After performing the foregoing transform on the residual information of the picture block, the encoding end performs processing such as quantization and entropy encoding on the transform coefficient matrix obtained after the transform, and then sends the bit stream obtained through the entropy encoding to the decoding end. To enable the decoding end to know the type and the size of the transform matrix adopted by the encoding end, the encoding end generally sends, to the decoding end, indication information indicating the transform matrix used by the current picture block.
Subsequently, the decoding end determines the transform matrix adopted by the encoding end according to the indication information, and decodes the bit stream sent by the encoding end according to features of the transform matrix (such as the orthogonality of the transform matrix) to obtain the transform coefficient matrix. The transform coefficient matrix is then multiplied by the transform matrix and the transposed matrix thereof, so that residual information of the data block approximately consistent with that of the encoding end is restored. The foregoing step may be described by using the following formula:
C = T × f × T′
where C indicates the residual information of a data block, T and T′ indicate a transform matrix and the transposed matrix of the transform matrix, and f indicates the transform coefficient matrix obtained by the decoding end.
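The forward formula f = T′ × C × T and the inverse formula C = T × f × T′ can be checked numerically. The sketch below assumes a square data block and an orthonormal DCT-II matrix T whose columns are the basis vectors, so that T × T′ is the identity; this choice of T, and the helper names, are illustrative assumptions. Quantization is omitted, so the inverse restores the residual up to floating-point error rather than only approximately.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix T; column k holds the k-th basis vector."""
    t = [[0.0] * n for _ in range(n)]
    for k in range(n):  # frequency index (column)
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        for i in range(n):  # sample index (row)
            t[i][k] = scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
    return t


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def forward(c, t):
    """Encoder side: f = T' x C x T."""
    return matmul(matmul(transpose(t), c), t)


def inverse(f, t):
    """Decoder side: C = T x f x T'."""
    return matmul(matmul(t, f), transpose(t))
```

For a constant 4×4 residual, `forward` places all of the energy in the single coefficient f[0][0], which illustrates how the transform removes correlation between residual values.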
Because the residuals of a picture block may follow different distributions, a transform matrix of one particular size usually cannot achieve a good transform effect. The prior art therefore attempts to use transform matrixes (also referred to as transform blocks) of different sizes for the residuals of the picture block; for a 2N×2N picture block, a transform matrix with the size of 2N×2N, N×N, or 0.5N×0.5N may be used. To effectively indicate how a picture block uses transform matrixes of different sizes, a tree-form identifying method may be used. As shown in FIG. 3, when the transform size used by a picture block is identified, a first layer in a code stream has an indicator bit identifying whether the picture block uses a transform matrix with the size of 2N×2N. If the picture block uses the transform matrix with the size of 2N×2N (as shown in FIG. 3a), the indicator bit is 0. If the 2N×2N transform is not used for the picture block, the indicator bit is 1, which indicates that the transform matrix with the size of 2N×2N needs to be further divided into four transform matrixes with the size of N×N, and in a second layer structure of the code stream, four bits separately identify whether each transform matrix with the size of N×N is further divided. If the picture block uses the transform structure shown in FIG. 3b, the four bits are all 0, which indicates that no transform matrix with the size of N×N is divided further. When the transform structure shown in FIG. 3c is selected, two of the four bits are 0 and two are 1: the two 0 bits indicate that the lower-left and upper-right transform matrixes with the size of N×N are not divided further, and the two 1 bits indicate that the upper-left and lower-right transform matrixes with the size of N×N need to be further divided to obtain transform matrixes with the size of 0.5N×0.5N. Then, in a third layer structure of the code stream, four bits indicate whether the upper-left transform matrixes with the size of 0.5N×0.5N need to be further divided, and four bits indicate whether the lower-right transform matrixes with the size of 0.5N×0.5N need to be further divided; if the picture block uses the transform structure shown in FIG. 3c, these four plus four bits are all 0, which indicates that no further division is performed. Through the foregoing layer-wise identification in the code stream, the transform size used by the picture block and the sub-picture block may be effectively and flexibly indicated.
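The layer-wise identification can be sketched as a breadth-first walk over a transform tree that emits one indicator bit per transform matrix. The tree representation (None for an undivided transform matrix, a list of four subtrees for a divided one) and the order of the four bits within a layer (upper-left, upper-right, lower-left, lower-right) are assumptions made for illustration; the source does not specify the bit order.

```python
def split_flags(tree):
    """Return the indicator bits of each layer of the code stream.

    tree is None for a transform matrix that is not divided, or a list of
    four subtrees in the assumed order upper-left, upper-right, lower-left,
    lower-right. A bit is 1 if the matrix is divided and 0 otherwise.
    """
    layers = []
    current = [tree]
    while current:
        bits, children = [], []
        for node in current:
            if node is None:
                bits.append(0)
            else:
                bits.append(1)
                children.extend(node)
        layers.append(bits)
        current = children
    return layers


# Transform structures corresponding to the three cases described above.
fig_3a = None                                # one 2N x 2N transform
fig_3b = [None, None, None, None]            # four N x N transforms
fig_3c = [[None] * 4, None, None, [None] * 4]  # upper-left and lower-right divided
```

Under these assumptions, `split_flags(fig_3c)` gives one bit (1) in the first layer, the four bits 1, 0, 0, 1 in the second layer, and eight 0 bits in the third layer, matching the four plus four bits described in the text.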
In the prior-art method of layer-wise identification, the size of a transform matrix is not associated with the size of a prediction block. As shown in FIG. 4a, when a 2N×2N picture block uses asymmetrical division (the division line is shown in the drawing as a bold solid line), if the current picture block uses a transform matrix with the size of 2N×2N, the transform matrix crosses the boundary of a prediction block; if the current picture block uses four transform matrixes with the size of N×N, the transform matrixes still cross the boundary of the prediction block; if transform matrixes with the size of N×N are adopted on the lower-left side and the upper-right side of the current picture block, and transform matrixes with the size of 0.5N×0.5N are adopted on the upper-left side and the lower-right side of the current picture block, the transform matrix with the size of N×N on the lower-left side of the current picture block still crosses the boundary of the prediction block.
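The three cases can be reproduced with a small geometric check. Because the figure is not reproduced here, the sketch assumes, purely for illustration, an nL×2N division with n=0.5N, so that the prediction block boundary is the vertical line x=0.5N; a transform block crosses the boundary when the line falls strictly inside it. The value N=8 and all names are hypothetical.

```python
def quad(x, y, size):
    """Split the square at (x, y) into four squares of half the side length.

    Order: upper-left, upper-right, lower-left, lower-right.
    """
    h = size // 2
    return [(x, y, h), (x + h, y, h), (x, y + h, h), (x + h, y + h, h)]


def crossing_blocks(blocks, boundary_x):
    """Return the transform blocks that straddle the vertical line x = boundary_x."""
    return [(x, y, s) for (x, y, s) in blocks if x < boundary_x < x + s]


N = 8                 # hypothetical value; the picture block is 16 x 16
boundary_x = N // 2   # assumed nLx2N division: line at x = 0.5N

one_2n = [(0, 0, 2 * N)]                  # a single 2N x 2N transform
four_n = quad(0, 0, 2 * N)                # four N x N transforms
ul, ur, ll, lr = four_n
mixed = quad(*ul) + [ur, ll] + quad(*lr)  # 0.5N x 0.5N at upper-left and lower-right
```

With these assumptions, the 2N×2N transform crosses the boundary, two of the four N×N transforms cross it, and in the mixed structure only the lower-left N×N transform still crosses it.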
The prior art has the following disadvantages:
In the prior art, the size of a transform matrix is not associated with the size of a prediction block, so that the transform matrix may cross the boundary of the prediction block. Because an abrupt change usually exists in the residual data at the boundary between two prediction blocks, if a transform matrix crosses that boundary, the effect of the transform is weakened, the correlation between residuals of a picture block cannot be effectively removed, and the redundant information of the picture block cannot be effectively removed, thereby reducing the encoding efficiency.