High-definition video has become a major focus of the market. Compared with standard definition, the advantages of high-definition images are obvious. However, the processing resources that high-definition images consume are far greater than those required for standard definition.
In addition, the image encoding capability of even the most powerful single-core processor currently available is very limited. When a single-core processor lacks the processing capability to encode high-definition images on its own, many manufacturers use multi-core image encoding devices instead. A multi-core image encoding device includes a plurality of digital signal processing (DSP) chips; one image frame is divided into a plurality of portions, and each DSP encodes one of the portions.
For the H.264 protocol, the process of encoding an image frame mainly comprises prediction (including intra-frame prediction and inter-frame prediction), discrete cosine transform (DCT), quantization, entropy coding, inverse quantization, inverse DCT and loop filtering. The loop-filtered image is used as a prediction image for subsequent images. All of these steps, except the loop filtering, are performed inside slices, which are independent of each other; one slice includes one or more macro blocks. For example, the reference values required for intra-frame prediction never cross slice boundaries, so each slice can perform intra-frame prediction independently. When the image is divided into two halves, two DSP chips can process them independently because all the required data lies inside the current slice; that is, the second DSP does not need to wait until the first has finished its processing before beginning its own.
However, the loop filtering process defined in H.264 is an exception: filtering is performed on the whole frame, strictly in the raster scan order of the macro blocks. A necessary condition for filtering a given macro block is that filtering of the left macro block and the upper macro block adjacent to it has been completed. As shown in FIG. 1, loop filtering of the macro blocks proceeds from top to bottom and from left to right.
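The ordering constraint described above can be illustrated with a minimal Python sketch. The function names (`raster_order`, `filter_dependencies`, `check_order`) and the frame dimensions are hypothetical and chosen only for illustration; the sketch models nothing beyond the left and upper neighbour dependency stated in the text.

```python
def raster_order(mb_cols, mb_rows):
    """Yield (row, col) macro block coordinates in raster scan order:
    left to right within a row, rows taken from top to bottom."""
    for row in range(mb_rows):
        for col in range(mb_cols):
            yield (row, col)


def filter_dependencies(row, col):
    """Return the macro blocks that must already be filtered before
    (row, col) can be filtered: its left and upper neighbours, if any."""
    deps = []
    if col > 0:
        deps.append((row, col - 1))  # left neighbour
    if row > 0:
        deps.append((row - 1, col))  # upper neighbour
    return deps


def check_order(mb_cols, mb_rows):
    """Walk the frame in raster order and confirm that every dependency
    has been satisfied by the time each macro block is reached.
    Returns the number of macro blocks visited."""
    done = set()
    for row, col in raster_order(mb_cols, mb_rows):
        assert all(dep in done for dep in filter_dependencies(row, col))
        done.add((row, col))
    return len(done)
```

Walking the frame in raster order always satisfies both dependencies, which is why a single processor can filter the frame in one pass.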
Loop filtering defined in H.264 is performed with a macro block as the unit. Inside a macro block, the vertical boundaries are filtered first, and then the horizontal boundaries. The vertical boundaries are filtered from left to right, and the horizontal boundaries from top to bottom. For example, as shown in FIG. 2, one macro block includes four vertical boundaries, a, b, c and d, and four horizontal boundaries, e, f, g and h. When loop filtering is performed on the macro block, the vertical boundaries are filtered in the order a, b, c, d, and the horizontal boundaries are then filtered in the order e, f, g, h.
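The boundary order inside one macro block can be written out as a short Python sketch. The function name `edge_filter_order` is hypothetical, and the sample offsets are an assumption: the sketch supposes a 16x16 luma macro block whose boundaries lie on a 4-sample grid, which FIG. 2 may or may not depict exactly.

```python
def edge_filter_order(mb_size=16, step=4):
    """Return the boundaries of one macro block in filtering order as
    (direction, label, offset) tuples.

    Assumption: a 16x16 luma macro block with boundaries every 4 samples,
    so the offsets are 0, 4, 8 and 12. The labels a-d and e-h follow the
    naming used in the text for FIG. 2.
    """
    offsets = range(0, mb_size, step)
    # Vertical boundaries first, left to right.
    order = [("vertical", label, x) for label, x in zip("abcd", offsets)]
    # Then horizontal boundaries, top to bottom.
    order += [("horizontal", label, y) for label, y in zip("efgh", offsets)]
    return order
```

Calling `edge_filter_order()` lists the vertical boundaries a to d before the horizontal boundaries e to h, matching the order given in the text.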
When a dual-core processor is used for encoding, an image frame to be encoded is generally divided horizontally into an upper half and a lower half in order to balance the load of the two DSP chips. Because motion in video is typically stronger in the horizontal direction than in the vertical direction, dividing the image into an upper half and a lower half suits this characteristic better and gives the divided image a good visual effect. After the image frame is divided, each DSP chip processes one half of the image; for example, DSP0 processes the upper half and DSP1 processes the lower half. According to the H.264 protocol, however, DSP1 cannot perform loop filtering independently: it must wait until DSP0 has completed the filtering of the upper half before it can begin filtering the lower half for which it is responsible. Filtering can therefore only be performed serially across the DSPs, which is a great waste of valuable DSP resources.
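The cost of this serial dependency can be estimated with a simplified Python model. Everything in it is an assumption made for illustration: the function name `serial_filter_schedule`, the uniform filtering time per macro block, and the rule that DSP1 starts only once DSP0 has finished the whole upper half, as described above.

```python
def serial_filter_schedule(mb_cols, mb_rows, time_per_mb=1):
    """Model loop filtering of one frame split into an upper half (DSP0)
    and a lower half (DSP1), where DSP1 must wait for DSP0 to finish.

    Assumption: every macro block takes the same time to filter.
    Returns start/end times per DSP and the overall DSP utilization.
    """
    split_row = mb_rows // 2
    upper_mbs = mb_cols * split_row
    lower_mbs = mb_cols * (mb_rows - split_row)

    dsp0_start = 0
    dsp0_end = dsp0_start + upper_mbs * time_per_mb
    dsp1_start = dsp0_end  # DSP1 is idle until the upper half is filtered
    dsp1_end = dsp1_start + lower_mbs * time_per_mb

    busy_time = (upper_mbs + lower_mbs) * time_per_mb
    # Two DSPs are occupied for the whole duration of the filtering stage.
    utilization = busy_time / (2 * dsp1_end)
    return {
        "dsp0": (dsp0_start, dsp0_end),
        "dsp1": (dsp1_start, dsp1_end),
        "total_time": dsp1_end,
        "utilization": utilization,
    }


# Example: a frame of 120 x 68 macro blocks (1920 x 1088 samples).
schedule = serial_filter_schedule(120, 68)
```

Under these assumptions, each DSP sits idle for half of the filtering stage, so the second chip does not shorten the filtering time at all.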