1. Field of the Invention
The present invention relates to spatial predictive encoding and decoding, and more particularly, to a method and apparatus that can more accurately perform spatial predictive encoding and/or decoding of video data.
2. Description of the Related Art
Since video data contains a large amount of data, compression encoding is essential for storage or transmission of video data. Encoding or decoding of video data is performed in data units such as macroblocks of 16×16 pixels or blocks of 8×8 pixels.
A new video compression encoding standard called Moving Picture Experts Group (MPEG)-4 Part 10 Advanced Video Coding (AVC) or International Telecommunication Union Telecommunication Standardization Sector (ITU-T) H.264 has been established. In particular, AVC was developed to respond to the transition from conventional circuit switching to packet switching services and to various communication infrastructures, as new communication channels such as mobile communication networks rapidly spread.
AVC improves encoding efficiency by 50% or more in comparison to the existing MPEG-4 Part 2 Visual codec standard and considers error robustness and network friendliness to cope with the rapidly changing wireless and Internet environments.
Intra spatial predictive encoding is a technique for compressing data of a current data unit using spatial correlation of video. More specifically, after pixel values of a current data unit are predicted using pixel values of at least one previous data unit that has correlation with the current data unit, a difference between actual pixel values of the current data unit and the predicted pixel values is entropy coded and then transmitted. Therefore, by performing intra spatial predictive encoding, the efficiency of data compression is improved in comparison to a case where the actual pixel values themselves are entropy coded and then transmitted.
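The difference described above can be illustrated with a minimal sketch. The function names `residual` and `reconstruct` and the 2×2 sample values are hypothetical and chosen only for illustration; transform, quantization, and entropy coding are omitted, so the reconstruction shown here is exact.

```python
def residual(actual, predicted):
    """Return the element-wise difference (actual - predicted) of two blocks."""
    return [[a - p for a, p in zip(a_row, p_row)]
            for a_row, p_row in zip(actual, predicted)]


def reconstruct(predicted, resid):
    """Recover a block by adding the residual back to the prediction."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(predicted, resid)]


# Hypothetical 2x2 example: the prediction is close to the actual values,
# so the residual consists of small numbers.
actual = [[100, 102], [101, 103]]
predicted = [[100, 100], [100, 100]]
res = residual(actual, predicted)
```

When the prediction is accurate, the residual values are small, which is why the residual can generally be represented with fewer bits than the actual pixel values.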
FIG. 1 shows previous data units for intra spatial predictive encoding of a current data unit according to the prior art. Referring to FIG. 1, for intra spatial predictive encoding of a current data unit E, previous data units A, B, C, and D are used. According to a conventional raster scan scheme, data units included in one image are scanned from left to right and from top to bottom. Thus, the previous data units A, B, C, and D are already scanned and encoded before the current data unit E. Since data units marked with X are not yet encoded, they cannot be used for predictive encoding of the current data unit E. Since data units marked with O have low correlation with the current data unit E, they are not used for predictive encoding of the current data unit E. The previous data units are discrete cosine transformed (DCT) and quantized, and are then inversely quantized, inversely discrete cosine transformed, and thereby reproduced.
According to the AVC standard, intra spatial predictive encoding is divided into intra 4×4 mode predictive encoding and intra 16×16 mode predictive encoding. In intra 4×4 mode predictive encoding, predictive encoding is performed in 4×4 subblock units. In intra 16×16 mode predictive encoding, predictive encoding is performed in 16×16 macroblock units.
Intra 16×16 mode predictive encoding will now be described in more detail. Referring back to FIG. 1, when the data unit E is a current data unit to be coded, the previous data units A and B are used as reference data units for intra 16×16 mode predictive encoding. Not all of the pixel values of the previous data units A and B are used for predictive encoding. Instead, as shown in FIG. 2, 16 pixel values V0 through V15 of pixels included in the right-most line of the previous data unit A and 16 pixel values H0 through H15 of pixels included in the bottom-most line of the previous data unit B are used for predictive encoding.
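The selection of reference pixel values can be sketched as follows. This is only an illustration: the function name `reference_pixels` is hypothetical, and each data unit is assumed to be stored as a list of 16 rows of 16 reproduced pixel values.

```python
def reference_pixels(unit_a, unit_b):
    """Return (v, h): v holds V0..V15 taken from the right-most column of
    unit A, and h holds H0..H15 taken from the bottom-most row of unit B."""
    v = [row[-1] for row in unit_a]   # one value per row of A
    h = list(unit_b[-1])              # the last row of B
    return v, h


# Hypothetical data unit whose pixel at row r, column c has value r * 16 + c.
unit = [[r * 16 + c for c in range(16)] for r in range(16)]
v, h = reference_pixels(unit, unit)
```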
FIGS. 3A through 3D show four 16×16 intra predictive encoding modes according to MPEG-4 AVC. FIG. 3A shows a mode #0 called a vertical mode. When each predicted pixel value of the current data unit E is denoted P[x,y](x=0 . . . 15, y=0 . . . 15), the predicted pixel value P[x,y] is determined using pixel values H0 through H15 of pixels included in the bottom-most line of the previous data unit B. In other words, as shown in FIG. 3A, P[x,y](x=0 . . . 15, y=0 . . . 15) is the same as the one of H0 through H15 that is located in the same vertical line. For example, predicted pixel values included in the first vertical line of the current data unit are all H0 and predicted pixel values included in the second vertical line of the current data unit are all H1.
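The vertical mode can be sketched as follows. The function name `predict_vertical` is hypothetical, and the predicted block is assumed to be stored as a list of rows, so that the value at column x and row y is `pred[y][x]`.

```python
def predict_vertical(h):
    """Mode #0 sketch: every row of the predicted block is a copy of
    H0..H15, so each column x is filled with the single value h[x]."""
    size = len(h)
    return [list(h) for _ in range(size)]


# Hypothetical reference values H0..H15 = 0..15.
pred_vertical = predict_vertical(list(range(16)))
```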
FIG. 3B shows a mode #1 called a horizontal mode. As shown in FIG. 3B, P[x,y](x=0 . . . 15, y=0 . . . 15) is the same as one of V0 through V15 of pixels included in the same horizontal line. For example, predicted pixel values included in the first horizontal line of the current data unit are all V0 and predicted pixel values included in the second horizontal line of the current data unit are all V1.
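The horizontal mode can be sketched in the same manner. The function name `predict_horizontal` is hypothetical, and the predicted block is assumed to be stored as a list of rows.

```python
def predict_horizontal(v):
    """Mode #1 sketch: row y of the predicted block is filled with the
    single value v[y], i.e. the reference pixel on the same horizontal line."""
    size = len(v)
    return [[v[y]] * size for y in range(size)]


# Hypothetical reference values V0..V15 = 100..115.
pred_horizontal = predict_horizontal(list(range(100, 116)))
```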
FIG. 3C shows a mode #2 called a DC mode. As shown in FIG. 3C, P[x,y](x=0 . . . 15, y=0 . . . 15) is a mean value of H0 through H15 and V0 through V15. If the previous data unit A exists, but the previous data unit B does not exist, P[x,y](x=0 . . . 15, y=0 . . . 15) is a mean value of V0 through V15. If the previous data unit A does not exist, but the previous data unit B exists, P[x,y](x=0 . . . 15, y=0 . . . 15) is a mean value of H0 through H15. If neither the previous data unit A nor the previous data unit B exists, P[x,y](x=0 . . . 15, y=0 . . . 15) is set to a predetermined value such as 128.
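The four cases of the DC mode can be sketched as follows. The function name `predict_dc` and its parameters are hypothetical, a missing previous data unit is represented by `None`, and rounding the mean to the nearest integer is an assumption, since the description above only refers to a mean value.

```python
def predict_dc(v=None, h=None, size=16, default=128):
    """Mode #2 sketch: fill the block with the mean of the available
    reference pixels, or with a predetermined value if none exist."""
    samples = []
    if v is not None:          # previous data unit A exists
        samples += list(v)
    if h is not None:          # previous data unit B exists
        samples += list(h)
    if samples:
        # Integer mean rounded to nearest (assumed rounding rule).
        dc = (sum(samples) + len(samples) // 2) // len(samples)
    else:
        dc = default           # neither A nor B exists
    return [[dc] * size for _ in range(size)]


# Hypothetical reference values: all V are 10, all H are 21.
dc_both = predict_dc(v=[10] * 16, h=[21] * 16)[0][0]
dc_left_only = predict_dc(v=[10] * 16)[0][0]
dc_top_only = predict_dc(h=[21] * 16)[0][0]
dc_none = predict_dc()[0][0]
```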
FIG. 3D shows a mode #3 called a plane mode. Referring to FIG. 3D, P[x,y](x=0 . . . 15, y=0 . . . 15) is determined such that predicted pixel values located on the left side of the diagonal line are determined using V0 through V15 and predicted pixel values located on the right side of the diagonal line are determined using H0 through H15. Mode #3 is useful for spatial prediction of video that changes gradually.
Video encoders that comply with the AVC standard predictively encode a current macroblock in a plurality of modes of the intra 4×4 prediction mode and the intra 16×16 prediction mode and then select the prediction mode having the smallest value of a cost function. The cost function indicates the accuracy of predictive encoding and the amount of generated bits. Examples of the cost function include a sum of absolute differences (SAD), a sum of absolute transformed differences (SATD), a sum of squared differences (SSD), a mean of absolute differences (MAD), and a Lagrange function.
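Mode decision by the smallest cost can be sketched as follows, using SAD as the cost function. The function names `sad` and `best_mode`, the 2×2 block size, and the sample values are hypothetical and serve only as an illustration; an actual encoder may use any of the cost functions listed above.

```python
def sad(actual, predicted):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - p)
               for a_row, p_row in zip(actual, predicted)
               for a, p in zip(a_row, p_row))


def best_mode(actual, candidates):
    """Given a dict {mode number: predicted block}, return the pair
    (mode, cost) with the smallest SAD. Ties go to the first mode listed."""
    costs = [(mode, sad(actual, pred)) for mode, pred in candidates.items()]
    return min(costs, key=lambda item: item[1])


# Hypothetical 2x2 block whose rows are constant, so a horizontal-style
# prediction (mode 1) matches it exactly.
actual_block = [[10, 10], [20, 20]]
candidate_blocks = {
    0: [[10, 10], [10, 10]],   # vertical-style prediction
    1: [[10, 10], [20, 20]],   # horizontal-style prediction
}
chosen_mode, chosen_cost = best_mode(actual_block, candidate_blocks)
```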
Once one of the four 16×16 spatial predictive encoding modes is decided as a final prediction mode, the decided prediction mode is expressed with 2 bits and then coded using fixed length coding (FLC) or variable length coding (VLC).
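Expressing one of four modes with 2 bits can be sketched as follows. The function names and the plain binary mapping are hypothetical and illustrate fixed length coding only; they are not the syntax of an AVC bitstream.

```python
def encode_mode_flc(mode):
    """Return the 2-bit fixed-length code of a mode number 0..3 as a string."""
    if not 0 <= mode <= 3:
        raise ValueError("mode must be in the range 0..3")
    return format(mode, "02b")


def decode_mode_flc(bits):
    """Recover the mode number from its 2-bit code."""
    if len(bits) != 2 or any(b not in "01" for b in bits):
        raise ValueError("expected a 2-character string of 0s and 1s")
    return int(bits, 2)


codes = [encode_mode_flc(m) for m in range(4)]
```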
However, the intra 16×16 prediction mode according to conventional AVC standards does not offer a sufficient number of modes for accurate spatial prediction.