1. Field of the Invention
The present invention relates to predictive encoding of video data, and more particularly, to a method of and an apparatus for predicting a direct current (DC) coefficient of a video data unit.
2. Description of the Related Art
Since video data contains a large amount of data, compression encoding is essential for storage or transmission of video data. Encoding or decoding of video data is performed in data units such as macroblocks of 16×16 pixels or blocks of 8×8 pixels. For encoding or decoding of video data in predetermined data units, data units included in one picture should be scanned.
FIG. 1 is a view for explaining a conventional raster scan. In a raster scan, the data units included in a picture are scanned from left to right and from top to bottom, beginning with the data unit at the top left corner of the picture.
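The raster scan order described above can be illustrated with a minimal Python sketch. The function name `raster_scan` and the (row, column) addressing of data units are assumptions made for illustration only and do not appear in the related art.

```python
def raster_scan(width, height):
    """Yield (row, col) positions of data units in raster order.

    Hypothetical helper: rows are visited top to bottom, and within
    each row the data units are visited left to right.
    """
    for row in range(height):
        for col in range(width):
            yield (row, col)


# A picture that is 4 data units wide and 3 data units high.
order = list(raster_scan(4, 3))
print(order[:5])
```

The scan starts at the top left data unit (0, 0) and moves to the next row only after the current row has been fully scanned.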
Intra spatial predictive encoding is one method of compressing video data. It compresses video data using similarities among data within one picture. More specifically, a pixel value of a current data unit to be encoded is predicted using at least one pixel value of at least one previous data unit that has a correlation with the current data unit, and the difference between the actual pixel value and the predicted pixel value of the current data unit is then entropy coded and transmitted. Because this difference is typically smaller than the actual pixel value, intra spatial predictive encoding can improve the efficiency of data compression compared with entropy coding and transmitting the actual pixel value itself.
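As a rough illustration of the predictive step only (entropy coding is omitted), the following Python sketch computes the difference that would be coded and shows how a decoder could recover the actual values from the same prediction. The function names and the sample values are hypothetical.

```python
def residual(actual, predicted):
    """Difference between actual and predicted values (what gets coded)."""
    return [a - p for a, p in zip(actual, predicted)]


def reconstruct(predicted, diff):
    """Decoder side: add the decoded difference back to the prediction."""
    return [p + d for p, d in zip(predicted, diff)]


# Hypothetical pixel values of a current data unit and their prediction
# from a correlated previous data unit.
actual = [120, 122, 125, 124]
predicted = [119, 121, 126, 124]

diff = residual(actual, predicted)
print(diff)
print(reconstruct(predicted, diff) == actual)
```

When the prediction is close to the actual values, the differences cluster around zero, which is what makes them cheaper to entropy code than the actual values.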
FIG. 2 shows an example of previous data units used for intra spatial predictive encoding of a current data unit according to the prior art. Referring to FIG. 2, previous data units A, B, C, and D are used for intra spatial predictive encoding of a current data unit E. According to a conventional raster scan, data units included in one picture are scanned from left to right and from top to bottom. Thus, the data units A, B, C, and D have already been scanned and encoded prior to the current data unit E. Since data units marked with X are not encoded prior to the current data unit E, they cannot be used for predictive encoding of the current data unit E. Since data units marked with O usually have low correlations with the current data unit E, they are not used for predictive encoding of the current data unit E. The previous data units are data units that have already been encoded, or that have been encoded and then restored through decoding.
Intra predictive encoding employed in MPEG-4 Part 2 uses a discrete cosine transform (DCT) coefficient. As shown in FIG. 2, if the data unit E is a current data unit to be intra spatial predictive encoded, then according to MPEG-4 Part 2, the previous data units A, B, and D are used for intra spatial predictive encoding of the current data unit E. The previous data units A, B, and D and the current data unit E are macroblocks of a 16×16 size.
In the case of MPEG-4 Part 2, a DC coefficient of the current data unit E is predicted in an area that is DCT transformed in 8×8 block units, using differences among DC coefficients of the previous data units A, B, and D.
FIG. 3 is a view for explaining intra predictive encoding in MPEG-4 Part 2. Referring to FIG. 3, the previous data units A, B, and D and the current data unit E, which are macroblocks of a 16×16 size, are predictive encoded in units of an 8×8 block. In other words, the previous data unit A is divided into A1 through A4, the previous data unit B is divided into B1 through B4, the previous data unit D is divided into D1 through D4, and the current data unit E is divided into E1 through E4.
Intra prediction of the current data unit E is performed as follows. First, to perform intra prediction of the current data unit E, it is determined whether the previous data units A, B, and D exist. If one of the previous data units A, B, and D is located in a different video object plane (VOP), a predicted value of a DC coefficient of the current data unit E is determined to be, for example, 128. A VOP is a kind of video unit for video coding and, according to MPEG-4 Part 2, one image frame is divided into a plurality of VOPs and is encoded or decoded in units of a VOP.
If the previous data units A, B, and D and the current data unit E are all located in the same VOP, it is determined whether blocks D4, B3, and A2 exist for processing a block E1 among four 8×8 blocks included in the current data unit E. In cases where any one of the blocks D4, B3, and A2 does not exist or is not intra coded, a predicted value of the DC coefficient of the block E1 is determined to be 128.
In all cases other than the above two, an intra predicted value of the DC coefficient of the block E1 is determined as follows. When the difference between the DC coefficient of the block A2 and the DC coefficient of the block D4 is less than the difference between the DC coefficient of the block D4 and the DC coefficient of the block B3, there is a high probability that the DC coefficient of the block E1 is similar to the DC coefficient of the block B3. Thus, the predicted value of the DC coefficient of the block E1 is determined to be the DC coefficient of the block B3. In the contrary case, the predicted value of the DC coefficient of the block E1 is determined to be the DC coefficient of the block A2.
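The decision rule for the block E1 can be sketched in Python as follows. This is an illustrative sketch rather than a normative implementation: the function name `predict_dc`, the use of `None` to represent a block that does not exist or is not intra coded, and the interpretation of "difference" as an absolute difference are all assumptions.

```python
DEFAULT_DC = 128  # fallback predicted value named in the description


def predict_dc(a2, d4, b3):
    """Predict the DC coefficient of block E1 from blocks A2, D4, and B3.

    Each argument is the DC coefficient of the named block, or None
    (an assumed convention) when that block does not exist, lies in a
    different VOP, or is not intra coded.
    """
    if a2 is None or d4 is None or b3 is None:
        return DEFAULT_DC
    # "Difference" is taken as an absolute difference (assumption).
    if abs(a2 - d4) < abs(d4 - b3):
        return b3
    return a2


print(predict_dc(100, 102, 150))  # A2 and D4 are close, so B3 is chosen
print(predict_dc(100, 150, 148))  # D4 and B3 are close, so A2 is chosen
print(predict_dc(None, 150, 148))  # a block is unavailable
```

Because the rule depends only on coefficients that the decoder has already reconstructed, the same function can be evaluated on both sides without any side information.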
Since the prediction method described above can be performed in the same manner in an encoder and a decoder, it has the advantage of not requiring the encoder to transmit a parameter for a predicted value of a DC coefficient. In other words, also in the decoder, a predicted value of a DC coefficient can be obtained in the same manner as in the encoder.
The above-described procedure is repeated for prediction of a DC coefficient of a block E2 using the blocks E1, B3, and B4, for prediction of a DC coefficient of a block E3 using the blocks A2, A4, and E1, and for prediction of a DC coefficient of a block E4 using the blocks E1, E2, and E3.
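The following Python sketch applies the same rule to all four 8×8 blocks of the current data unit E. The table `NEIGHBOURS`, the dictionary-based representation of DC coefficients, and the ordering of the three reference blocks within each entry are assumptions inferred from the block layout described for FIG. 3; only the sets of reference blocks themselves are taken from the description.

```python
DEFAULT_DC = 128

# For each block of E: (first, middle, candidate) reference blocks, ordered
# like (A2, D4, B3) for E1. The ordering for E2 to E4 is an assumption that
# block 1 is top-left, 2 top-right, 3 bottom-left, and 4 bottom-right.
NEIGHBOURS = {
    "E1": ("A2", "D4", "B3"),
    "E2": ("E1", "B3", "B4"),
    "E3": ("A4", "A2", "E1"),
    "E4": ("E3", "E1", "E2"),
}


def predict_dc(first, middle, candidate):
    """Same rule as for E1; None marks an unavailable block (assumption)."""
    if first is None or middle is None or candidate is None:
        return DEFAULT_DC
    if abs(first - middle) < abs(middle - candidate):
        return candidate
    return first


def predict_macroblock(dc):
    """Return predicted DC values for E1..E4 given known DC coefficients."""
    return {
        block: predict_dc(*(dc.get(name) for name in refs))
        for block, refs in NEIGHBOURS.items()
    }


# Hypothetical reconstructed DC coefficients.
dc = {"A2": 100, "A4": 100, "D4": 100, "B3": 120, "B4": 120,
      "E1": 118, "E2": 119, "E3": 101}
predictions = predict_macroblock(dc)
print(predictions)
```

Note that the predictions for E2, E3, and E4 use DC coefficients of blocks inside the current data unit E, so the blocks are assumed to be processed in the order E1, E2, E3, E4.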
A new video data scan scheme that is different from the above-described raster scan has been developed. Korean Patent Publication No. 2002-5365 titled “Apparatus and Method for Water Ring Scan and Apparatus and Method for Video Coding/Decoding Using the Same” discloses a scan method called a water ring scan method.
FIG. 4 shows a water ring scan method. A picture shown in FIG. 4 includes a plurality of data units. The water ring scan method starts from a predetermined location of a picture, e.g., a data unit in the center of the picture, and proceeds outward to the data units surrounding the already scanned data units, rotating clockwise or counterclockwise. When data units are scanned according to the water ring scan method, the scan takes the form of a plurality of water rings surrounding the data unit that serves as the water ring origin point.
Referring to FIG. 4, the data unit as the water ring origin point is indicated by 0 and a plurality of water rings surrounds the data unit indicated by 0. Data units forming a first water ring 11 are indicated by 1, data units forming a second water ring 13 are indicated by 2, and data units forming a third water ring 15, a fourth water ring 17, and a fifth water ring 19 are indicated by numbers, respectively, in the same manner. Each water ring takes the form of a square ring.
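Since each water ring takes the form of a square ring around the origin point, the ring to which a data unit belongs can be computed as the larger of its row distance and its column distance from the origin. The Python sketch below illustrates this under that assumption; the function names and the (row, column) addressing are hypothetical, and the order in which data units are visited within a ring is not modeled.

```python
def ring_index(row, col, origin):
    """Index of the square water ring containing the unit at (row, col)."""
    return max(abs(row - origin[0]), abs(col - origin[1]))


def ring_map(width, height, origin):
    """Label every data unit of a picture with its water ring index."""
    return [[ring_index(r, c, origin) for c in range(width)]
            for r in range(height)]


# A 5x5 picture with the water ring origin point at the centre.
rings = ring_map(5, 5, origin=(2, 2))
for line in rings:
    print(" ".join(str(v) for v in line))
```

In this 5×5 example the origin is labeled 0, the eight data units around it are labeled 1, and the sixteen outermost data units are labeled 2.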
MPEG-4 Part 10 AVC (advanced video coding), also known as ITU-T H.264, is a recently established video compression coding standard. It was developed to handle the transition from conventional circuit switching to packet switching services and to accommodate various communication infrastructures as new communication channels such as mobile communication networks spread rapidly. AVC/H.264 improves encoding efficiency by 50% or more in comparison to the existing MPEG-4 Part 2 visual codec and takes error robustness and network friendliness into account to cope with rapidly changing wireless and Internet environments.
In particular, to respond actively to transmission errors in a wireless transmission environment or a packet-based transmission environment such as the Internet, MPEG-4 Part 10 AVC newly employs a video data scan scheme called flexible macroblock ordering (FMO). FMO has seven modes, three of which are called box-out scanning. Box-out scanning is an example of the water ring scan method described above. In box-out scanning, a picture is divided into a region of the user's interest and a background region, and the two regions are encoded and decoded in different manners.
FIG. 5 shows a picture that is divided into a region of interest (ROI) 21 and a left-over region 23. In one picture, a region of interest is generally a region around the center of the picture. Thus, a region within a predetermined range from the center of the picture is determined to be the ROI 21 and the remaining region is determined to be the left-over region 23. To encode and decode the ROI 21 independently of the left-over region 23, the left-over region 23 cannot be used for spatial predictive coding of the ROI 21.
FIG. 6A shows box-out scanning in which data units are scanned clockwise, and FIG. 6B shows box-out scanning in which data units are scanned counterclockwise.
Box-out scanning is one methodology for encoding an ROI. It improves compression efficiency by taking human visual characteristics into account and enables improved protection from errors. More specifically, during encoding, box-out scanning can offer better protection from errors to an ROI than to a left-over region. Since encoding of an ROI is independent of encoding of a left-over region, data of the left-over region can be encoded at a reduced bitrate and with reduced computational complexity. In particular, when a gradual random access is performed, a decoder can reconstruct only the ROI and an encoder can transmit only the ROI to the decoder.
Hereinafter, a method of scanning data units from the center of a picture towards the remaining region of the picture, such as the above-described water ring scanning or box-out scanning, is called ROI-oriented scanning. Conventional intra spatial predictive encoding cannot be applied to video data that is scanned according to ROI-oriented scanning and then encoded or decoded.
FIG. 7 shows reference data units required for prediction of a DC coefficient of a current data unit according to a conventional prediction method when data units are scanned according to clockwise box-out scanning as shown in FIG. 6A. When a data unit C1 is the current data unit to be intra-predicted, previous data units C2, C10, and C11 are required for intra-prediction of the current data unit C1 according to a conventional prediction method.
However, when data units are scanned according to clockwise box-out scanning, since the data units C2, C10, and C11 are to be scanned and encoded after the current data unit C1, they cannot be used for intra-prediction of the current data unit C1.
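The problem can be reproduced with a small Python sketch. The spiral generated by `spiral_scan` is a generic clockwise outward spiral used only as a stand-in for ROI-oriented scanning; it is an assumption and is not claimed to match the exact order of FIG. 6A or the numbering of the data units in FIG. 7. The helper `late_references` lists, for each data unit, the left, upper-left, and upper data units that would be scanned only after it.

```python
def spiral_scan(size):
    """Clockwise outward spiral over a size x size grid (size odd)."""
    r = c = size // 2
    order = [(r, c)]
    # Right, down, left, up: clockwise when rows grow downward.
    directions = [(0, 1), (1, 0), (0, -1), (-1, 0)]
    step, d = 1, 0
    while len(order) < size * size:
        for _ in range(2):  # two legs per step length
            dr, dc = directions[d % 4]
            for _ in range(step):
                r, c = r + dr, c + dc
                if 0 <= r < size and 0 <= c < size:
                    order.append((r, c))
            d += 1
        step += 1
    return order


def late_references(order):
    """Map each unit to its raster-style references scanned after it."""
    rank = {pos: i for i, pos in enumerate(order)}
    late = {}
    for (r, c), i in rank.items():
        refs = [(r, c - 1), (r - 1, c - 1), (r - 1, c)]
        missing = [p for p in refs if p in rank and rank[p] > i]
        if missing:
            late[(r, c)] = missing
    return late


spiral_late = late_references(spiral_scan(3))
raster_late = late_references([(r, c) for r in range(3) for c in range(3)])
print(spiral_late[(1, 1)])
print(raster_late)
```

For the data unit scanned first in the spiral, all three reference data units are scanned later, whereas under raster order no data unit has a late reference.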
In other words, when video data is scanned according to ROI-oriented scanning and then encoded, a DC coefficient of a current data unit cannot be predicted by the conventional prediction method, which is based on raster scanning.