Video communication services, e.g., television telephone services and video conferencing services, have been realized over high-speed digital networks such as ISDN (Integrated Services Digital Network).
Recently, with the spread of radio transmission networks represented by PHS (Personal Handy-phone System), the progress of data modulation/demodulation techniques for PSTN (Public Switched Telephone Network), and the advance of image compression techniques, demand has been increasing for video communication services over lower bit-rate networks.
As is well known, H.261 and H.263 are internationally established standard coding methods for compressing video information. These standardized video-coding methods adopt a hybrid video coding method that performs interframe-prediction coding in combination with intraframe-prediction coding.
Interframe-prediction coding generates a predictive video-frame from a reference video-frame and encodes the difference between the predictive frame and the current video-frame, thereby reducing the amount of code to be transmitted. This enables effective use of transmission lines.
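The idea of encoding only the difference can be sketched as follows. This is a minimal illustration, not the method of any particular codec: the function names, the 4 by 4 frame and the trivial zero-motion predictor are assumptions made for the example.

```python
import numpy as np

def prediction_residual(current, predicted):
    """Difference between the current frame and the predictive frame."""
    # Widen to a signed type so that negative differences are representable.
    return current.astype(np.int16) - predicted.astype(np.int16)

def reconstruct(predicted, residual):
    """Decoder side: predictive frame plus the transmitted difference."""
    return np.clip(predicted.astype(np.int16) + residual, 0, 255).astype(np.uint8)

# Hypothetical frames: the current frame differs from the reference in one pixel.
reference = np.full((4, 4), 100, dtype=np.uint8)
current = reference.copy()
current[1, 2] = 110

predicted = reference  # assumed trivial predictor (zero motion)
residual = prediction_residual(current, predicted)
```

When the prediction is good, most residual samples are zero, which is what makes the difference cheaper to encode than the frame itself.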
Interframe-prediction coding may use any one of several methods, such as block matching, affine transformation and warp prediction. A conventional interframe-prediction method using affine transformation is explained as follows:
The affine transformation itself is described first. An affine transformation maps one video-frame onto another video-frame by using 6 parameters that represent the map. The affine transformation is usually conducted on a triangular area to simplify the calculation of the affine parameters.
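A six-parameter map of this kind can be written as x' = a*x + b*y + c and y' = d*x + e*y + f. The sketch below applies such a map to a single pixel position; the parameter names a to f and the function name are assumptions chosen for illustration.

```python
def affine_map(params, x, y):
    """Map position (x, y) with the six affine parameters (a, b, c, d, e, f)."""
    a, b, c, d, e, f = params
    return (a * x + b * y + c, d * x + e * y + f)

# Two illustrative parameter sets.
identity = (1, 0, 0, 0, 1, 0)   # leaves every position unchanged
shift = (1, 0, 3, 0, 1, -2)     # pure translation by (3, -2)

moved = affine_map(shift, 5, 7)
```

Three point correspondences give six linear equations, which is enough to fix the six unknowns; this is one way to see why a triangle is a convenient unit area.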
A method of interframe prediction using affine transformation is explained for the case of forward prediction. Motion vectors of control grid points A, B, C and D on the current video-frame are detected with respect to corresponding control grid points A', B', C' and D' on a reference video-frame.
Three of the four control grid points are first selected, and the area is divided in order to determine affine parameters. For example, an area on the current video-frame is divided into two triangles ABC and BCD, and the corresponding area on the reference video-frame is divided into two triangles A'B'C' and B'C'D'.
On the triangles into which the area is divided, affine parameters are determined from vertex positions of each triangle (vertex positions of one triangle and motion vectors of the other triangle may be used).
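Determining the affine parameters from a triangle amounts to solving two small linear systems. The following is a minimal sketch, assuming the parameterisation x' = a*x + b*y + c, y' = d*x + e*y + f; the function name, the use of NumPy and the sample coordinates are illustrative assumptions, not part of the described method.

```python
import numpy as np

def affine_from_triangle(src, dst):
    """Return (a, b, c, d, e, f) mapping the three src vertices onto dst."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Each row is [x, y, 1]; the same matrix serves both coordinate equations.
    m = np.column_stack([src, np.ones(3)])
    abc = np.linalg.solve(m, dst[:, 0])   # x' = a*x + b*y + c
    d_e_f = np.linalg.solve(m, dst[:, 1]) # y' = d*x + e*y + f
    return np.concatenate([abc, d_e_f])

# Hypothetical triangle on the current frame and its motion vectors.
tri_cur = [(0, 0), (16, 0), (0, 16)]
vectors = [(1, 2), (1, 2), (1, 2)]
# Vertex positions of one triangle plus motion vectors give the other triangle.
tri_ref = [(x + dx, y + dy) for (x, y), (dx, dy) in zip(tri_cur, vectors)]

params = affine_from_triangle(tri_cur, tri_ref)
```

With identical motion vectors at all three vertices, the solved parameters reduce to a pure translation. The system has no unique solution if the three vertices are collinear, which does not occur for a proper triangle.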
A predictive video-frame is generated by mapping thereto all pixels of all triangular areas according to the obtained affine parameters.
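The mapping of pixels for one triangular area might look like the sketch below. It assumes nearest-neighbour sampling, clamping at the frame border and a sign-based point-in-triangle test; these choices, and all names, are assumptions made to keep the example short, and a practical coder would typically interpolate between pixels.

```python
import numpy as np

def inside_triangle(px, py, tri):
    """True if (px, py) lies inside or on the edge of the triangle."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    d1 = (px - x2) * (y1 - y2) - (x1 - x2) * (py - y2)
    d2 = (px - x3) * (y2 - y3) - (x2 - x3) * (py - y3)
    d3 = (px - x1) * (y3 - y1) - (x3 - x1) * (py - y1)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)

def predict_triangle(reference, predicted, tri, params):
    """Fill the pixels of one triangle of `predicted` from `reference`."""
    a, b, c, d, e, f = params
    h, w = reference.shape
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    for y in range(max(0, min(ys)), min(h - 1, max(ys)) + 1):
        for x in range(max(0, min(xs)), min(w - 1, max(xs)) + 1):
            if inside_triangle(x, y, tri):
                # Position on the reference frame, rounded and clamped.
                rx = min(max(int(round(a * x + b * y + c)), 0), w - 1)
                ry = min(max(int(round(d * x + e * y + f)), 0), h - 1)
                predicted[y, x] = reference[ry, rx]

# Hypothetical 8x8 frame split into triangles ABC and BCD.
reference = np.array([[10 * y + x for x in range(8)] for y in range(8)], dtype=np.uint8)
predicted = np.zeros_like(reference)
A, B, C, D = (0, 0), (7, 0), (0, 7), (7, 7)
translation = (1, 0, 1, 0, 1, 0)  # assumed parameters: shift by one pixel in x
for tri in ((A, B, C), (B, C, D)):
    predict_triangle(reference, predicted, tri, translation)
```

Here both triangles share the same parameters only to keep the example easy to check; in general each triangle has its own set.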
A method of dividing an image into adaptive areas is as follows:
First, basic motion-vectors are searched at the control grid points of a square area consisting of 16 by 16 pixels. Additional motion-vectors are also searched at the control grid points of a square area consisting of 8 by 8 pixels.
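A motion-vector search at a control grid point could be sketched as a small block-matching search. The text does not specify the matching criterion, so the sum of absolute differences (SAD), the patch size and the search range used here are assumptions, as are all function names.

```python
import numpy as np

def grid_points(size, step):
    """Control grid points of a size-by-size area sampled every `step` pixels."""
    return [(x, y) for y in range(0, size + 1, step) for x in range(0, size + 1, step)]

def search_motion_vector(current, reference, gx, gy, half=2, search=3):
    """Return the (dx, dy) that minimises the SAD around grid point (gx, gy)."""
    h, w = current.shape
    y0, y1 = max(gy - half, 0), min(gy + half + 1, h)
    x0, x1 = max(gx - half, 0), min(gx + half + 1, w)
    patch = current[y0:y1, x0:x1].astype(np.int32)
    best, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that would fall outside the reference frame.
            if y0 + dy < 0 or x0 + dx < 0 or y1 + dy > h or x1 + dx > w:
                continue
            cand = reference[y0 + dy:y1 + dy, x0 + dx:x1 + dx].astype(np.int32)
            sad = int(np.abs(patch - cand).sum())
            if best_sad is None or sad < best_sad:
                best, best_sad = (dx, dy), sad
    return best

basic_points = grid_points(16, 16)  # corners of the 16 by 16 area
all_points = grid_points(16, 8)     # grid refined to 8 by 8 areas

# Hypothetical test frames: current[y, x] equals reference[y + 2, x + 1].
rng = np.random.default_rng(0)
reference = rng.integers(0, 256, size=(32, 32)).astype(np.uint8)
current = np.roll(reference, shift=(-2, -1), axis=(0, 1))
vector = search_motion_vector(current, reference, 16, 16)
```

Under this reading, the 16 by 16 area has four basic grid points and the refinement to 8 by 8 areas adds five more.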
Several area-dividing patterns are available for affine transformation using the basic motion-vectors or the additional motion-vectors. In one example, simplified area-dividing patterns may be used. In another example, the area-dividing patterns include a pattern for translation: type 1 is a pattern for translation, types 2 and 3 are two-divisional (bisectional) patterns (requiring two affine transformations), types 4 to 7 are five-divisional (pentasectional) patterns (requiring five affine transformations) and types 8 and 9 are eight-divisional (octasectional) patterns (requiring eight affine transformations). A suitable one of these types may be selected for use.
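The nine pattern types and the number of affine transformations each one requires can be held in a small lookup table. The sketch below only restates the grouping given in the text; the data layout and the function name are assumptions.

```python
# (group name, pattern types in the group, affine transformations per type)
PATTERN_GROUPS = [
    ("translation", (1,), 0),
    ("bisectional", (2, 3), 2),
    ("pentasectional", (4, 5, 6, 7), 5),
    ("octasectional", (8, 9), 8),
]

def describe(pattern_type):
    """Return (group name, number of affine transformations) for a type."""
    for name, types, n_affine in PATTERN_GROUPS:
        if pattern_type in types:
            return name, n_affine
    raise ValueError(f"unknown area-dividing pattern type: {pattern_type}")

info = describe(6)
```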
An example (A) of a prior art motion-compensated interframe-prediction flow is as follows. At the 1st step, motion-vectors are obtained from an input video-frame and a reference video-frame. At the 2nd step, affine transformation is conducted for the respective types of area-dividing patterns for each block. At the 3rd step, prediction-error estimation values for the translated area (type 1) and the affine-transformed areas (types 2 to 9) are calculated, and the area-dividing pattern type having the minimal estimation value of prediction error is determined. At the 4th step, side information (e.g., the determined type of area-dividing pattern and the motion vectors) is encoded.
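The selection at the 3rd step of flow (A) is an exhaustive minimum search over all nine types. A minimal sketch follows; the error values are invented for illustration, and `error_of_type` stands in for the affine transformation and prediction-error estimation that the text describes but does not specify.

```python
def select_pattern_exhaustive(error_of_type, types=range(1, 10)):
    """Evaluate every pattern type and return (best type, its error)."""
    errors = {t: error_of_type(t) for t in types}
    best = min(errors, key=errors.get)
    return best, errors[best]

# Hypothetical prediction-error estimation values for one block.
sample_errors = {1: 900, 2: 700, 3: 650, 4: 300, 5: 420,
                 6: 380, 7: 500, 8: 200, 9: 250}

best_type, best_error = select_pattern_exhaustive(sample_errors.get)
side_information = {"pattern_type": best_type}  # motion vectors omitted here
```

Every type is evaluated regardless of how good the earlier candidates were, which is the source of the calculation load attributed to this flow.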
Another example (B) of a prior art motion-compensated interframe-prediction flow is as follows. At the 1st step, motion vectors are searched. At the 2nd step, affine transformation is performed for the respective two-division types 2 and 3 (two times for each type). At the 3rd step, prediction-error estimation values for the translated area (type 1) and the affine-transformed areas (types 2 and 3) are calculated, and the one of the three types of area-dividing patterns that has the minimal estimation value of prediction error is determined. At the 4th step, the minimal estimation value is compared with a preset threshold value T, and the type is accepted for use if its estimation value is smaller than the threshold value T. At the 5th step, side information is encoded. The process returns to the 1st step if the estimation value is larger than the threshold T.
At the 1st step, motion-vectors are searched. At the 2nd step, affine transformation is performed for the respective five-division types 4 to 7 (five times for each type). At the 3rd step, prediction-error estimation values of the affine-transformed areas (types 4 to 7) are calculated, and the one of these types of area-dividing patterns that has the minimal estimation value of prediction error is determined. At the 4th step, the minimal estimation value is compared with a preset threshold value T, and the type is accepted for use if its estimation value is smaller than the threshold T. At the 5th step, side information is encoded. The process returns to the 1st step if the estimation value is larger than the threshold T.
At the 1st step, motion-vectors are used. At the 2nd step, affine transformation is performed for the respective eight-division types 8 and 9 (eight times for each type). At the 3rd step, prediction-error estimation values for the affine-transformed areas (types 8 and 9) are calculated, and whichever of the two types has the smaller estimation value of prediction error is determined. The type thus accepted for use is encoded as side information at the 5th step.
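Flow (B) can be read as a staged search that stops as soon as a group of types yields an error below the threshold T. The sketch below follows that reading, which is an interpretation of "returns to 1st step" as moving on to the next group of types; the error values, the threshold and all names are assumptions for illustration.

```python
# Groups of pattern types tried in order: types 1 to 3, then 4 to 7, then 8 and 9.
STAGES = [(1, 2, 3), (4, 5, 6, 7), (8, 9)]

def select_pattern_staged(error_of_type, threshold):
    """Return (accepted type, its error, list of types evaluated)."""
    evaluated = []
    for index, stage in enumerate(STAGES):
        errors = {t: error_of_type(t) for t in stage}
        evaluated.extend(stage)
        best = min(errors, key=errors.get)
        is_last = index == len(STAGES) - 1
        # The last stage has no threshold test: the better type is taken.
        if is_last or errors[best] < threshold:
            return best, errors[best], evaluated

# Hypothetical prediction-error estimation values for one block.
sample_errors = {1: 900, 2: 700, 3: 650, 4: 300, 5: 420,
                 6: 380, 7: 500, 8: 200, 9: 250}

result = select_pattern_staged(sample_errors.get, threshold=400)
```

With these sample values the search stops at the second stage and never evaluates types 8 and 9, even though type 8 would have given a smaller error; the threshold trades prediction quality for fewer affine transformations.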
The prior art (A) conducts affine transformation for all types in order to determine an adapted area-dividing pattern. It must therefore perform affine transformation many times, which requires a large amount of calculation.
The prior art (B) performs affine transformation fewer times than the prior art (A) does. However, it always begins by conducting affine transformation of the two-division (bisectional) areas, which is wasted effort if the translation type is selected. Moreover, as far as conducting affine transformation and determining a prediction-error estimation value are concerned, types 2 and 4 are the same, so affine transformation of the latter is unnecessary in practice.
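The difference in calculation load between the two flows can be made concrete by counting affine transformations per block, using the per-type counts stated earlier (two, five and eight). This is only a back-of-the-envelope sketch; the names are assumptions, and the count ignores the cost of motion-vector search and error estimation.

```python
# Affine transformations required per pattern type, as listed in the text.
AFFINE_COUNT = {1: 0, 2: 2, 3: 2, 4: 5, 5: 5, 6: 5, 7: 5, 8: 8, 9: 8}
STAGES = [(1, 2, 3), (4, 5, 6, 7), (8, 9)]

def cost_exhaustive():
    """Flow (A): every type is transformed for every block."""
    return sum(AFFINE_COUNT.values())

def cost_staged(accepted_stage):
    """Flow (B): cost when the search stops at the given 0-based stage."""
    tried = [t for stage in STAGES[:accepted_stage + 1] for t in stage]
    return sum(AFFINE_COUNT[t] for t in tried)

cost_a = cost_exhaustive()
cost_b_by_stage = [cost_staged(i) for i in range(len(STAGES))]
```

Under these counts, flow (B) still spends four affine transformations on a block for which the translation type is finally chosen, and in the worst case it performs as many transformations as flow (A).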