H.264 and Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) have been standardized as methods for encoding moving image data and have attracted attention. Because H.264 and MPEG-4 Part 10 AVC are technically identical, they are referred to as H.264/AVC in the following.
H.264/AVC provides five encoding modes for a bi-directional predictive picture (B-picture): intra prediction, forward prediction, backward prediction, bidirectional prediction, and a direct mode. The direct mode is a newly added mode that exploits the continuity of moving image data to determine the motion vector of a current macroblock from the motion vectors of temporally or spatially adjacent macroblocks.
The principle of the temporal direct mode, which is one of the direct modes, will now be described with reference to FIG. 9. FIG. 9 is a schematic of a direct vector (frame structure). In the following, the temporal direct mode is simply referred to as the direct mode.
In the direct mode, a motion vector of a macroblock included in a picture processed immediately before and placed at the same position as the current macroblock is selected as a reference vector, and a motion vector of the current macroblock (hereinafter, referred to as direct vector) is determined by temporally scaling the selected reference vector. In a normal encoding order of moving image data, a certain B-picture is processed subsequent to a reference picture in the forward direction (in the past direction temporally) and a reference picture in the backward direction (in the future direction temporally). Accordingly, a picture processed immediately before the certain B picture is a reference picture in the future direction temporally.
Generally, a reference picture in the past direction is called List0, and a reference picture in the future direction is called List1. In the direct mode, as depicted in FIG. 9, the motion vector of the macroblock that lies on the reference picture in the future direction (colPic: the picture with Ref_idx=0 in List1) at the same position as the current macroblock is selected as a reference vector (mvCol); refPicCol is the picture referenced by that macroblock. By temporally scaling the selected reference vector (at the ratio of the picture order count (POC) spacings), a direct vector in the forward direction (mvL0) and a direct vector in the backward direction (mvL1) of the macroblock on the current B picture (CurrPic) are determined.
More specifically, the direct mode is based on the following principle. Assume that the macroblock placed at the same position on the reference picture in the future direction (colPic) has been predicted from a region on the reference picture in the past direction (refPicCol) indicated by a reference vector (mvCol). In this case, it is reasonable to assume that an object included in the reference picture in the future direction (colPic) has moved in space time from the reference picture in the past direction (refPicCol) along the reference vector (mvCol). If so, the object also passes through the current picture (CurrPic), which is interposed between the reference picture in the future direction (colPic) and the reference picture in the past direction (refPicCol), along the reference vector (mvCol). Based on this assumption, in the direct mode, the macroblock on the current picture (CurrPic) is predicted from the reference picture in the future direction (colPic) and the reference picture in the past direction (refPicCol) by using vectors parallel to the reference vector (mvCol) in space time. The reference vector (mvCol) points from the reference picture in the future direction (colPic) to the reference picture in the past direction (refPicCol); the formulae below calculate the vectors parallel to it in space time between CurrPic and refPicCol and between CurrPic and colPic.
Formulae for calculating a normal direct vector are as follows, where mvL0 is a direct vector from the current picture (CurrPic) to the reference picture in the past direction (refPicCol), and mvL1 is a direct vector from the current picture (CurrPic) to the reference picture in the future direction (colPic):
mvL0 = mvCol × tb/td  (1)
mvL1 = mvL0 − mvCol  (2)
where td is the time distance from the reference picture in the future direction (colPic) to the reference picture in the past direction (refPicCol), and tb is the time distance from the current picture (CurrPic) to the reference picture in the past direction (refPicCol). The direct vectors (mvL0 and mvL1) determined here are calculated on the assumption that the picture has a frame structure.
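As an illustration only, formulae (1) and (2) can be sketched in Python as follows. The function name temporal_direct_vectors and its parameters are hypothetical and are not taken from the standard; the time distances td and tb are assumed to be obtained as POC differences, and exact rational arithmetic is used for clarity, whereas an actual encoder or decoder works with integer approximations of the ratio tb/td.

```python
from fractions import Fraction


def temporal_direct_vectors(mv_col, poc_cur, poc_ref, poc_col):
    """Return (mvL0, mvL1) for a frame-structure picture.

    mv_col  : (x, y) reference vector pointing from colPic to refPicCol
    poc_cur : POC of the current picture (CurrPic)
    poc_ref : POC of the past reference picture (refPicCol)
    poc_col : POC of the future reference picture (colPic)
    All names are illustrative assumptions, not identifiers from the standard.
    """
    td = poc_col - poc_ref  # time distance colPic -> refPicCol
    tb = poc_cur - poc_ref  # time distance CurrPic -> refPicCol
    if td == 0:
        raise ValueError("colPic and refPicCol must have different POC values")

    scale = Fraction(tb, td)
    # Formula (1): mvL0 = mvCol * tb / td
    mv_l0 = tuple(component * scale for component in mv_col)
    # Formula (2): mvL1 = mvL0 - mvCol
    mv_l1 = tuple(l0 - col for l0, col in zip(mv_l0, mv_col))
    return mv_l0, mv_l1


# CurrPic lies midway between refPicCol (POC 0) and colPic (POC 4).
mv_l0, mv_l1 = temporal_direct_vectors((8, -4), poc_cur=2, poc_ref=0, poc_col=4)
```

With CurrPic midway between the two reference pictures, mvL0 is half of mvCol and mvL1 is the same half with the opposite sign, so both direct vectors lie on the line defined by mvCol in space time.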
The vector in the forward direction and the vector in the backward direction are used here as examples to explain the direct vector conveniently. However, in H.264/AVC, the vectors mvL0 and mvL1 are not fixed to either the forward direction or the backward direction. Accordingly, a similar calculation can be carried out by using a combination of forward direction/forward direction vectors or backward direction/backward direction vectors. In the following description of the direct vector, the vectors mvL0 and mvL1 used for calculation are referred to as a first vector (direct vector) and a second vector (direct vector).
For example, Japanese Laid-open Patent Publication No. 2004-048632 and Published Japanese Translation of PCT Application No. 2005-510984 disclose methods of switching coefficients depending on the time distance when a pixel with opposite parity is referred to while a picture having a field structure is encoded in the direct mode.
In the conventional technologies, however, the encoding efficiency decreases when a picture having a field structure is encoded in the direct mode. In other words, if a pixel with opposite parity is referred to while a picture having a field structure is encoded in the direct mode, an error corresponding to the difference in parity occurs in the vector, and the encoding efficiency decreases accordingly.
FIG. 10 is a schematic of a direct vector (field structure). As depicted in FIG. 10, in a picture having a field structure, the pixels in the Bottom_field are shifted downward by 0.5 pixel in the field, compared with the pixels in the Top_field. Accordingly, when a vector is obtained by referring to a pixel with opposite parity while a picture having a field structure is encoded in the direct mode, the direct vectors mvL0 and mvL1 do not become parallel to the reference vector (mvCol) in space time. Such a direct vector is not the most likely vector, which decreases the encoding efficiency.
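The parity error described above can be illustrated with a simplified numerical model. The following Python sketch is an assumption made for explanation, not the method of any cited publication: it treats the vertical component only, measures it in field pixels with the downward direction positive, places the Bottom_field 0.5 pixel below the Top_field as in FIG. 10, and checks only the first vector (mvL0). The names PARITY_OFFSET, true_vertical_displacement, and direct_vector_parity_error are hypothetical.

```python
from fractions import Fraction

# Assumed vertical position of each field's pixel grid, in field pixels
# (downward positive): the Bottom_field sits 0.5 pixel below the Top_field.
PARITY_OFFSET = {"top": Fraction(0), "bottom": Fraction(1, 2)}


def true_vertical_displacement(mv_y, from_parity, to_parity):
    """Spatial displacement represented by a nominal vertical vector mv_y
    pointing from a field of from_parity to a field of to_parity."""
    return mv_y + PARITY_OFFSET[to_parity] - PARITY_OFFSET[from_parity]


def direct_vector_parity_error(mv_col_y, tb, td,
                               cur_parity, col_parity, ref_parity):
    """Vertical deviation of mvL0 from the vector parallel to mvCol.

    mvL0 is obtained by scaling the nominal mvCol by tb/td, ignoring parity.
    The ideal vector scales the true spatial displacement of mvCol instead.
    """
    scale = Fraction(tb, td)
    mv_l0_y = mv_col_y * scale  # formula (1), applied without parity handling
    actual = true_vertical_displacement(mv_l0_y, cur_parity, ref_parity)
    ideal = true_vertical_displacement(mv_col_y, col_parity, ref_parity) * scale
    return actual - ideal


# All fields share the same parity: the scaled vector stays parallel to mvCol.
same = direct_vector_parity_error(4, 1, 2, "top", "top", "top")
# colPic is a Bottom_field while CurrPic and refPicCol are Top_fields.
mixed = direct_vector_parity_error(4, 1, 2, "top", "bottom", "top")
```

Under these assumptions, the same-parity case yields no deviation, while the opposite-parity case leaves a residual of a quarter of a field pixel, which is the kind of error that prevents mvL0 from being parallel to mvCol in space time.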