Multiview video coding is essential for applications such as three-dimensional television (3DTV), free viewpoint television (FTV), and multi-camera surveillance. Multiview video coding is also known as dynamic light field compression. As used herein, coding can include encoding, decoding, or both, for example in a codec.
Depth images are assumed to be part of the data format in an emerging 3D video coding standard. Using the depth images as side information to perform predictive coding is known as view synthesis prediction (VSP).
In conventional video coding, e.g., coding according to the H.264 AVC (Advanced Video Coding) and H.265 HEVC (High Efficiency Video Coding) standards, motion information from neighboring blocks is used to derive a motion vector. The derived motion vector is then used as a motion vector predictor (MVP) to predict the motion vector for the current block. Then, the motion vector difference (MVD) between the current motion vector and the MVP is encoded and transmitted.
FIG. 1 shows a conventional method to code a current block. Step 110 derives a motion or disparity vector from a neighboring block, referred to as MotionDerive. Step 120 determines a motion or disparity vector, referred to as MotionCurrent, for the current block by applying motion estimation techniques that aim to minimize residual differences. Step 130 calculates and codes the motion vector difference: (MotionDiff=MotionCurrent−MotionDerive). Finally, step 140 codes the residual block.
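Steps 110 through 130 can be illustrated with a minimal Python sketch. The derivation rule used here, a component-wise median over three neighboring vectors, is an assumption chosen for illustration (it resembles the median predictor of H.264/AVC); the text above does not fix a particular rule. The function names and the sample vectors are hypothetical.

```python
from statistics import median_low

def derive_motion(neighbor_mvs):
    """Step 110 (assumed rule): component-wise median of the neighbor vectors."""
    xs = [mv[0] for mv in neighbor_mvs]
    ys = [mv[1] for mv in neighbor_mvs]
    return (median_low(xs), median_low(ys))

def motion_diff(motion_current, motion_derive):
    """Step 130: MotionDiff = MotionCurrent - MotionDerive, per component."""
    return (motion_current[0] - motion_derive[0],
            motion_current[1] - motion_derive[1])

# Hypothetical vectors (x, y) of three neighboring blocks.
neighbors = [(4, -2), (6, 0), (5, -1)]
motion_derive = derive_motion(neighbors)

# Step 120 is taken as given here: assume motion estimation returned this vector.
motion_current = (7, -1)

mvd = motion_diff(motion_current, motion_derive)
print(motion_derive, mvd)  # (5, -1) (2, 0)
```

When the neighbors move similarly to the current block, the difference is small and costs fewer bits than the full vector, which is the reason for coding MotionDiff rather than MotionCurrent.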
FIG. 2 shows the corresponding prior-art encoder. Element 201 shows blocks in a portion of a picture. In element 201, a current block is denoted by a star “*”, and a neighboring block is denoted by a dot “•”. A motion vector or disparity vector is derived 202 from the neighboring blocks shown in element 201. The derived motion or disparity vector serves as the motion vector predictor (MVP) 203.
By referencing the texture reference picture buffer 204, motion estimation is performed 205 for the current block to produce a motion vector (MotionCurrent) 206 for the current block.
After calculating 207 the difference between MVP and MotionCurrent, a motion vector difference (MVD) 208 is obtained, which is encoded 209 into the bitstream 210.
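The text does not specify how the MVD 208 is encoded 209. As one possible illustration, the sketch below uses signed Exp-Golomb codes, the se(v) mapping that H.264/AVC applies to motion vector differences when context-adaptive arithmetic coding is not used. Treating the bitstream as a Python string of "0" and "1" characters is a simplification, and the function names are hypothetical.

```python
def se_encode(k):
    """Signed Exp-Golomb: map k to a code number, then write it as ue(v)."""
    code_num = 2 * k - 1 if k > 0 else -2 * k
    bits = bin(code_num + 1)[2:]            # binary of code_num + 1
    return "0" * (len(bits) - 1) + bits     # leading-zero prefix, then the value

def se_decode(bits, pos=0):
    """Read one signed Exp-Golomb value; return (value, next position)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    code_num = int(bits[pos + zeros:end], 2) - 1
    k = (code_num + 1) // 2 if code_num % 2 else -(code_num // 2)
    return k, end

# Encode a hypothetical MVD of (2, 0): x component first, then y.
mvd = (2, 0)
payload = se_encode(mvd[0]) + se_encode(mvd[1])

# Decode it again to check the round trip.
dx, pos = se_decode(payload)
dy, pos = se_decode(payload, pos)
print(payload, (dx, dy))  # 001001 (2, 0)
```

Small magnitudes receive the shortest codewords, so an accurate MVP directly reduces the number of bits spent on motion information.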
Another output from motion estimation 205 is the reference picture, which serves as texture predictor 211. Then, the texture residual 213 is obtained by performing 212 texture prediction based on the texture predictor 211 and the current picture 215. The texture residual 213 is also encoded 214 as part of the bitstream.
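The motion estimation 205 and texture prediction 212 stages can be sketched as follows. This is a simplified illustration under stated assumptions: an exhaustive integer-pel search, the sum of absolute differences (SAD) as the matching cost, pictures stored as lists of rows, and a vector (dx, dy) that points from the current block position to the matching block in the reference picture. Practical encoders use faster searches and sub-pel refinement. All names and sample values are hypothetical.

```python
def get_block(picture, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in picture[top:top + size]]

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def motion_estimate(current, reference, top, left, size, search_range):
    """Full search: return (motion vector, texture predictor, texture residual)."""
    cur = get_block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue  # candidate lies outside the reference picture
            cand = get_block(reference, t, l, size)
            cost = sad(cur, cand)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy), cand)
    _, mv, predictor = best
    residual = [[c - p for c, p in zip(rc, rp)]
                for rc, rp in zip(cur, predictor)]
    return mv, predictor, residual

# Hypothetical 6x6 reference picture with a distinct value at every pixel.
reference = [[10 * y + x for x in range(6)] for y in range(6)]

# Current picture: the 2x2 block at (top=2, left=2) resembles the reference
# block at (top=1, left=3), except that one pixel differs by 1.
current = [[0] * 6 for _ in range(6)]
current[2][2:4] = [13, 14]
current[3][2:4] = [23, 25]

mv, predictor, residual = motion_estimate(current, reference, 2, 2, 2, 2)
print(mv, residual)  # (1, -1) [[0, 0], [0, 1]]
```

Only the residual and the motion information need to be transmitted; the predictor itself is recoverable at the decoder from the reference picture.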
FIG. 3 shows the decoder. A motion vector or disparity vector is derived 302 from the neighboring blocks shown in element 301. The derived motion or disparity vector serves as the motion vector predictor (MVP) 303.
From the coded bitstream 310, the motion vector difference (MVD) 308 is decoded 309 and fed to an adder 307. The adder 307 sums the motion vector predictor 303 and the motion vector difference 308 to obtain the motion vector for the current block, MotionCurrent 306.
From the coded bitstream 310, the texture residual picture 313 is decoded 314. The current motion vector 306 and the texture residual picture 313 are inputs to the motion compensation module 305. Motion compensation is performed using the texture reference buffer 304, and the decoded picture is output 315.
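The decoder path of FIG. 3 can be sketched in the same simplified terms. The sketch assumes integer-pel vectors (dx, dy) pointing from the current block position into the reference picture, pictures stored as lists of rows, and an MVD and residual that have already been entropy decoded. The function names and sample values are hypothetical.

```python
def reconstruct_mv(mvp, mvd):
    """Adder 307: MotionCurrent = MVP + MVD, per component."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

def motion_compensate(reference, top, left, size, mv, residual):
    """Fetch the predictor block the vector points to and add the residual."""
    dx, dy = mv
    pred = [row[left + dx:left + dx + size]
            for row in reference[top + dy:top + dy + size]]
    return [[p + r for p, r in zip(rp, rr)]
            for rp, rr in zip(pred, residual)]

# Hypothetical 6x6 reference picture held in the texture reference buffer.
reference = [[10 * y + x for x in range(6)] for y in range(6)]

mvp = (0, -1)                 # derived from neighboring blocks (assumed value)
mvd = (1, 0)                  # decoded from the bitstream (assumed value)
residual = [[0, 0], [0, 1]]   # decoded texture residual (assumed value)

motion_current = reconstruct_mv(mvp, mvd)
decoded_block = motion_compensate(reference, 2, 2, 2, motion_current, residual)
print(motion_current, decoded_block)  # (1, -1) [[13, 14], [23, 25]]
```

Because the decoder derives the same MVP from already decoded neighbors, adding the transmitted MVD reproduces exactly the vector the encoder used.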