Picture prediction refers to the process, in video encoding/decoding, in which information such as motion vectors and reference frame indexes is used to copy a part of an already encoded/decoded picture directly, or to derive it by methods such as sub-pixel interpolation, so that it serves as the predicted picture of the picture block currently being encoded/decoded. At an encoding end, a picture residual is obtained by subtracting the predicted picture from the original picture, and the picture residual is encoded and written into a code stream; at a decoding end, a reconstructed picture is obtained by adding the predicted picture to the picture residual decoded from the code stream. Encoding/decoding by means of picture prediction in this way can effectively improve encoding/decoding efficiency.
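The residual relationship described above can be illustrated with a minimal sketch. It assumes, for illustration only, that the residual is carried losslessly (no transform or quantization), and the function names `encode_residual` and `reconstruct` are hypothetical rather than taken from any codec.

```python
def encode_residual(original, predicted):
    """Encoding end: residual = original - predicted, sample by sample."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, predicted)]


def reconstruct(predicted, residual):
    """Decoding end: reconstructed = predicted + residual, sample by sample."""
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(predicted, residual)]


# Illustrative 2x2 block of sample values (made-up numbers).
original = [[52, 55], [61, 59]]
predicted = [[50, 56], [60, 60]]

residual = encode_residual(original, predicted)
reconstructed = reconstruct(predicted, residual)
```

The better the predicted picture matches the original, the smaller the residual values become, which is where the efficiency gain comes from.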
At present, a common method for generating a predicted picture is forward projection, in which depth information of each pixel or each block of pixels in a reference view is used to project a picture of the reference view to a target view so as to obtain a picture of the target view. A common block-based forward projection is generally implemented in the following way:
In order to generate one target block of pixels Bt in the target view, the size and location of a reference block of pixels Br in the reference view, from which Bt is generated, are determined by the size and location of Bt and by the depth value corresponding to each pixel in Bt. That is, in this method the reference block of pixels has to be determined by means of depth information of the target view.
In the above-mentioned method, a set of new boundary values is obtained by subtracting Dmax from the left boundary of Bt, subtracting Dmin from the right boundary of Bt, and keeping the upper and lower boundaries unchanged, where Dmax and Dmin are respectively the maximum value and the minimum value of the disparity values derived by converting the depth values corresponding to the pixels in Bt. The new boundary values are set as the boundary values of Br. Br is then projected to the target view, and the pixels in Bt that receive no projected pixel from Br are filled, so as to obtain the final projected Bt. The projected Bt is taken as the predicted picture. Depth information of the reference view is employed in the process of projection to generate the desired predicted picture.
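The boundary computation described above can be sketched as follows. The sketch assumes that the disparity values for the pixels in Bt have already been derived from the depth values (the conversion itself is not specified here), and that boundaries are expressed as integer pixel coordinates; the function name `reference_block_bounds` is hypothetical.

```python
def reference_block_bounds(left, right, top, bottom, disparities):
    """Return (left, right, top, bottom) of Br given the bounds of Bt.

    disparities: disparity values converted from the depth values of the
    pixels in Bt (assumed to be precomputed by the caller).
    """
    d_max = max(disparities)
    d_min = min(disparities)
    # Left boundary shifts by Dmax, right boundary by Dmin;
    # upper and lower boundaries are kept unchanged.
    return (left - d_max, right - d_min, top, bottom)


# Illustrative values: Bt spans columns 16..23 and rows 8..15.
br_bounds = reference_block_bounds(16, 23, 8, 15, disparities=[2, 3, 5, 3])
```

With these illustrative numbers Br spans columns 11..21, so it is wider than Bt whenever the disparities within Bt are not all equal.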
The filling mentioned above is needed because holes appear after a virtual view projection is performed on places where the depth changes, such as boundary regions of objects in the picture. After the picture of the reference view is projected to the target view, a single point or a run of successive points in the target view that receives no projected pixel from the reference view is called a hole. The appearance of holes is related to the sampling rate of depth and texture, as well as to the occlusion relationship among objects in a three-dimensional scene. Parts of the scene located behind objects cannot be captured by a camera, and thus cannot be reconstructed by projection. When holes appear, hole filling techniques need to be used to fill the holes in the projected picture so as to make it complete. A general hole filling method is to fill the entire hole with the pixel value of whichever of the two projected pixels on the two sides of the hole belongs to the background region (determined from the relative foreground/background relationship of the two pixels). If only one side of the hole has a projected pixel (which generally occurs at the picture boundaries), this projected pixel is used to fill the entire hole. The above-mentioned method is only one of the hole filling techniques, and many other methods can also be used.
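The general hole filling rule described above can be sketched for a single row of a projected picture. The sketch makes two assumptions that are not stated in the text: holes are marked with `None`, and the background pixel is identified as the one with the smaller disparity (that is, the one farther from the camera). The function name `fill_holes_row` is hypothetical.

```python
def fill_holes_row(pixels, disparities):
    """Fill each run of holes (None) in one row of a projected picture.

    pixels:      projected sample values, None where no pixel was projected.
    disparities: disparity of each projected pixel, None at holes.
    Assumption: the smaller disparity marks the background pixel.
    """
    filled = list(pixels)
    n = len(pixels)
    x = 0
    while x < n:
        if pixels[x] is not None:
            x += 1
            continue
        start = x
        while x < n and pixels[x] is None:
            x += 1
        end = x  # index one past the last sample of this hole
        left = start - 1 if start > 0 else None
        right = end if end < n else None
        if left is None and right is None:
            continue  # the whole row is a hole; nothing to copy from
        if left is None:
            src = right  # hole touches the left picture boundary
        elif right is None:
            src = left   # hole touches the right picture boundary
        else:
            # Both sides available: pick the background-side pixel.
            src = left if disparities[left] <= disparities[right] else right
        for i in range(start, end):
            filled[i] = pixels[src]
    return filled


row = [None, 10, 12, None, None, 40, 42, None]
disp = [None, 2, 2, None, None, 6, 6, None]
filled_row = fill_holes_row(row, disp)
```

In this example the interior hole is filled from its left neighbour, because that neighbour has the smaller disparity and is therefore treated as background, while the two holes at the picture boundaries are filled from their only available neighbour.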
It can be seen from the above description that, in the prior art, when forward projection is performed, depth information of the target view is used to determine the reference block of pixels, and depth information of the reference view is used for projection. Therefore, the depth information of the target view and the depth information of the reference view both need to be used in the predicted picture generation process, which results in a relatively large dependence among data.
No effective solution to the above-mentioned problem has been proposed so far.