A three-dimensional (3D) video sequence comprises multiple (usually two) video sequences, which carry texture information, and the corresponding depth sequences, which carry depth information; this format is usually called MVD (multi-view video plus depth). Each video sequence, which is usually captured by one camera, is called a camera viewpoint video sequence, and the corresponding viewpoint is called a camera viewpoint. The three-dimensional video sequence also comprises information such as the camera parameters of each viewpoint. A virtual viewpoint video sequence can be generated by view synthesis, and the corresponding viewpoint is called a virtual viewpoint. Conventional binocular stereoscopic video consists of two fixed viewpoint video sequences (left view and right view), which are also called a stereoscopic pair or stereo pair. A stereoscopic pair captured by a stereo camera may have the problem that the disparity between the two view images is too large. Viewing such stereoscopic video causes serious visual fatigue; in other words, such video is not suitable for stereoscopic viewing. With the introduction of the virtual viewpoint video sequence, a stereo pair more suitable for stereoscopic viewing can be formed from one view of the stereoscopic video and a synthesized virtual view. For different displays (e.g., displays with different resolutions or different widths), the same N-pixel disparity corresponds to a different perceived depth.
A virtual viewpoint video sequence can be generated by view synthesis technology. View synthesis technology uses a depth-image-based rendering (DIBR) method to project the pixels of a camera viewpoint image to a virtual viewpoint and generate a projection image, based on the depth values of the pixels and the corresponding camera parameters (e.g., the focal length and the coordinate position of each viewpoint). Steps including hole filling, filtering and resampling are then applied to generate one or more virtual viewpoint video sequences for final display. In view synthesis, a virtual viewpoint image can also be generated from the images of multiple camera viewpoints, which is called view merging. In that case, hole filling, filtering, resampling and other steps are applied to the merged image to generate a virtual viewpoint video sequence for display.
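The projection step described above can be illustrated with a minimal sketch for one image row. It rests on assumptions that the text does not state: the camera viewpoint and the virtual viewpoint are rectified and parallel, the depth value of each pixel is a metric distance Z, and the horizontal displacement is therefore focal_length * baseline / Z pixels. The function name `warp_row`, its parameters, and the sign convention (virtual viewpoint to the right of the camera viewpoint) are hypothetical and chosen only for illustration.

```python
def warp_row(texture_row, depth_row, focal_length, baseline):
    """Forward-project one row of a camera viewpoint image to a virtual viewpoint.

    Assumptions (not from the text): rectified parallel cameras, metric
    depth Z per pixel, focal_length in pixels, and a virtual viewpoint
    located `baseline` units to the right of the camera viewpoint.
    Returns the projected row; None marks a hole left for hole filling.
    """
    width = len(texture_row)
    warped = [None] * width
    nearest = [float("inf")] * width  # depth of the pixel currently kept

    for x, (value, z) in enumerate(zip(texture_row, depth_row)):
        displacement = focal_length * baseline / z
        target = int(round(x - displacement))
        # Keep the pixel closest to the camera when several land on one column.
        if 0 <= target < width and z < nearest[target]:
            warped[target] = value
            nearest[target] = z
    return warped


row = warp_row([10, 20, 30, 40], [2.0, 2.0, 1.0, 1.0],
               focal_length=2.0, baseline=1.0)
print(row)
```

In this example the two rightmost output columns receive no pixel, which is the kind of hole that the subsequent hole filling step has to handle, and the nearer pixel wins where two source pixels project to the same column.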
A stereo pair comprises two video sequences: the left viewpoint video sequence (for the left eye) and the right viewpoint video sequence (for the right eye). To improve the stereoscopic experience when viewing a stereo pair, it is common to adjust the parallax range presented on the display by shifting the images of the stereo pair horizontally. When the left viewpoint image (i.e., an image of the left viewpoint video sequence) is shifted to the right with respect to the right viewpoint image (i.e., an image of the right viewpoint video sequence), the negative parallax increases and the positive parallax decreases. When the left viewpoint image is shifted to the left with respect to the right viewpoint image, the negative parallax decreases and the positive parallax increases. For instance, to shift the left viewpoint image by N pixels to the right with respect to the right viewpoint image, the common method is either to copy each pixel from column i to column i+N/2 in the left viewpoint image and from column i to column i−N/2 in the right viewpoint image, or to keep the right viewpoint image unchanged and copy each pixel from column i to column i+N in the left viewpoint image. Shifting the left viewpoint image by N pixels to the left with respect to the right viewpoint image is done in a similar way.
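The two shifting variants described above can be sketched as follows. This is only an illustration under stated assumptions: images are represented as lists of pixel rows, columns that receive no source pixel are filled with a placeholder value (the text does not say how vacated columns are handled), and for odd N the split variant assigns the extra pixel to the right viewpoint image so that the relative shift remains exactly N. The function names `shift_columns` and `shift_stereo_pair` are hypothetical.

```python
def shift_columns(image, offset, fill=0):
    """Copy every pixel from column i to column i + offset.

    `image` is a list of rows. Columns with no source pixel are set to
    `fill` (an assumption; the text does not specify this).
    """
    shifted = []
    for row in image:
        width = len(row)
        new_row = [fill] * width
        for i, pixel in enumerate(row):
            j = i + offset
            if 0 <= j < width:
                new_row[j] = pixel
        shifted.append(new_row)
    return shifted


def shift_stereo_pair(left, right, n, split=True):
    """Shift the left image n pixels to the right relative to the right image.

    A negative n shifts the left image to the left instead.
    split=True moves both images by about n/2 in opposite directions;
    split=False keeps the right image unchanged and moves the left by n.
    """
    if split:
        left_offset = n // 2
        right_offset = -(n - left_offset)  # remainder goes to the right image
        return shift_columns(left, left_offset), shift_columns(right, right_offset)
    return shift_columns(left, n), shift_columns(right, 0)


left = [[1, 2, 3, 4]]
right = [[5, 6, 7, 8]]
print(shift_stereo_pair(left, right, 2))               # both images move by 1
print(shift_stereo_pair(left, right, 2, split=False))  # only the left image moves
```

Both variants produce the same relative displacement between the two images; the split variant loses fewer columns at the border of each individual image.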
The physical resolution of a display screen, i.e., the resolution of the pixels on the display panel, is an intrinsic parameter of the display that indicates the maximum numbers of pixels that can be displayed in the horizontal and vertical directions. A display screen does not have to be operated at the maximum resolution determined by the physical resolution. For example, a display with a physical resolution of 1920*1080 can be operated at another resolution such as 1600*900 or 1024*768. The working resolution of a display is therefore the screen resolution at which the display is currently operated, which may differ from its physical resolution. In applications such as TV and movies, input images are shown in full-screen mode, and in this case the effective horizontal width (or actual width) is equal to the physical width of the display. If the input image resolution, i.e., the numbers of pixels in the horizontal and vertical directions of an image input to the display, is lower than the current working resolution of the display, the display usually up-samples the image and scales it to full-screen size for display (often called extended display). Under such circumstances, the content shown on the display corresponds to all pixels of the input image. Thus, the resolution at which the display effectively operates can be considered to be the resolution of the input image, rather than the physical resolution or the working resolution of the display. If the resolution of the input image is higher than the working resolution of the display, the display usually down-samples the input image and scales it to full-screen size for display. Under such circumstances, the content shown on the display also corresponds to all pixels of the input image, so the resolution at which the display effectively operates can again be considered to be the resolution of the input image.
However, in certain applications, such as “picture in picture” and display in a window mode, input images are not displayed in full screen; instead, the input image is displayed in a region of the screen, according to the actual resolution of the display in that operating state. For example, the region may be a rectangular region around the center of the screen or close to its bottom-right corner. Therefore, the actual width of a display can be uniformly described as the physical width of the region in which the image is displayed, while the actual resolution of a display can be regarded as the resolution of the input image. In addition, in some display applications, a region of width X on the screen shows only a part of the input image, denoted as Y, where the number of pixels in the horizontal direction of Y is denoted as M. Under such circumstances, the actual width of the display can be regarded as the width X of the image display region, and the actual horizontal resolution of the display is the number of pixels in the horizontal direction of Y, which is M. An approximate value can also be used for the above-mentioned width instead of a highly precise value. For instance, it is acceptable for the approximate value to deviate from the actual width by no more than 10%.
In summary, to unify and simplify the description, in the present invention, resolution denotes the numbers of pixels in the horizontal and vertical directions; the actual resolution of a display refers to the resolution of the region that is used to display the image; and the actual width or horizontal size of a display refers to the physical width of the region that is actually used to display the image. Specifically, when the resolution of the input image is equal to the physical resolution of the display, the working resolution of the display is equal to the physical resolution of the display, and the image is displayed in full screen, the actual resolution of the display is equal to the physical resolution of the screen, and the actual width of the display is equal to the physical width of the display.
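The definitions of actual width and actual horizontal resolution make it possible to express why the same N-pixel disparity is perceived differently on different displays. The following sketch assumes uniformly spaced pixels, so that an N-pixel disparity spans N times the actual width divided by the actual horizontal resolution on the screen. The function name `physical_disparity` and the example widths and resolutions are hypothetical values chosen for illustration, not figures from the text.

```python
def physical_disparity(n_pixels, actual_width, actual_horizontal_resolution):
    """Physical on-screen size of an n-pixel disparity.

    actual_width: physical width of the region showing the image
                  (any length unit; the result uses the same unit).
    actual_horizontal_resolution: number of image pixels shown across
                  that region.
    Assumes uniformly spaced pixels across the display region.
    """
    return n_pixels * actual_width / actual_horizontal_resolution


# Hypothetical cases, widths in millimetres:
full_screen_large = physical_disparity(10, 960.0, 1920)  # wide display
full_screen_small = physical_disparity(10, 480.0, 1920)  # narrower display
windowed_part = physical_disparity(10, 480.0, 960)       # region X showing M = 960 pixels

print(full_screen_large, full_screen_small, windowed_part)
```

The same 10-pixel disparity covers twice the physical distance on the wider display as on the narrower one at equal resolution, while a window that shows only part of the image over the narrower width yields the same physical disparity as the wide full-screen case.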