Field of Invention
The present invention relates to the technical field of video display processing, and more particularly to a multi-viewpoint parallel synchronous scaling engine for multi-viewpoint 3D (3-dimensional) display technology that is easy to implement in hardware, and a method thereof.
Description of Related Arts
Stereoscopic display technology can project the appropriate perspectives of a 3D image in many directions simultaneously, so that viewers can have a 3D stereoscopic experience at the same time without special glasses or eye tracking, which gives the technology bright market prospects. Conventionally, commercial 3D display technology is mostly based on the principle of human binocular stereoscopic vision, in which the left and right eyes receive view field images of different viewpoints. Because the view field images of different viewpoints differ slightly, the brain integrates them into a sense of depth for the viewer. Compared with conventional glasses-based 3D display, multi-view 3D display delivers the 3D effect without 3D glasses and thus has a greater market advantage. Conventionally, multi-view 3D displays are mainly categorized into parallax barrier displays, cylindrical lens displays, volume displays and holographic displays. Optical components are mounted above the display panel to discretize the light field into multiple narrowly spaced views, creating the illusion of continuous parallax. In the present invention, unless otherwise specified, the multi-view 3D display refers to the parallax barrier display and the cylindrical lens display.
Referring to FIG. 1, a 1080P four-viewpoint 3D display with an integer pixel arrangement format is taken as an example to briefly illustrate the processing procedure. The four view field images of a 1080P four-viewpoint 3D source each have a resolution of 960×540 and are arranged in a four-grid form. The corresponding display processing method comprises the following steps of:
1) segmenting the source image into the four sub view fields, so as to obtain four sub images (a, b, c and d), each with a resolution of 960×540;
2) interpolating each sub image to scale it to the physical resolution (1920×1080) of the display terminal, so as to obtain a zoomed image (A, B, C and D) of each sub view field;
3) according to a weighting relationship between a correlation coefficient of a parallax barrier or a cylindrical lens and the viewpoint, calculating and combining the sub-pixels at the corresponding positions of A, B, C and D, so as to obtain the display pixels of a multi-view 3D image at the corresponding positions; and
4) displaying the combined multi-view 3D image on the display terminal.
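The four steps above can be sketched in miniature as follows. This is only an illustrative sketch under stated assumptions, not the method of FIG. 1: the quadrant order of a, b, c and d is assumed, nearest-neighbour scaling stands in for whatever interpolation algorithm the display actually uses, and `view_for_subpixel` is a hypothetical mapping in which each display sub-pixel is taken from exactly one view (a real parallax barrier or cylindrical lens supplies its own weighting).

```python
def split_quad(frame):
    """Step 1: cut a frame holding a 2x2 grid of views into sub images a, b, c, d.

    Assumed order: top-left, top-right, bottom-left, bottom-right.
    """
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [
        [row[x0:x0 + w] for row in frame[y0:y0 + h]]
        for y0 in (0, h)
        for x0 in (0, w)
    ]


def upscale_nearest(img, out_h, out_w):
    """Step 2: zoom a sub image to the panel resolution.

    Nearest-neighbour is a stand-in for the real interpolation algorithm.
    """
    in_h, in_w = len(img), len(img[0])
    return [
        [img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def view_for_subpixel(x, y, c, n_views):
    """Hypothetical sub-pixel-to-view mapping (one view per sub-pixel)."""
    return (3 * x + c + y) % n_views


def combine(views):
    """Step 3: build each display pixel from sub-pixels of the zoomed views."""
    out_h, out_w = len(views[0]), len(views[0][0])
    n = len(views)
    return [
        [
            tuple(views[view_for_subpixel(x, y, c, n)][y][x][c] for c in range(3))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def solid(v):
    """An (R, G, B) pixel whose three sub-pixels all carry the view index v."""
    return (v, v, v)


# Tiny stand-in for the 1080P source: a 4x4 frame holding four 2x2 views.
frame = [[solid(0)] * 2 + [solid(1)] * 2 for _ in range(2)] + \
        [[solid(2)] * 2 + [solid(3)] * 2 for _ in range(2)]

subs = split_quad(frame)                            # step 1
zoomed = [upscale_nearest(s, 4, 4) for s in subs]   # step 2
display = combine(zoomed)                           # step 3 (step 4 would send it to the panel)
```

Because every pixel of view v is filled with the value v, each sub-pixel of `display` shows directly which view it was taken from.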
Referring to FIG. 2, through the optical path selection effect between the parallax barrier or the cylindrical lens and the multi-view 3D combined image, different view field images are observed from different angles and distances. Because the distance between a viewer's eyes is about 5.5 cm, the left and right eyes receive different views when the viewer is in a proper position, so that a 3D scene is perceived after the brain combines the images. It should be noted that the four-viewpoint integrated arrangement in FIG. 2 is only one example of the pixel arrangement of a multi-view 3D combined image.
Referring to FIG. 3, a multi-view 3D display processing system corresponding to the above method is illustrated, which comprises: an input video decoding module, an N-viewpoint sequence generation module, a video image frame storage and control module, and a multi-view stereoscopic image generation module, wherein the multi-view stereoscopic image generation module comprises a scaling engine; the interpolation pixel window of each viewpoint image is input into the scaling engine, which outputs the display pixels of the combined stereoscopic image.
Referring to FIG. 4, the operation method of the scaling engine in a conventional multi-view (N-view) 3D display system is illustrated, comprising the following steps of:
1) obtaining the image data of each sub view field from DRAM (including SDRAM, DDR2 SDRAM, and DDR3 SDRAM) and, according to the relevant scaling algorithm, obtaining the interpolation pixel window data needed for the interpolation calculation of the current interpolation pixel point of each sub view field;
2) according to the coefficients of the corresponding interpolation algorithm, processing the interpolation pixel window of each view field with interpolation calculation in N interpolation modules, so as to obtain N zoomed pixels; and
3) according to the pixel arrangement requirement of the display terminal, combining the N interpolation results of the N sub view fields through a multi-view 3D video image combination and calculation module, so as to obtain the combined display pixel at the current position;
repeating the above steps until all pixels of a frame are combined, and displaying the combined multi-view 3D image on a multi-view stereoscopic display terminal.
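The conventional per-pixel flow of the steps above can be sketched as follows. This is a minimal sketch under stated assumptions: bilinear interpolation over a 2×2 window is assumed as the scaling algorithm (the text does not name one), the views are plain in-memory lists rather than DRAM, and `view_for_subpixel` is a hypothetical arrangement rule. Note that all N views are fully interpolated for every output pixel even though only three sub-pixels survive the combination.

```python
def pixel_window(img, out_x, out_y, out_w, out_h):
    """Step 1: fetch the 2x2 interpolation window and fractional offsets
    needed for one output pixel (bilinear scaling is an assumption)."""
    in_h, in_w = len(img), len(img[0])
    sx, sy = out_x * in_w / out_w, out_y * in_h / out_h
    x0, y0 = int(sx), int(sy)
    x1, y1 = min(x0 + 1, in_w - 1), min(y0 + 1, in_h - 1)
    window = ((img[y0][x0], img[y0][x1]), (img[y1][x0], img[y1][x1]))
    return window, sx - x0, sy - y0


def interpolate(window, fx, fy):
    """Step 2: one interpolation module, producing one zoomed (R, G, B) pixel."""
    (p00, p01), (p10, p11) = window
    pixel = []
    for c in range(3):
        top = p00[c] * (1 - fx) + p01[c] * fx
        bottom = p10[c] * (1 - fx) + p11[c] * fx
        pixel.append(top * (1 - fy) + bottom * fy)
    return tuple(pixel)


def view_for_subpixel(x, y, c, n_views):
    """Hypothetical sub-pixel arrangement: which view feeds channel c at (x, y)."""
    return (3 * x + c + y) % n_views


def conventional_display_pixel(views, out_x, out_y, out_w, out_h):
    """Steps 1 to 3 for a single display pixel of an N-view source."""
    n = len(views)
    # N interpolation modules each compute a complete pixel ...
    zoomed = []
    for img in views:
        window, fx, fy = pixel_window(img, out_x, out_y, out_w, out_h)
        zoomed.append(interpolate(window, fx, fy))
    # ... but step 3 keeps only one sub-pixel per channel.
    return tuple(
        zoomed[view_for_subpixel(out_x, out_y, c, n)][c] for c in range(3)
    )


# Four 2x2 views; view v is filled with (10v, 10v+1, 10v+2) so the origin
# of every output sub-pixel can be read from its value.
views = [
    [[(10 * v, 10 * v + 1, 10 * v + 2)] * 2 for _ in range(2)]
    for v in range(4)
]
pixel = conventional_display_pixel(views, out_x=1, out_y=0, out_w=4, out_h=4)
```

Calling `conventional_display_pixel` for every (x, y) of the panel reproduces the "repeat until all pixels of a frame are combined" loop.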
Referring to FIG. 3 and FIG. 4, an N-viewpoint 3D display is taken as an example to describe the above steps in detail:
First, a video signal (analog or digital) is input into the input video decoding module to generate a digital video signal (RGB/YUV/RGBY signal) and the corresponding synchronizing signals.
To display the 3D effect, a plurality of sub view field sequences must be obtained by image segmentation or 2D-to-3D conversion with the N-viewpoint sequence generation module. The video image data are then stored into DRAM (including SDRAM, DDR2 SDRAM, and DDR3 SDRAM) through the video image frame storage and control module. After the video image data are stored into the DRAM, each sub view field image is interpolated to the physical resolution of the display terminal (such as 1080P, 4K, or 8K), so that N interpolated images with the same resolution as the display terminal are obtained. In the above process, the multi-view stereoscopic image generation module comprises the scaling engine, whose operation method comprises steps of: obtaining the image sequence of each sub view field from the DRAM and storing it into an on-chip memory; then, according to the corresponding interpolation algorithm, obtaining the interpolation pixel window data needed for the calculation of each interpolation module; performing parallel point-to-point interpolation calculation in the interpolation module of each view according to the interpolation algorithm and the corresponding interpolation pixel window; and, for the N sub view field images obtained by the interpolation modules, according to the 3D sub-pixel arrangement requirements of the parallax barrier or cylindrical lens of the display terminal, processing the R/G/B (or Y/U/V, R/G/B/Y) sub-pixel points at the corresponding positions of the interpolated results of each view with the multi-view 3D video combination and calculation module, thereby obtaining the display pixels of the multi-view 3D image at the corresponding positions.
Finally, according to an interface of the display terminal and a corresponding encoding method, the above multi-view 3D combined image data are sent to the display terminal, for multi-view 3D display. The above steps are repeated until all pixels of a frame are processed.
Shortcomings of the conventional methods are as follows.
With the continuous upgrade of display terminal resolution, the number of viewpoints of multi-view 3D video sources is increasing, which improves the viewing experience and accommodates more viewers at the same time. In the conventional methods, an individual scaling module is instantiated for each sub view field: the image of each sub view field is interpolated separately, so N view fields need N independent interpolation modules. However, during multi-view 3D sub-pixel combination, only a part of the interpolated result of each sub view field is needed. Therefore, the plurality of interpolation modules calculate a large amount of redundant data that is never used, which wastes considerable hardware calculation and memory resources. In addition, as the number of viewpoints increases further, the huge consumption of hardware calculation resources will eventually make the approach impractical.
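The scale of this redundancy can be estimated with a short calculation. The sketch below assumes, for illustration only, that each display sub-pixel is taken from exactly one view and that every pixel has three sub-pixels; with a weighted arrangement in which several views contribute to one sub-pixel, the used share would be larger. The function name and parameters are hypothetical.

```python
def conventional_cost(n_views, width, height, channels=3):
    """Count sub-pixels interpolated versus sub-pixels actually displayed.

    Assumes one contributing view per display sub-pixel (illustrative only).
    Returns (computed, used, wasted_fraction).
    """
    computed = n_views * width * height * channels  # N full-resolution images
    used = width * height * channels                # one panel's worth survives
    return computed, used, 1 - used / computed


# Four views on a 1920x1080 panel: three quarters of the work is discarded.
computed_4, used_4, wasted_4 = conventional_cost(4, 1920, 1080)

# The wasted share grows as (N - 1) / N with the number of viewpoints.
wasted_by_views = {n: conventional_cost(n, 1920, 1080)[2] for n in (2, 4, 8)}
```

Under these assumptions the discarded share is (N - 1)/N, so it approaches the whole of the interpolation workload as the number of viewpoints grows.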