1. Field of the Invention
The invention relates to the production of stereo images.
The invention has applications in the fields of natural image generation by film or digital photography, analogue or digital video, movie film generation, and synthetic image generation using methods including computer graphics or image based rendering systems, and is particularly relevant to two view stereoscopic devices including electronic and hard copy devices where more than one image of a scene is generated to create a 3D effect by showing a different image to a viewer's left and right eyes.
2. Description of the Related Art
Applications of the invention include photography, videography, movie production, electronic shopping kiosks, computer games systems for home or public use, multimedia packages (e.g. encyclopaedias), medical imaging, CAD/CAM systems, scientific visualisation, remote manipulation, remote sensing, security systems and any other application where a benefit is found from a stereoscopic 3D image of a scene.
Many types of stereoscopic and auto-stereoscopic electronic displays and printing or photographic reproduction methods have been developed, for example see the following European and British patent applications: EP 0 602 934, EP 0 656 555, EP 0 708 351, EP 0 726 483, GB 9619097.0 and GB 9702259.4. The problem of image generation for these systems is less well understood and many existing stereoscopic images can be uncomfortable to view even on a high quality stereoscopic imaging device. (Where the term stereoscopic is used it should also be taken to imply multi-view systems where more than one image is generated and presented to the user even if only two of the images are viewed at any one time by the left and right eyes.)
As described in B. E. Coutant and G. Westheimer, “Population distribution of stereoscopic ability”, Ophthal. Physiol. Opt., 1993, Vol 13, January, up to 96% of the population can perceive a stereoscopic effect and up to 87% should easily be able to experience the effect on desktop 3D display systems. The following summarises some problems inherent in previous approaches to stereoscopic image generation.
Stereoscopic systems represent the third dimension, depth in front of and behind the image plane, by using image disparity, as illustrated in FIG. 1. The image disparity displayed on a screen has a physical magnitude which will be termed screen disparity. Crossed disparity, dN, results in a perceived depth, N, in front of the display plane while uncrossed disparity, dF, results in a perceived depth, F, behind the display plane as illustrated in FIG. 1.
The screen disparities dn or df between homologous points in the left and right images are seen by the viewer as perceived depths N or F in front of or behind the display plane. To see this effect the viewer must maintain focus on the display plane while verging their eyes off the display plane. This is thought to stress the visual system if the perceived depth value is too great, and therefore limits are required for the values of N and F if comfortable images are to be produced.
These types of stereoscopic display system do not exactly match the user's perception in the real world, in that they require the user to accommodate (focus) on the display surface while verging their eyes away from the display surface, see FIG. 1. Since the accommodation and vergence mechanisms are linked in the brain (see D. B. Diner and D. H. Fender, “Human engineering in stereoscopic viewing devices”, 1993, Plenum Press, New York, ISBN 0-306-44667-7, and M. Mon-Williams, J. P. Wann, S. Rushton, “Design factors in virtual reality displays”, Journal of SID, Mar. 4, 1995) this requires some effort from the viewer, and a greater effort the more depth is being perceived. The invention recognises that the key variable to control is perceived depth: the larger this value is, the more stress is placed on the viewer's visual system.
It is now widely recommended (see L. Hodges, D. McAllister, “Computing Stereoscopic Views”, pp71-88, in Stereo Computer Graphics and Other True 3D Technologies, D. McAllister, Princeton University Press, 1993; A. R. Rao, A. Jaimes, “Digital stereoscopic imaging”, SPIE Vol 3639, pp144-154, 1999; and R. Akka, “Converting existing applications to support high quality stereoscopy”, SPIE Vol 3639, pp290-299, 1999) that images are captured using two cameras positioned so that the only difference between the two images is the image disparity due to a horizontal translation of the cameras. This arrangement is normally referred to as a parallel camera system. It avoids viewer discomfort due to keystone distortion (which arises when the cameras are not parallel, because the vertical dimensions of the two images vary from one side of each image to the other) and the associated vertical disparity, i.e. when the two images are superimposed there is varying vertical disparity across the images. In addition, for physical cameras the optics and light sensitive media must be matched to avoid unnatural intensity or geometric distortions. The latter two issues are part of a specific camera design and are not considered further here.
As illustrated in FIG. 2, the parallel camera image must be processed to ensure the depth range captured in the image disparity fits both in front of and behind the display plane. This requires the use of offset sensors or film behind the lens, a skewed camera frustum (in the case of computer graphics) or image cropping. Without such adjustments all depth in the images will be perceived in front of the display plane.
FIG. 2 shows that the images from parallel cameras need to be adjusted, either by cropping the edge of the image or by the use of an asymmetric camera frustum. The latter is possible in many computer graphics systems, or can be achieved by offsetting the image sensitive material (e.g. CCD or film) to one side in a physical camera.
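For a synthetic camera, the asymmetric frustum mentioned above can be sketched as follows. This is a minimal illustration rather than the method of the invention: the function name, its parameters and the example dimensions are assumptions introduced here, and the returned bounds follow the left/right/bottom/top convention at the near clipping plane that many graphics systems use.

```python
def parallel_camera_frustum(camera_x, zdp_width, zdp_height, zdp_distance, near):
    """Return (left, right, bottom, top) frustum bounds at the near plane.

    camera_x     : horizontal camera offset from the centre line
                   (hypothetical convention: negative for the left camera)
    zdp_width    : width of the window in the zero disparity plane
    zdp_height   : height of that window
    zdp_distance : distance from the camera to the zero disparity plane
    near         : distance from the camera to the near clipping plane
    All lengths share one unit.
    """
    if zdp_distance <= 0 or near <= 0:
        raise ValueError("distances must be positive")
    # Project the window in the zero disparity plane onto the near plane.
    scale = near / zdp_distance
    # The window is fixed in the scene, so its edges shift relative to an
    # offset camera; this is what makes the frustum asymmetric.
    left = (-zdp_width / 2.0 - camera_x) * scale
    right = (zdp_width / 2.0 - camera_x) * scale
    bottom = -zdp_height / 2.0 * scale
    top = zdp_height / 2.0 * scale
    return left, right, bottom, top


# Illustrative values only (assumed, in mm): 60 mm camera separation,
# 400 x 300 window at 800 mm, near plane at 8 mm.
separation = 60.0
left_cam = parallel_camera_frustum(-separation / 2.0, 400.0, 300.0, 800.0, 8.0)
right_cam = parallel_camera_frustum(+separation / 2.0, 400.0, 300.0, 800.0, 8.0)
```

Both cameras keep parallel axes; only the horizontal bounds differ, so the two frusta cover the same window in the zero disparity plane and no cropping is needed afterwards.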
The factors which directly affect perceived depth are:
For depth behind the display surface, uncrossed disparity:
F=Z/((E/df)−1)
For depth in front of the display surface, crossed disparity:
N=Z/((E/dn)+1)
From these equations it can be seen that perceived depth depends on the screen disparity, the viewer's eye separation, E, and the display viewing distance, Z. While other methods have approximated or ignored these variables, the new method allows them to be fully accounted for.
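The two relationships above can be evaluated directly. The following is a minimal sketch; the function names and the example values (E = 65 mm, Z = 800 mm, 10 mm of screen disparity) are assumptions chosen for illustration, and all lengths must share one unit.

```python
def perceived_depth_behind(d_f, eye_separation, viewing_distance):
    """Perceived depth F behind the display for uncrossed disparity d_f.

    Implements F = Z / ((E / d_f) - 1).
    """
    if d_f < 0:
        raise ValueError("disparity must be non-negative")
    if d_f == 0:
        return 0.0  # zero disparity lies in the display plane
    if d_f >= eye_separation:
        # The eyes would have to look parallel or diverge: no finite depth.
        raise ValueError("uncrossed disparity must be smaller than E")
    return viewing_distance / ((eye_separation / d_f) - 1.0)


def perceived_depth_in_front(d_n, eye_separation, viewing_distance):
    """Perceived depth N in front of the display for crossed disparity d_n.

    Implements N = Z / ((E / d_n) + 1).
    """
    if d_n < 0:
        raise ValueError("disparity must be non-negative")
    if d_n == 0:
        return 0.0  # zero disparity lies in the display plane
    return viewing_distance / ((eye_separation / d_n) + 1.0)


# Assumed example values, all in mm.
E, Z = 65.0, 800.0
depth_front = perceived_depth_in_front(10.0, E, Z)
depth_behind = perceived_depth_behind(10.0, E, Z)
```

With these assumed values the same 10 mm of screen disparity gives a larger perceived depth behind the display than in front of it, which is why the near and far limits are treated separately.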
The screen disparity (dn or df) is important as it is determined by image disparity, which in turn is determined by the image capture environment, including the camera parameters and the depth in the scene. The invention seeks to control image disparity, and therefore screen disparity and perceived depth, by controlling the camera parameters. While various previous methods to control the camera parameters have been proposed, none considers the issue of directly controlling perceived depth, and they often approximate parameters such as the comfortable near and far perceived depth limits for the target display.
In S. Kitrosser, “Photography in the service of Stereoscopy”, Journal of imaging science and technology, 42(4), 295-300, 1998, a slide rule type calculator is described allowing selection of camera separation given the camera details and scene near and far distances. It does not take into account different viewers' eye spacings and the maximum image disparity is set at a predetermined value. It cannot account for the perceived depth the user sees when using a particular display and, as has been discussed earlier, this is the key variable in assessing the comfort of a stereoscopic image.
In L. Lipton, “Foundations of the Stereoscopic Cinema, A Study in Depth”, Van Nostrand Reinhold Company, 1982, Lipton examines the mathematics involved in positioning cameras and develops a set of tables for different film formats giving maximum and minimum object distances for a given convergence distance and lens focal length. He assumes that converging cameras will be used and that the maximum screen disparity is the same for objects in front of the screen plane as for objects behind the screen plane. Two sets of maximum object distances are calculated; the first is where a small divergence of the viewer's eyes is allowed (typically 1 degree) and the second where no divergence is allowed. These restrictions prevent this method from guaranteeing comfortable image generation at all times. The assumption of converging cameras ensures that some vertical disparity will be present in the image and therefore that many viewers will find the resulting images uncomfortable to view. In the internet site http://www.elsa.com/europe/press/releases/1999/graphics/revelato.htm, Elsa introduce a system called ‘Dyna-Z’ which dynamically adjusts the ‘spatial effect’. No details are currently available about the method of operation of this system, although it is limited to real time computer graphics.
The question of what the near and far perceived depth limits should be is partially addressed by existing human factors work, and it is possible to deduce typical working values for the SLE VPI displays of far limit +60 mm, near limit −50 mm from the following studies.
In A. Woods, T. Docherty, R. Koch, “Image Distortions in Stereoscopic Video Systems”, SPIE Stereoscopic Displays and Applications IV, 1993, 36-48, Woods discusses sources of distortion in stereo camera arrangements as well as the human factors considerations required when creating stereo images. These experiments show that there is a limit in the screen disparity which it is comfortable to show on stereoscopic displays. A limit of 10 mm screen disparity on a 16″ display at a viewing distance of 800 mm was found to be the maximum that all 10 subjects of the experiment could view.
In Y. Yeh, L. D. Silverstein, “Limits of Fusion and Depth Judgement in Stereoscopic Color Displays”, Human Factors, 32(1), 1990, 45-60, Yeh shows the results of experiments carried out in order to determine binocular fusion limits. It is found that for comfortable viewing over short periods a maximum disparity of 27 minutes of arc is acceptable. With longer periods of viewing it is possible to adapt to view greater disparities, but this does not indicate that large disparities are suitable for long term comfortable viewing of stereoscopic displays.
In S. Pastoor, “Human Factors in 3D Imaging”, experiments by Pastoor indicate that disparities up to 35 minutes of arc do not cause any discomfort.
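The studies above quote limits both as a physical screen disparity and as an angle. The sketch below converts between the two so that the figures can be compared. It assumes that the angular disparity is simply the angle subtended at the eye by the screen disparity at viewing distance Z. This simplification, the function names, and the reuse of the 800 mm viewing distance for the angular limits are assumptions made here and are not taken from the cited studies.

```python
import math


def arcmin_to_screen_disparity(arcmin, viewing_distance):
    """Screen disparity subtending `arcmin` minutes of arc at the viewer.

    Assumes the disparity is viewed head-on from `viewing_distance`.
    The result is in the same unit as `viewing_distance`.
    """
    angle_rad = math.radians(arcmin / 60.0)
    return viewing_distance * math.tan(angle_rad)


def screen_disparity_to_arcmin(disparity, viewing_distance):
    """Inverse of arcmin_to_screen_disparity under the same assumption."""
    return math.degrees(math.atan(disparity / viewing_distance)) * 60.0


# Assumed viewing distance of 800 mm for all three figures.
Z = 800.0
limit_27_mm = arcmin_to_screen_disparity(27.0, Z)
limit_35_mm = arcmin_to_screen_disparity(35.0, Z)
limit_10mm_arcmin = screen_disparity_to_arcmin(10.0, Z)
```

Under these assumptions the 27 and 35 arc minute limits both correspond to less than 10 mm of screen disparity at 800 mm, so the angular limits are the more conservative of the quoted figures.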
The invention provides a method of producing a stereo image of a (real or simulated) scene using at least one (real or simulated) camera, which creates the impression of being a 3D image when viewed on a display by a user, wherein in each position at which an image is captured the camera axis is parallel to the camera axis at all other positions, and the depth of the scene is mapped onto a maximum perceived depth of the image on the display, and the maximum perceived depth is chosen to provide comfortable viewing for the user.
The method may make use of parameters of the display, including the maximum perceived depth of an object in front of the display N, and the maximum perceived depth of an object behind the display F.
The method may make use of the user's eye separation E, and the distance Z of the viewer from the display.
In an embodiment of the invention, the distance Z′ from the camera to the Zero Disparity Plane is calculated based on the values of N and F, and also on the values of the distance N′ from the camera to the closest surface in the scene, and the distance F′ from the camera to the furthest surface in the scene.
In a further embodiment of the invention, the distance Z′ from the camera to the Zero Disparity Plane is specified, and the values of N′ and F′ (as defined above) are also specified, and the most suitable of N′ or F′ is kept fixed while a new value is calculated for the other of N′ or F′ based on the values of N and F.
Any one of N′, F′ and Z′ may be calculated based on the values of the other two.
In a further embodiment of the invention, the maximum crossed and uncrossed disparities dN and dF are calculated from N, F, the separation E of the user's eyes, and the distance Z from the user to the display.
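Rearranging the two perceived depth equations given earlier for the screen disparity gives dN = N·E/(Z − N) and dF = F·E/(Z + F). The sketch below applies this rearrangement. The function name and the example values E = 65 mm and Z = 700 mm are illustrative assumptions; N = 50 mm and F = 60 mm follow the typical working limits quoted in the related art discussion.

```python
def max_screen_disparities(n_limit, f_limit, eye_separation, viewing_distance):
    """Return (d_n, d_f), the maximum crossed and uncrossed screen disparities.

    n_limit : maximum perceived depth N in front of the display (positive)
    f_limit : maximum perceived depth F behind the display (positive)
    All lengths share one unit.
    """
    if not 0 <= n_limit < viewing_distance:
        raise ValueError("N must lie between the display and the viewer")
    if f_limit < 0:
        raise ValueError("F must be non-negative")
    # From N = Z / ((E / d_n) + 1)
    d_n = n_limit * eye_separation / (viewing_distance - n_limit)
    # From F = Z / ((E / d_f) - 1)
    d_f = f_limit * eye_separation / (viewing_distance + f_limit)
    return d_n, d_f


# Assumed example values, all in mm.
E, Z = 65.0, 700.0
d_n, d_f = max_screen_disparities(50.0, 60.0, E, Z)

# Substituting back into the perceived depth equations recovers N and F.
n_check = Z / ((E / d_n) + 1.0)
f_check = Z / ((E / d_f) - 1.0)
```

Because the uncrossed disparity always stays below E for any finite F, the far limit can be made large without the viewer's eyes being asked to diverge.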
In a further embodiment of the invention, the camera separation A used to produce the stereo image is calculated based on N and F to achieve the desired perceived depth (N+F).
In a further embodiment of the invention, the field of view of the camera is adjusted to allow for that part of the image which will be cropped as a result of the use of parallel camera positions.
In a further embodiment of the invention, the camera separation is fixed, and the desired focal length of the camera is calculated to achieve the desired perceived depth (N+F).
In a further embodiment of the invention, the focal length f of the camera is taken into account when measuring the values of N′ and F′ (as defined above).
In a further embodiment of the invention, the camera separation A is limited to the user's eye separation E multiplied by the scale factor S as herein defined.
The invention also provides a computing device adapted to carry out a method as described above.
The invention also provides a camera or camera system comprising such a computing device.
In order to produce stereoscopic 3D images which are comfortable to view, the positioning of cameras (real or synthetic) requires great care. Stereoscopic displays are only capable of displaying a limited perceived depth range while there may be almost any depth range in a scene to be captured. The new method allows accurate positioning of parallel cameras for capturing stereoscopic images which exactly fit the depth in the scene into the perceived depth limits of the display. Information about the stereoscopic display for which the images are intended is used, along with information about the viewer of the display, the scene to be captured and the camera type to be used.
A chosen depth range in a scene being imaged is exactly mapped to a pre-defined perceived depth range on a stereoscopic 3D display. This ensures that a comfortable perceived depth range such as defined by human factors considerations is never exceeded when viewing the stereoscopic 3D images.
The near and far limits of the perceived depth range can be set independently, allowing for precise control of the amount of depth effect seen by the user. This allows the images to be adapted to variations in viewers' perception of crossed and uncrossed screen disparities and enables precise artistic control of the 3D image composition.
In comparison to approximate methods or trial and error, the new method of camera control ensures that a scene is always mapped onto the defined perceived depth range. This results in an easy to use method which does not require numerous image adjustments to ensure the depth viewed is within comfortable limits.
The new method can be implemented in software or hardware allowing it to control physical digital or photographic cameras or synthetic computer graphic cameras.
Depth measurements are taken to the known film plane (in the case of physical cameras), avoiding problems in estimating the position of the lens aperture. This is particularly important with zoom lenses, where the aperture moves unseen to the user.
Camera parameters can be calculated instantly, which is particularly useful for real-time computer graphics and control of video cameras, especially when compared with the slide rules or printed tables of results used by previous methods.
Parallel cameras are supported, which means simple mathematical equations can be used, reducing computation requirements and the number of control variables.