One of the main problems associated with stereoscopic television is the disruption of the normal correlation between the accommodation of the viewer's eyes and the vergence of the two eyes. Specifically, in normal visual experience, the human eyes are accommodated (i.e., focused) to the object of observation and, at the same time, the two eyes are converged on the same object. Therefore, the object of observation is projected on corresponding areas of the two retinas with no disparity. All of the objects in front of the object of observation will have crossed disparity and will be sensed as “closer”, whereas all of the objects behind the object of observation will have uncrossed disparity and will be sensed as “farther away”.
However, this correlation between the focusing of the eyes and their convergence is usually disrupted in stereoscopic video applications. In this case, the left and right representations of objects are physically located on the surface of the monitor, rather than at arbitrary locations in space as in ordinary visual experience. So, in order to obtain the best focus, the eyes need to focus optically on the monitor. However, the vergence of the two eyes is dictated by the parallax generated between the left and right images on the monitor and generally does not correspond to the eye accommodation for the best focus in the monitor plane. See FIG. 1, which illustrates a typical stereoscopic scenario in which there is a deviation in the normal correspondence between focal distance and vergence distance. This break in the linkage between eye focus and convergence causes eye strain and fatigue for the viewer. See also FIG. 2, which illustrates how the relationship between focal distance and vergence distance should remain within certain bounds in order for the viewer to remain in their “zone of comfort” (i.e., the so-called “Percival's zone of comfort”).
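The geometry behind this mismatch can be illustrated with a small calculation. The following is a minimal sketch under stated assumptions, not part of the system described here: it assumes a viewer with interpupillary distance `ipd_m` looking at a monitor at distance `viewing_distance_m`, and a signed on-screen parallax `parallax_m` (taken as positive when the right-eye image lies to the right of the left-eye image, i.e., uncrossed). The function names and example values are hypothetical, and no comfort threshold is assumed.

```python
def vergence_demand_diopters(viewing_distance_m, ipd_m, parallax_m):
    """Vergence demand (1 / vergence distance) from similar triangles.

    The two lines of sight meet at distance D * e / (e - p), so the
    demand in diopters is (e - p) / (D * e). It is zero or negative when
    the lines of sight are parallel or diverging (p >= e).
    """
    return (ipd_m - parallax_m) / (viewing_distance_m * ipd_m)


def vergence_distance_m(viewing_distance_m, ipd_m, parallax_m):
    """Distance at which the lines of sight intersect (inf if they never do)."""
    demand = vergence_demand_diopters(viewing_distance_m, ipd_m, parallax_m)
    return 1.0 / demand if demand > 0 else float("inf")


def accommodation_vergence_conflict(viewing_distance_m, ipd_m, parallax_m):
    """Absolute difference between focal demand and vergence demand, in diopters."""
    focal_demand = 1.0 / viewing_distance_m  # eyes focus on the monitor plane
    vergence_demand = vergence_demand_diopters(
        viewing_distance_m, ipd_m, parallax_m
    )
    return abs(focal_demand - vergence_demand)


# Hypothetical example: monitor at 1 m, 65 mm interpupillary distance,
# 10 mm of crossed (negative) parallax.
distance = vergence_distance_m(1.0, 0.065, -0.010)
conflict = accommodation_vergence_conflict(1.0, 0.065, -0.010)
print(f"vergence distance: {distance:.3f} m, conflict: {conflict:.3f} D")
```

With zero parallax the vergence distance equals the viewing distance and the conflict vanishes, which corresponds to the preserved link between accommodation and convergence in normal viewing.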
The problem described above becomes particularly important in medical (e.g., endoscopic) applications where the stereoscopic video system may be used for precision viewing for prolonged periods of time. By way of example but not limitation, it is not uncommon for surgical cases to last more than two hours, and a surgeon typically performs several cases a day. Due to the critical nature of such medical applications, it is important to minimize user fatigue and provide for comfortable visualization while retaining all of the benefits of depth perception.
FIG. 3 shows a first-order optical layout of a typical dual-channel stereo camera 5. Dual-channel stereo camera 5 generally comprises a left image sensor 10L and a right image sensor 10R (e.g., CCD or CMOS sensors), and an optical system 15 comprising a left channel optical system 20L and a right channel optical system 20R (shown schematically in FIG. 3 as two single lenses for each optical channel for clarity of illustration only). As is well known in the art, and looking now at FIG. 4, dual-channel stereo camera 5 is intended to be coupled to an endoscope 21. The signals generated by image sensors 10L, 10R are forwarded to an appropriate electronic system 22 for processing (the electronic system 22 may be included within stereo camera 5), and the processed signals are then forwarded to an appropriate stereo display 23 or recording device configured to display or record the left and right images captured by the left and right image sensors 10L, 10R. Stereo display 23 may be a 3D monitor of the sort well known in the art, or a head-mounted display, or any other display device capable of presenting the left and right images to the appropriate eye of the viewer.
In FIG. 3:
P1 and P2 are the first and second principal planes of the left and right channel optical systems 20L, 20R—in the first-order approximation, the left and right channel optical systems 20L, 20R are considered identical and their corresponding principal planes coincident;
O is the median axis of the dual-channel stereo camera 5;
OL and OR are the optical axes of the left and right channel optical systems 20L, 20R, respectively;
f is the effective focal length of the dual-channel stereo camera 5;
s and s′ are the distances from an object and its image to the corresponding principal planes—by the sign convention generally accepted in the optical field, distances measured to the left from a principal plane are considered negative and distances measured to the right from a principal plane are considered positive—thus, in FIG. 3, distance s is considered to be negative whereas distance s′ is considered to be positive;
F is the back focal point of the dual-channel stereo camera 5;
x′ is the distance from the focal point F to the image plane;
C is the point of convergence (see below);
h is the distance between the median axis O and the optical axis of the right channel optical system 20R—by the sign convention, the heights measured below the optical axis are considered negative while the heights measured above the optical axis are considered positive; and
h′ is the image height for the point of convergence.
Typically, a dual-channel stereo camera is aligned for a certain point of convergence in the object space. The alignment is achieved by offsetting image sensors 10L, 10R in the “horizontal plane” of the eyes, i.e., the “horizontal plane” represented by the line 25 in FIG. 3. It can be seen from FIG. 3 that the centers of sensors 10L, 10R are offset horizontally from the optical axes OL, OR of the left and right optical systems 20L, 20R so that the point of convergence is imaged at the center of each corresponding sensor. Owing to such an arrangement, the point of convergence is displayed with zero parallax on the display device, so for this particular point, the link between the eye accommodation and the eye convergence will be preserved, and for this particular point of convergence, the dual-channel stereo camera will provide the viewer with a “normal” visual experience.
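To first order, the sensor offset needed for a chosen point of convergence follows from the Gaussian imaging equation and the lateral magnification. The sketch below is an illustration under stated assumptions, not the design procedure of any particular camera: it uses the sign convention given above (s negative, s′ positive), treats each channel as an ideal single-lens equivalent that forms an inverted real image, and takes the magnitude of h as the separation between the median axis O and a channel's optical axis. The function names and the numeric values (f = 5 mm, convergence point at 50 mm, |h| = 2 mm) are hypothetical.

```python
def image_distance(f, s):
    """Gaussian imaging equation 1/s' - 1/s = 1/f (s < 0 for a real object)."""
    if f + s == 0:
        raise ValueError("object at the front focal point images at infinity")
    return f * s / (f + s)


def sensor_offset(f, s_conv, h_abs):
    """Magnitude of the lateral shift of a sensor center from its channel axis
    so that the point of convergence C (on the median axis) is imaged at the
    sensor center.

    Seen from a channel's optical axis, C sits at a lateral distance h_abs.
    The lateral magnification s'/s is negative (inverted image), so in this
    simplified model the shift is directed away from the median axis.
    """
    s_prime = image_distance(f, s_conv)
    magnification = s_prime / s_conv
    return abs(magnification) * h_abs


# Hypothetical values, all in millimetres.
f_mm = 5.0
s_conv_mm = -50.0   # point of convergence 50 mm in front of the first principal plane
h_mm = 2.0          # |h|: median axis to channel optical axis

s_prime_mm = image_distance(f_mm, s_conv_mm)   # image distance s'
x_prime_mm = s_prime_mm - f_mm                 # x': focal point F to image plane
offset_mm = sensor_offset(f_mm, s_conv_mm, h_mm)
print(f"s' = {s_prime_mm:.4f} mm, x' = {x_prime_mm:.4f} mm, |h'| = {offset_mm:.4f} mm")
```

The same magnitude of offset applies to both sensors by symmetry, in opposite directions, under the first-order assumption that the two channels are identical.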
Typically the point of convergence is selected so as to be within the usable range of the object distances which are expected to be encountered in a particular application. For instance, point C may be chosen to be at a distance of 5 m from the optical system for a typical camcorder application, or at a distance of 50 mm from the distal tip of an endoscope for a general surgery laparoscopic application. Similarly, the distance between the optical axes OL, OR of the left and right channel optical systems 20L, 20R, the focal lengths of the left and right optical systems 20L, 20R, and the types/sizes of image sensors 10L, 10R are typically selected in accordance with the application for which the stereo camera is to be used.
The drawback of a conventional stereo camera is that when the camera is focused on any point whose distance differs from that of the point of convergence, the point in the center of the display device will have non-zero parallax, thereby breaking the normal link between eye accommodation and convergence. This break in the normal link between eye accommodation and convergence causes eye strain and fatigue for the viewer.
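The size of this residual parallax can be estimated with a first-order model. The sketch below is an illustration resting on several assumptions: each channel is treated as an ideal single-lens equivalent, the sensors keep the lateral offset that was set for the point of convergence, the camera is refocused on an object on the median axis at a different distance, and the result is the disparity on the sensors rather than on any particular display. All names and values are hypothetical.

```python
def lateral_magnification(f, s):
    """Magnification s'/s for an object at signed distance s (s < 0), with
    s' from the Gaussian imaging equation; algebraically equal to f / (f + s).
    Not defined for an object at the front focal point (s == -f)."""
    s_prime = f * s / (f + s)
    return s_prime / s


def residual_sensor_disparity(f, s_conv, s_obj, h_abs):
    """Left-right disparity on the sensors for an on-axis object at s_obj
    when the sensor offsets were set for the convergence distance s_conv.

    Returns zero at the point of convergence, a positive value when the
    object is nearer than the point of convergence, and a negative value
    when it is farther away.
    """
    offset_set = abs(lateral_magnification(f, s_conv)) * h_abs
    offset_needed = abs(lateral_magnification(f, s_obj)) * h_abs
    # Each channel contributes the same mismatch, in opposite directions.
    return 2.0 * (offset_needed - offset_set)


# Hypothetical values in millimetres: f = 5, convergence at 50, |h| = 2.
at_convergence = residual_sensor_disparity(5.0, -50.0, -50.0, 2.0)
nearer = residual_sensor_disparity(5.0, -50.0, -30.0, 2.0)
farther = residual_sensor_disparity(5.0, -50.0, -100.0, 2.0)
print(at_convergence, nearer, farther)
```

How this sensor disparity translates into parallax on the display depends on the sensor-to-display scaling, which is not modeled here.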
In some situations, e.g., where the conventional stereo camera only needs to be used for brief periods of time, and/or where it is not necessary to view an image with significant precision, and/or where the parallax is relatively small, this break in the normal link between eye accommodation and convergence may cause only modest levels of eye strain and fatigue for the viewer, and a conventional stereo camera may be acceptable. However, in medical (e.g., endoscopic) applications where the stereo camera must be used for long periods of time and with great precision, and where the parallax is frequently substantial, the break in the link between eye accommodation and convergence may cause significant levels of eye strain and fatigue for the viewer, and a conventional stereo camera may be unsatisfactory.
Thus there is a need for a new and improved stereoscopic visualization system which can address the foregoing issues of convergence in medical (e.g., endoscopic) and related applications.