The present invention relates to methods and systems for determining depth information relating to a scene so that a three-dimensional image of the scene can be displayed on a display device. More particularly, the present invention relates to methods and systems for real-time structured light depth extraction and an endoscope using real-time structured light depth extraction.
In computer imaging systems, it is often desirable to determine depth information relating to an object or scene so that a three-dimensional image of the object can be displayed on a display device. One method for determining depth information is stereo depth extraction. In stereo depth extraction, two or more cameras are utilized to view an object. Determining the distance of the object from the cameras requires that both cameras focus on the same feature. This method is useful in determining depth of uncomplicated objects where all corners and edges of an object are well pronounced in a scene. However, curved edges, shading, non-planar surfaces, and uneven lighting make stereo depth extraction difficult because these conditions may prevent identification of a common feature that both cameras can resolve.
Another conventional method for extracting depth information from a scene is laser scanned depth extraction. In laser scanned depth extraction, a laser line is projected across a surface and viewed off-axis using a camera. Provided that the locations of the laser and the camera are known, scanning the laser line across the surface of the object allows a computer to build a three-dimensional depth model of the object. One disadvantage associated with laser scanned depth extraction is that the time for scanning a laser across the entire surface of an object makes this method impractical for real-time depth extraction systems.
In order to increase the speed at which depth information can be extracted from a scene, structured light depth extraction methods have been developed. In structured light depth extraction, a projector projects known patterns of structured light, such as lines, circles, bitmaps, or boxes, onto an object. A camera is positioned off-axis from the projector to sample light reflected from the object. A computer connected to the camera and the projector calculates depth information for the object of interest based on the projected light patterns, the reflected light patterns sampled by the camera, the position and orientation of the camera, and the position and orientation of the projector.
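By way of illustration only, the geometric relationship underlying such a calculation can be sketched for a simplified planar case. The sketch below assumes that the projector and camera lie on a common baseline of known length and that the angle of the projected ray and the angle of the ray observed by the camera, each measured from the baseline, have already been recovered from the projected and sampled patterns. The function name `triangulate_depth` and its parameters are hypothetical and are not taken from the description above.

```python
import math


def triangulate_depth(baseline, projector_angle, camera_angle):
    """Perpendicular distance from the baseline to the illuminated point.

    Simplified planar model (an assumption of this sketch): the projector
    sits at one end of the baseline, the camera at the other, and both
    angles are measured in radians from the baseline toward the point.
    """
    if not (0.0 < projector_angle < math.pi and 0.0 < camera_angle < math.pi):
        raise ValueError("angles must lie strictly between 0 and pi")
    if projector_angle + camera_angle >= math.pi:
        raise ValueError("rays do not intersect in front of the baseline")
    tan_p = math.tan(projector_angle)
    tan_c = math.tan(camera_angle)
    # Intersection of the two rays: z = b * tan(p) * tan(c) / (tan(p) + tan(c))
    return baseline * tan_p * tan_c / (tan_p + tan_c)
```

For example, with a baseline of 2 units and both rays at 45 degrees to the baseline, the sketch yields a depth of approximately 1 unit. A practical system would additionally account for lens distortion and full three-dimensional calibration of the camera and the projector.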
In early structured light depth extraction systems, slide projectors were utilized to project structured light patterns onto an object of interest. In order to project a plurality of patterns onto the object, a human operator manually placed slides containing different patterns in the slide projector. The slide projector projected the structured light patterns onto the object. A camera positioned off-axis from the slide projector sampled the reflected light for each structured light pattern. The sampled images were input into a computer that calculated depth information for the object. While these early systems were capable of accurate depth calculations, they were too slow for real-time updating of a displayed image.
More recently, structured light depth extraction has been performed using video projectors capable of changing structured light patterns about twice per second, resulting in updating of a displayed three-dimensional image about once every eight seconds. These structured light depth extraction systems may be capable of determining depth information more rapidly than conventional structured light depth extraction systems or laser scanned depth extraction systems. However, these systems are still too slow for real-time applications.
One application in which it may be desirable to use structured light depth extraction is endoscopy, where it is desirable to display a real-time image of the interior of a patient's body. In endoscopic surgery, an endoscope including or connected to a camera is inserted in a first incision in a patient's body, while a surgeon operates through another incision in the patient's body. The surgeon views the image seen by the camera on a video screen in order to guide surgical instruments in performing the operation. The image displayed on the video screen must be updated in real time, such that movements of the patient and the surgeon are reflected in the image with minimal latency. Currently, video cameras used in endoscopic surgery produce an image that is updated 30 times per second. As stated above, conventional structured light depth extraction systems are capable of updating a displayed image only about once every eight seconds. Thus, conventional structured light depth extraction systems are too slow for endoscopic surgical applications.
Another problem associated with applying structured light depth extraction systems to endoscopic surgery is that objects inside a patient's body are often wet and thus produce bright specular reflections. These reflections may saturate the phototransistors of a camera sampling the reflections. Saturating the phototransistors of the camera may lead to inaccurate reproduction of the scene. As a result, conventional structured light depth extraction is unsuitable for endoscopic surgical applications.
Conventional endoscopes include or are connected to one or more cameras that allow the surgeon to view the interior of the patient's body without utilizing structured light depth extraction. A single-camera endoscope is incapable of communicating depth information to the surgeon, unless the camera is continuously moving. Such continuous motion may make some tasks more difficult, may require a robot arm to guide the camera, and may result in trauma to the patient. In an alternative method, in order to determine depth information using a single-camera endoscope, the surgeon may either probe objects with an instrument or move the endoscope to different locations in the patient's body. Such probing and movement inside the patient's body is undesirable as it may increase trauma to the patient. Stereo endoscopes are capable of showing depth information; however, such devices may not accurately provide depth information with regard to complex rounded objects, such as structures inside a patient's body. Stereo endoscopes are generally used to directly display stereo images to a surgeon. In addition, conventional stereo endoscopes are large in cross-sectional area, thus requiring larger incisions in the patient.
Another problem associated with conventional endoscopic surgical instruments is that the camera may not view an object from the same direction that the surgeon is facing. As a result, movements of a surgical instrument viewed on the display screen may not match movements of the surgeon's hands operating the instrument. Thus, the surgeon is required to have excellent hand-eye coordination and experience in operating a conventional endoscope.
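The size of the gap between these two update rates can be made concrete with a short calculation. The sketch below is illustrative only; the function name `required_speedup` is hypothetical, and the inputs are the approximate figures stated above (30 updates per second for endoscopic video, and about one update every eight seconds for conventional structured light depth extraction).

```python
def required_speedup(target_updates_per_second, seconds_per_update):
    """Factor by which an update rate must increase to reach the target rate."""
    if target_updates_per_second <= 0 or seconds_per_update <= 0:
        raise ValueError("both arguments must be positive")
    current_updates_per_second = 1.0 / seconds_per_update
    return target_updates_per_second / current_updates_per_second


# Approximate figures taken from the text above.
speedup = required_speedup(target_updates_per_second=30, seconds_per_update=8)
```

With these figures, `speedup` evaluates to 240, that is, a conventional system would need to update roughly 240 times faster to match the endoscopic video rate.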
In light of the problems associated with conventional endoscopes and the inability of conventional structured light depth extraction systems to provide depth information in real time, there exists a need for real-time structured light depth extraction systems and endoscopes including real-time structured light depth extraction systems.
An object of the invention is to provide a real-time structured light depth extraction system and an endoscope having a real-time structured light depth extraction system.
Another object of the present invention is to provide an endoscope with a shared optical path for multiple optical signals so that the cross-sectional area of the endoscope can be made smaller.
Another object of the present invention is to provide an augmented reality visualization system for endoscopic surgery having a real-time structured light depth extraction system.
According to a first aspect, the present invention includes a real-time structured light depth extraction system. The system includes a projector, a camera, and an image processor/controller. The projector includes a light source and a display screen. The display screen displays first and second reflective patterns. The second pattern is the inverse of the first pattern. The light from the light source reflects from the reflective patterns to create structured light patterns that are projected onto an object of interest. The camera samples light reflected from the object during projection of both the first and second structured light patterns and outputs digital signals to the image processor/controller. The image processor/controller processes the digital signals and extracts depth information of the object in real time.
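The paragraph above does not specify how the image processor/controller uses the two samples. One possible use of a pattern and its inverse, offered here only as an assumption, is to compare the two samples at each pixel so that the pixel can be classified without a fixed brightness threshold. In the sketch below, the function name `decode_pattern_pair`, the `min_contrast` parameter, and the representation of images as lists of rows of intensity values are all hypothetical.

```python
def decode_pattern_pair(image_first, image_inverse, min_contrast=10):
    """Classify each pixel from samples taken under a pattern and its inverse.

    Returns a grid of the same shape containing:
      1    if the pixel was brighter under the first pattern,
      0    if the pixel was brighter under the inverse pattern,
      None if the two samples differ by less than min_contrast (ambiguous).
    """
    decoded = []
    for row_first, row_inverse in zip(image_first, image_inverse):
        decoded_row = []
        for sample_first, sample_inverse in zip(row_first, row_inverse):
            difference = sample_first - sample_inverse
            if abs(difference) < min_contrast:
                decoded_row.append(None)  # too little contrast to decide
            else:
                decoded_row.append(1 if difference > 0 else 0)
        decoded.append(decoded_row)
    return decoded


# Hypothetical one-row samples: bright/dark, dark/bright, nearly equal.
result = decode_pattern_pair([[200, 50, 120]], [[40, 210, 118]])
```

In this sketch, a pixel that is equally bright in both samples, for example one that is saturated under both patterns, is marked as ambiguous rather than assigned to either pattern.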
As used herein, the phrase “real-time” refers to perceived real time from the point of view of a human observer. For example, in a real-time structured light depth extraction system, depth information relating to an object being viewed is determined at a sufficiently high rate for updates to a displayed image to appear continuous to a human observer. In order to appear continuous to a human observer, the updates may occur at a rate of at least about 10 updates per second. More preferably, the updates may occur at a rate of at least about 15 updates per second. Even more preferably, the updates may occur at a rate of at least about 30 updates per second. An update rate of 30 updates per second corresponds to a standard video frame rate.
As used herein, the phrase “depth information” refers to information relating to the distance between an object being viewed by a camera and the camera image plane. The depth information may be the actual distance value or information intermediate to calculating the actual distance value.
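The preference tiers in this definition can be summarized in a short sketch. The function name `real_time_tier` and the returned labels are hypothetical. Because the rates above are stated as "at least about" 10, 15, and 30 updates per second, the hard cutoffs used below are a simplification.

```python
def real_time_tier(updates_per_second):
    """Map an update rate to the preference tier it satisfies."""
    if updates_per_second >= 30:
        return "even more preferred"  # corresponds to a standard video frame rate
    if updates_per_second >= 15:
        return "more preferred"
    if updates_per_second >= 10:
        return "real-time"
    return "not real-time"
```

Under this simplification, a system that updates once every eight seconds (0.125 updates per second) falls in the "not real-time" tier.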
According to another aspect, the present invention includes an endoscope having a shared optical path for multiple signals. For example, in endoscopes that use real-time structured light depth extraction, optical signals from the projector and optical signals reflected from the object of interest may share an optical path within the endoscope. In stereo endoscopes, optical signals reflected from an object and entering the endoscope through separate objective lenses may share a common optical path within the endoscope. In order for different optical signals to share a common path inside the endoscope, the optical signals are polarized in directions that are angularly offset from each other. The amount of offset is preferably 90 degrees, in order to enhance contrast between the optical signals.
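One way to see why an offset of 90 degrees is preferred is Malus's law, under which an ideal linear polarizer transmits a fraction cos²(θ) of linearly polarized light whose polarization direction is offset from the polarizer axis by θ. The sketch below assumes ideal polarizers and assumes that the shared signals are separated by a polarizer aligned with the desired signal; neither assumption is taken from the description above, and the function name `crosstalk_fraction` is hypothetical.

```python
import math


def crosstalk_fraction(offset_degrees):
    """Fraction of the unwanted signal that leaks through an ideal polarizer.

    offset_degrees is the angular offset between the polarization direction
    of the unwanted signal and the axis of the polarizer that selects the
    wanted signal (Malus's law: transmitted fraction = cos^2(offset)).
    """
    return math.cos(math.radians(offset_degrees)) ** 2
```

Under these assumptions the leakage is complete at an offset of 0 degrees, about one half at 45 degrees, and essentially zero at 90 degrees, which is consistent with the preference for a 90 degree offset. Real polarizers have a finite extinction ratio, so some leakage would remain in practice.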
According to another aspect, the present invention includes an augmented reality visualization system for endoscopic surgery that utilizes real-time structured light depth extraction to determine depth information relating to the inside of a patient's body. A graphics generator generates synthetic images to be merged with real images and displayed to a viewer, such as a surgeon. An image merger merges the real and synthetic images such that the objects in the final images have proper occlusion relationships. A display, such as a head-mounted display, displays the merged images to the viewer.
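By way of illustration only, proper occlusion relationships can be obtained by comparing, at each pixel, the depth of the real scene with the depth of the synthetic object and keeping whichever is nearer to the viewer. The sketch below is a minimal per-pixel depth test under that assumption and is not a description of the image merger itself. The function name `merge_with_occlusion`, the use of `None` to mark pixels with no synthetic content, and the representation of images as lists of rows are all hypothetical.

```python
def merge_with_occlusion(real_pixels, real_depth, synth_pixels, synth_depth):
    """Merge real and synthetic images using a per-pixel depth comparison.

    A synthetic pixel replaces the real pixel only where synthetic content
    exists (the pixel is not None) and lies nearer to the viewer than the
    real surface at that pixel. Smaller depth values mean nearer.
    """
    merged = []
    rows = zip(real_pixels, real_depth, synth_pixels, synth_depth)
    for real_row, real_d_row, synth_row, synth_d_row in rows:
        merged_row = []
        for real_px, real_d, synth_px, synth_d in zip(
            real_row, real_d_row, synth_row, synth_d_row
        ):
            if synth_px is not None and synth_d < real_d:
                merged_row.append(synth_px)  # synthetic object is in front
            else:
                merged_row.append(real_px)  # real surface occludes it, or no synthetic content
        merged.append(merged_row)
    return merged


# Hypothetical one-row example: synthetic in front, behind, and absent.
merged_example = merge_with_occlusion(
    [["real", "real", "real"]],
    [[5.0, 5.0, 5.0]],
    [["synth", "synth", None]],
    [[3.0, 7.0, 0.0]],
)
```

In this example the synthetic pixel is shown only in the first position, where it is nearer than the real surface; in the second position the real surface occludes it, and in the third there is no synthetic content.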
While some of the objects of the invention have been stated hereinabove, other objects will become evident as the description proceeds, when taken in connection with the accompanying drawings as best described hereinbelow.