Minimally-invasive surgical techniques are aimed at reducing the amount of extraneous tissue that is damaged during diagnostic or surgical procedures, thereby reducing patient recovery time, discomfort, and deleterious side effects. As a consequence, the average length of a hospital stay for standard surgery may be shortened significantly using minimally-invasive surgical techniques. Patient recovery times, patient discomfort, surgical side effects, and time away from work may also be reduced with minimally-invasive surgery.
A common form of minimally-invasive surgery is endoscopy, and a common form of endoscopy is laparoscopy, which is minimally-invasive inspection and surgery inside the abdominal cavity. In standard laparoscopic surgery, a patient's abdomen is insufflated with gas, and cannula sleeves are passed through small (approximately ½ inch or less) incisions to provide entry ports for laparoscopic instruments.
Laparoscopic surgical instruments generally include a laparoscope or an endoscope (for viewing the surgical field), and working tools. The working tools are similar to those used in conventional (open) surgery, except that the working end or end effector of each tool is separated from its handle by an extension tube. As used herein, the term “end effector” means the actual working part of the surgical instrument and can include clamps, graspers, scissors, staplers, and needle holders, for example.
To perform surgical procedures, the surgeon passes these working tools or instruments through cannula sleeves to an internal surgical site and manipulates them from outside the abdomen. The surgeon views the procedure by means of a monitor that displays an image of the surgical site taken from the laparoscope. Similar endoscopic techniques are employed in, e.g., arthroscopy, retroperitoneoscopy, pelviscopy, nephroscopy, cystoscopy, cisternoscopy, sinoscopy, hysteroscopy, urethroscopy, and the like.
Minimally-invasive telesurgical robotic systems are being developed to increase a surgeon's dexterity when working within an internal surgical site, as well as to allow a surgeon to operate on a patient from a remote location. In a telesurgery system, the surgeon is often provided with an image of the surgical site at a control console. While viewing a three-dimensional (3-D) image of the surgical site on a suitable viewer or display, the surgeon performs the surgical procedures on the patient by manipulating master input or control devices of the control console. Each of the master input devices controls the motion of a servomechanically operated surgical instrument. During the surgical procedure, the telesurgical system can provide mechanical actuation and control of a variety of surgical instruments or tools having end effectors that perform various functions for the surgeon, e.g., holding or driving a needle, grasping a blood vessel, dissecting tissue, or the like, in response to manipulation of the master input devices.
During the surgical procedure, however, the surgeon may manipulate the tool so that its end effector is moved outside of the endoscope's field of view, or the end effector may become difficult to see due to occlusion by fluids or other intervening objects. In such cases, it would be useful to assist the surgeon in locating and/or identifying the end effector on the workstation's display screen. Accurate information regarding a tool's 3-D pose (location and orientation) can be used to provide this assistance. In general, accurate information about a tool's 3-D pose is important for a number of image-guided surgical and user interface applications.
One approach that has been used to provide accurate tool tracking involves a fusion of kinematics-based pose information with image-derived pose information. Such a fusion of tool tracking information can provide the advantages of both types of data without the associated disadvantages. While kinematics joint data are usually available at a very high update rate, a kinematics-estimated pose may not be very accurate due to error accumulation at each joint, with errors in joints located farther away from the tool having a greater impact on accuracy. In contrast, image-derived tool pose estimation can be highly accurate, but may run at a slower update rate than what is useful for many real-time applications. By correcting the higher-update kinematics-pose estimation using the more accurate image-derived tool pose estimation, a more accurate higher-update tool pose estimation can be obtained.
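The correction idea described above can be illustrated with a minimal sketch. It assumes a position-only pose and a simple additive offset that is refreshed whenever an image-derived estimate arrives. The class name `OffsetFusion` and its `update` method are hypothetical and are not taken from any particular system; a practical system would also correct orientation and would typically filter the correction over time.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class OffsetFusion:
    """Hypothetical position-only fusion of kinematic and image-derived estimates."""

    # Correction learned from the most recent image-derived estimate.
    offset: Vec3 = (0.0, 0.0, 0.0)

    def update(self, kinematic: Vec3, image: Optional[Vec3] = None) -> Vec3:
        """Return the corrected position for one high-rate kinematic sample."""
        if image is not None:
            # An image-derived estimate arrived: refresh the correction so the
            # fused output agrees with it at this instant.
            self.offset = tuple(i - k for i, k in zip(image, kinematic))
        # Apply the latest correction to the kinematic estimate.
        return tuple(k + o for k, o in zip(kinematic, self.offset))


fusion = OffsetFusion()
# Sample with an image-derived estimate available (assumed values).
print(fusion.update((10.0, 0.0, 0.0), image=(10.5, 0.25, 0.0)))
# Later sample with kinematics only; the stored correction is reused.
print(fusion.update((11.0, 0.0, 0.0)))
```

Between image updates, the output keeps the update rate of the kinematic data while carrying forward the most recent image-derived correction.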
Some existing technologies have been used for surgical tool tracking. In one approach, an optical tracker is used to track the position of a marker assembly that is attached to a location on the surgical instrument outside the patient's body. However, the optical tracker requires a dedicated stereo camera and dedicated lighting, which take space in an already crowded operating room. Attaching such optical trackers also reduces the range of motion of the robotic arms due to the potential for collision. There can also be some level of error that results from propagating the 3-D pose to the surgical tool tip. Additional problems include limited visibility range, the added hardware setup in the operating room, and cost. Another approach uses an electromagnetic tracker, which has its own associated disadvantages. For example, most surgical instruments have metal parts that can cause distortion, which can vary in time due to changes in distances between an electromagnetic tracker attached to one tool tip and metal components of an adjacent surgical tool. An electromagnetic tracker also involves extra cost.
Computing the 3-D pose of a rigid body with respect to a camera is a well-studied problem in computer/robot vision. A 3-D pose can be solved by starting with the known features of an object and matching these features with their 2-D correspondences in the image. Features such as points and line segments are commonly used. Determination of the 3-D pose of a rigid body from a single 2-D image is referred to as “pose estimation” in computer vision (see introduction in Christophe Doignon, “Scene Reconstruction, Pose Estimation and Tracking,” 2007). If using point-based correspondences, the problem is known as “perspective-n-point,” where n is the number of correspondences. Three non-collinear points provide up to four solutions. Four or more non-collinear points generally provide a unique solution.
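As an illustration of how 3-D model features are related to their 2-D image correspondences, the following sketch projects model points into an image under a candidate pose and measures how well they agree with observed image points. It assumes an ideal pinhole camera with no lens distortion. The function names `project` and `reprojection_rms`, and all numeric values, are hypothetical. This is only the forward model that a perspective-n-point solver would evaluate, not a solver itself.

```python
import math
from typing import Sequence, Tuple

Point3 = Sequence[float]
Point2 = Tuple[float, float]
Matrix3 = Sequence[Sequence[float]]


def project(point: Point3, rotation: Matrix3, translation: Point3,
            focal: float, center: Point2) -> Point2:
    """Project a model point into the image with an ideal pinhole camera."""
    # Transform from the object frame into the camera frame: Xc = R * X + t.
    xc = [sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
          for r in range(3)]
    # Perspective division, then scaling by the focal length (in pixels)
    # and shifting by the principal point.
    return (focal * xc[0] / xc[2] + center[0],
            focal * xc[1] / xc[2] + center[1])


def reprojection_rms(model: Sequence[Point3], observed: Sequence[Point2],
                     rotation: Matrix3, translation: Point3,
                     focal: float, center: Point2) -> float:
    """Root-mean-square pixel distance between projected and observed points."""
    total = 0.0
    for p, (u_obs, v_obs) in zip(model, observed):
        u, v = project(p, rotation, translation, focal, center)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return math.sqrt(total / len(model))


# Assumed example values: four coplanar model points and an untilted pose.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
MODEL = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
TRUE_T = (0.0, 0.0, 10.0)
observed = [project(p, IDENTITY, TRUE_T, 500.0, (320.0, 240.0)) for p in MODEL]

# The true pose reproduces the observations; a shifted pose does not.
print(reprojection_rms(MODEL, observed, IDENTITY, TRUE_T, 500.0, (320.0, 240.0)))
print(reprojection_rms(MODEL, observed, IDENTITY, (0.5, 0.0, 10.0), 500.0, (320.0, 240.0)))
```

A pose estimator can search for the rotation and translation that make this error small. With only three correspondences, several distinct poses can reach zero error, which is the ambiguity noted above.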
Determination of the 3-D pose of a rigid object using a stereo camera can be accomplished using two approaches. In the first approach, the determination of the 3-D pose is treated as an optimization problem in which the 3-D pose is selected that provides the best fit between the projected 3-D points and the image correspondences in both images. In the other approach, image points in both views are used to determine corresponding 3-D points using stereo triangulation, and the relative pose is then determined by solving for a rigid transformation between the determined 3-D points and corresponding model points. (See A. Lorusso, D. W. Eggert and R. B. Fisher, “A comparison of four algorithms for estimating 3-d rigid transformations,” 1995.)
However, a number of factors can hamper the ability to obtain an image-derived tool pose estimation. For one, an image-derived estimate is only available when the object's features are within the field of view of the imaging device(s) and can be extracted. Some of the factors that may prevent the extraction of features include: occlusion of the features by anatomical structure or other instruments, degraded image quality caused by fast instrument or camera motion (i.e., motion blur), adverse lighting conditions (e.g., saturation when the light is too strong, lack of contrast when the light is too weak, strong specularity due to the relative geometric configurations of the light source, instrument, and imaging device), and complex background clutter.
More reliable image-derived tool pose estimation would, therefore, be beneficial in order to increase the rate at which highly-accurate tool pose estimates are available, which in turn may help to provide more accurate overall tool tracking. Accordingly, methods and systems providing improved image-derived tool pose estimates would be desirable, particularly those with reduced sensitivities to adverse conditions, such as occlusions, motion blur, and adverse lighting conditions.