The subject invention is a scanning system and display, and more particularly a scanning system and display for detecting and quantifying similarities or differences between stored data (or images) and data collected from a scan, or between data collected from different scans.
There are many disparate fields in which human senses are used as measuring tools. These fields are often specialized, technical, or industrial, and yet workers in these areas still acquire and gauge data using one or more of their biological senses. Clearly, given the technical nature and great importance of many of these fields, a need exists for a system that can rapidly, accurately, precisely, and objectively acquire and measure scanned data and automatically compare these data to a standard, should such a standard exist, or to one or more previous scans.
Systems have been developed for use in assessing medical conditions using various imaging systems, such as surface scanning systems, as well as systems using X-rays, gamma rays, radio-frequency waves, electron-positron annihilation, electrons, ions, and magnetic particles (hereinafter referred to collectively as “deep diagnostic scanning devices” producing “deep images” internal to the surface of the object). Systems for use in analyzing skin and joints have used baseline images for comparison with a current scan of the skin or joint. Such comparisons are usually made by the physician visually inspecting the scans. Deep diagnostic imaging, such as imaging of the spine of a patient using radio-frequency wave (MRI) or X-ray (CT) systems, is often useful for identifying disease or an injury, such as to the spine itself, or as a readily locatable landmark for other tissues. Present practice is to take and digitally store large amounts of data on a patient, including internal images, so that each patient's data can be compared both to his or her own earlier data and to “pools” of data from other people. Digital geometry processing techniques are often used to generate a 3D image of the inside of an object from a large series of 2D images taken around a single axis of rotation. Such series of CT images (“tomographic images” or “slices”) are generally taken such that each slice is oriented perpendicularly to the longitudinal axis of the object. Alternatively, the slices can be taken parallel to the longitudinal axis, creating a series of longitudinal slices.
One problem encountered with all such scanning systems is the difficulty of properly analyzing pictures or scans taken at different times and with different types of scanning systems (modalities), a task that is presently beyond the reach of most automated systems. Additional complications are presented by variations in quality between images, incomplete images, and failure to adequately capture portions of the tissue due to congenital defect, disease, injury, or other conditions, such as those caused by surgery. Thus, analysis of images and prescription of additional data collection and treatment currently requires an extensively trained technician. Further, it is often helpful to be able to obtain a one-to-one correspondence between the readily visible and markable skin/surface and underlying structures or pathology detectable by a variety of imaging devices (modalities). This may also facilitate clinical correlation, XRT, image guided biopsy or surgical exploration, multimodality or interstudy image fusion, motion correction/compensation, and three-dimensional (3D) space tracking. However, current methods (e.g., bath oil/vitamin E capsules for MRI) have several limitations. They are typically useful for only a single imaging modality, so that completely different and sometimes incompatible devices are required for each modality, complicating the procedure and adding potential error in subsequent multimodality integration/fusion. They require a separate step to mark the skin/surface where the localizer is placed and, when affixed to the skin by overlying tape as is common, may artificially indent/compress the soft tissue beneath the marker or allow the localizer to move, further adding to potential error. Sterile technique is often difficult to achieve. Furthermore, it may be impossible to discriminate point localizers from each other or to directly attain surface coordinates and measurements with cross-sectional imaging techniques.
With regard to the latter, indirect instrument values are subject to significant error due to potential inter-scan patient motion, nonorthogonal surface contours, and technique-related aberrations, which may go unappreciated because current multipurpose spatial reference phantoms are not designed for simultaneous patient imaging.
One process that has been utilized to acquire and align surface images and compare these images is pose estimation, particularly when the scan requires acquiring data of a 3D object. Pose estimation is a process that determines the position and orientation of known objects in 3D scenes relative to a remote observer. Humans perform pose estimation on a regular basis. Any time a person picks up a pencil or parallel parks an automobile, that person is using pose estimation to determine how to orient the hand properly to pick up the pencil or to initiate a trajectory that provides the best opportunity for parking successfully. In these two cases, the pose is determined using visual sensing (i.e., stereo vision) and tactile feedback, but pose can also be derived from audio, radar, and other measurements that provide relative 3D position. Accordingly, pose estimation plays a significant role in a person's ability to interact with the environment, whether that environment is static or dynamic.
Pose estimation has been used in some computer vision applications for robotic or autonomous systems, where the system attempts to perform operations that are natural to humans. These applications include, but are not limited to, object identification, object tracking, path planning, and obstacle avoidance. Potential applications using pose estimation can be as simple as an industrial robotic system identifying a particular part from a bin of many different parts for picking up and loading into a machine, or as complex as autonomous aircraft flying in formation while navigating a terrain, or a spacecraft performing autonomous rendezvous and docking with a non-cooperative spacecraft by identifying docking features, developing an interception plan, and executing the plan. These applications, however, all require real-time pose estimation. Further, such systems for object pose estimation typically require that various landmarks or features (such as points, lines, corners, edges, and other geometrical shapes) be identified and selected. A pose estimate can then be made and registration performed using such identified references. Accordingly, such a system requires an object to have pre-identified features. Further, such methods often have difficulty with objects having the same features but different dimensions. Care must be taken in selecting such features, as some objects may have the identified features but different non-identified features, which could result in error.
While pose estimation is clearly applicable to certain robotic or autonomous systems, it also has other applications, such as surface alignment. For example, surface alignment takes 3D surface measurements of multiple instances of the same object with different poses relative to the observer and applies a rigid transformation to the measurements so that each instance has the same pose relative to the observer. Surface alignment allows for automatic comparison, such as defect detection in high production factory settings, if one of the instances serves as a “truth” model. It also allows for the generation of complete 3D surface measurements of an object by stitching together multiple 3D surface measurements from varying viewpoints. Varying viewpoints are required to generate a complete 3D surface measurement because some regions of the surface are always occluded by others. With pose estimation, surface alignment can be performed with no knowledge of the relative pose of each surface as long as some overlap exists between the individual 3D surface measurements.
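By way of illustration only, the rigid transformation used in surface alignment can be sketched as a rotation followed by a translation applied to every measured surface point. The sketch below assumes the relative pose is already known; in practice the pose would have to be supplied by a pose estimation step. The function names (`rotation_z`, `apply_rigid`, `invert_rigid`, `max_residual`) and the sample points are hypothetical and are not taken from any particular system.

```python
import math

def rotation_z(theta):
    """3x3 rotation matrix for an angle theta (radians) about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_rigid(points, R, t):
    """Map each 3D point p to R p + t (rotation, then translation)."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]

def invert_rigid(R, t):
    """Inverse of p -> R p + t, which is p -> R^T p - R^T t."""
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    t_inv = [-sum(Rt[i][j] * t[j] for j in range(3)) for i in range(3)]
    return Rt, t_inv

def max_residual(points_a, points_b):
    """Largest point-to-point distance between two corresponding point lists."""
    return max(math.dist(a, b) for a, b in zip(points_a, points_b))

# Hypothetical "truth" model: four surface points of an object.
truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]

# A second instance of the same object, observed in a different pose.
R, t = rotation_z(math.radians(30.0)), [0.5, -1.0, 2.0]
observed = apply_rigid(truth, R, t)

# Alignment: apply the inverse rigid transformation so both share one pose.
R_inv, t_inv = invert_rigid(R, t)
aligned = apply_rigid(observed, R_inv, t_inv)

print("residual before alignment:", max_residual(observed, truth))
print("residual after alignment: ", max_residual(aligned, truth))
```

Once two instances share the same pose, any residual that remains between corresponding points reflects a genuine difference between the surfaces, which is what makes automatic comparison against a “truth” model possible.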
Other systems have been developed to automatically make comparisons of scanned images, but the patient's joint or body part being scanned must be immobilized using a specialized mold or jig in order to ensure proper alignment of the images so that points on the images can be registered and proper comparisons made. Such immobilization is difficult for certain body regions and makes scanning problematic if the scans are performed at different locations. Further, different patients may require different methods of immobilization, making the process more complex, time consuming, and expensive. While it may be possible to automatically make comparisons of scanned images of a patient, such systems require precise positioning of the patient's joint or scanned area, which again is relatively complex, time consuming, and expensive. Further, such systems often require that the scanner be consistently aligned with and/or consistently positioned relative to the surface being scanned. Other systems have been developed that require the physician to make artificial reference marks on the surface of the patient being scanned so that the images can be registered and properly aligned.
Many pose estimation algorithms exist in the literature, but it has now been found that the spin-image pose estimation algorithm provides the most accurate results while being robust to sensor noise and placing no surface restrictions on the object of interest other than that it must be pose distinct (i.e., unlike a sphere). It also places no restrictions on the measurement technology other than that it must generate a 3D surface mesh. Although the spin-image algorithm is accurate, like other robust pose estimation algorithms it is computationally complex, and the time required to compute a pose estimate is relatively long. Unfortunately, the relatively long computational time makes it inadequate for the many engineering applications that require a robust real-time pose estimation algorithm.
The fundamental principle behind the spin-image algorithm is to provide an efficient method for representing and matching individual points of an object's surface. This representation is called a spin-image. It should be understood that by comparing and matching spin-images one is actually comparing and matching surface points. By matching the spin-images of surface points in an observed scene (scanned image) to the spin-images of surface points of the “truth” model (reference image), surface point correspondences can be established. It should be understood that the truth model can be a scan, CAD model, mathematically defined surface, and the like. This matching procedure requires that each scene spin-image be compared to all reference spin-images by determining the linear correlation between the spin-images, called the similarity measure. This is one of the most time-consuming portions of the algorithm and has until now made the spin-image pose estimation algorithm impractical for many applications. For example, a typical spin-image is a 16×16 pixel image. Therefore, the spin-image is represented by an array of 256 numbers or “counts,” one for each of the 256 squares forming a grid over the image. To check for matches of spin-images, the 256 numbers in the spin-image for each point in the scene image must be compared to the 256 numbers in each reference spin-image. If the 3D scene image consists of a million points, and the reference image also contains a million points, then a million of these 256-number comparisons must be made (256 million comparisons) to check whether the spin-image for one point in the scene matches the spin-image for one of the points in the reference. If multiple scene image points are to be compared to the full set of reference spin-images, then the number of comparisons must be multiplied by the number of scene spin-images to be matched.
Moreover, spin-images with a larger number of grid squares (such as 32×32) result in even more computations to compare spin-images. Unfortunately, because of the large number of comparisons that must be made, this method of using spin-image comparisons cannot be used for real-time pose estimation and is therefore not practicable for many applications.
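By way of illustration only, the matching cost described above can be sketched as follows. The sketch assumes that each spin-image is stored as a flat list of 256 counts and uses a plain Pearson linear correlation as a stand-in for the similarity measure; the function names (`linear_correlation`, `best_match`, `pixel_comparisons`) and the synthetic images are hypothetical.

```python
import math

def linear_correlation(a, b):
    """Pearson linear correlation between two equally sized spin-images."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a constant image carries no shape information
    return cov / math.sqrt(var_a * var_b)

def best_match(scene_image, reference_images):
    """Compare one scene spin-image against every reference spin-image."""
    scores = [linear_correlation(scene_image, ref) for ref in reference_images]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

def pixel_comparisons(grid_side, n_reference, n_scene=1):
    """Numbers compared when matching n_scene points against n_reference."""
    return grid_side * grid_side * n_reference * n_scene

# Three synthetic 16x16 reference spin-images (256 counts each).
reference_images = [[float((i * k) % 7) for i in range(256)] for k in (1, 2, 3)]
# A scene spin-image that is a scaled and offset copy of reference image 1.
scene_image = [2.0 * v + 1.0 for v in reference_images[1]]

print(best_match(scene_image, reference_images))
print(pixel_comparisons(16, 1_000_000))  # one scene point, 16x16 grid
print(pixel_comparisons(32, 1_000_000))  # one scene point, 32x32 grid
```

Doubling the grid side from 16 to 32 quadruples the per-image work, and matching every scene point multiplies the total again by the number of scene points, which is why the exhaustive comparison does not scale to real-time use.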
Systems have also been developed, such as that disclosed in U.S. patent application Ser. No. 13/373,456, filed on Nov. 15, 2011, and incorporated in its entirety by reference, that function by describing a 3D surface with a collection of 2D images (spin-images). The system operates such that the spin-images are represented by a substantially reduced set of numbers (256 pixels are generally represented by fewer than 10 numbers), thus allowing for substantially quicker pose estimations. When two spin-images are compared, a similarity measure (score) is generated that indicates their similarity, such that the higher the score, the more similar the images are. Unlike traditional methods that treat all matches equally, the subject process examines all matches based on the similarity measure and groups them using the score. The process uses the match with the highest score, creates a group, and then estimates a pose. If the pose error is less than the error threshold, the process ends and the pose is estimated. If the pose error is greater than the error threshold, the process uses the match with the next highest similarity measure and repeats until it obtains a pose error that is less than the threshold. This approach significantly increases the speed of the process such that real-time comparisons can be made.
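By way of illustration only, the score-ordered search described above can be sketched as a loop that tries matches from highest to lowest similarity and stops at the first pose whose error falls below the threshold. The sketch is not the implementation of the referenced application: how a group is formed around a match and how the pose error is computed are not specified here, so both are hidden behind hypothetical callables (`estimate_pose`, `pose_error`), and a one-dimensional translation stands in for a full 3D pose.

```python
def search_pose(matches, estimate_pose, pose_error, error_threshold):
    """Try matches in order of decreasing similarity score.

    Each match is a tuple (scene_index, reference_index, score).
    Returns the first pose whose error is below the threshold, else None.
    """
    for match in sorted(matches, key=lambda m: m[2], reverse=True):
        pose = estimate_pose(match)  # hypothetical: group and estimate a pose
        if pose_error(pose) < error_threshold:
            return pose
    return None

# Toy one-dimensional data: the reference is the scene shifted by 5.0.
scene = [0.0, 1.0, 3.0]
reference = [5.0, 6.0, 8.0]

def estimate_pose(match):
    """Toy pose: the shift implied by a single point correspondence."""
    scene_index, reference_index, _score = match
    return reference[reference_index] - scene[scene_index]

def pose_error(shift):
    """Toy error: mean distance from each shifted scene point to the
    nearest reference point."""
    moved = [s + shift for s in scene]
    return sum(min(abs(m - r) for r in reference) for m in moved) / len(moved)

# The highest-scoring match is deliberately a wrong correspondence.
matches = [(0, 1, 0.95), (0, 0, 0.90), (2, 2, 0.80)]

print(search_pose(matches, estimate_pose, pose_error, error_threshold=0.1))
```

In this toy case the highest-scoring match yields a pose that fails the error test, so the loop falls through to the next highest score. The benefit of ordering by score is that a good pose is usually found after examining only a few candidates.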
In view of the foregoing, it is apparent that a need exists for a scanning system and display for detecting and quantifying similarities or differences between collected data or images obtained from the object of interest and stored data, images, and/or virtual images, or data from a previous scan, and which can operate in a relatively short amount of time and preferably in relative real time. Further, a need exists for a system that allows objects to be scanned without the need, or with a reduced need, for the object to be immobilized with a specialized mold or jig when scanned or for the scanner to be in the same position relative to the object for each scan, thus placing no restrictions on how the object being scanned is positioned relative to the scanner. In addition, a need exists for a system and display that can operate to obtain a one-to-one correspondence between a readily visible skin/surface and underlying structures or pathology detectable by a variety of imaging modalities and which can operate to facilitate clinical correlation, XRT, image guided biopsy or surgical exploration, multimodality or interstudy image fusion, motion correction/compensation, and 3D space tracking.