1. Field of the Invention
The present invention relates to a method and system for determining convergence in an optimization system, and in particular, to methods and systems for determining convergence when registering sets of images.
2. Description of the Related Art
Medical imaging has taken on an ever increasing, if not vital, importance as a component in research and diagnostic applications in current clinical settings. The application of medical imaging can be found in the areas of planning, implementing and evaluating surgical and radio-therapeutical procedures. Imaging modalities generally fall within two categories: anatomical and functional. Anatomical modalities (i.e. those depicting primarily morphology) include, among others, X-ray, computed tomography (CT), Magnetic Resonance Imaging (MRI), Ultrasound (US), portal images and video sequences obtained by various means, such as laparoscopy or laryngoscopy.
It should be noted that derivative techniques can also be detached from the original modalities and may appear under a separate name, such as Magnetic Resonance Angiography (MRA), Digital Subtraction Angiography (DSA), Computed Tomography Angiography (CTA) and Doppler.
Functional modalities depict data or information focused primarily on the metabolism of the underlying anatomy. They include Single-Photon Emission Computed Tomography (SPECT), scintigraphy and Positron Emission Tomography (PET) (which together generally constitute the nuclear medicine imaging modalities) and fMRI (functional MRI), as well as a host of other modalities.
The information acquired from multiple imaging modalities in a clinical setting is normally of a complementary nature. An image set is a collection of related images, usually of the same modality and usually acquired during a single scanning session. A proper integration of the complementary data from the separate image sets is desired, if not required, to extract the most information from them. It should be noted that one image set may have been taken later in time than another, and the time difference may be the only difference between the image sets. This frequently occurs in situations where the growth or reduction in a cell mass is being tracked to determine whether a particular treatment regimen is effective.
The initial step in the integration of data contained in the image sets is to bring the modalities involved into spatial alignment. This procedure is referred to as registration. After registration, a fusion step is generally performed to provide an integrated display of the data present in the image sets.
Generally, in the registration process an image set is used as a reference while a transformation is applied to subsequent image sets in order to align any common subject matter between the image sets to match the reference set. While a variety of image registration methods exist they generally include five basic aspects: defining permissible transformations, selection of the matching features, the specification of an evaluation measure, specification of an optimization strategy and a determination as to when the search for proper alignment has converged adequately enough to be terminated.
Permissible transformations of a subsequent second image set are typically defined by an image registration process in order to specify the anticipated adjustments necessary to align the image sets. A transformation is defined as a set of movements applied to an image set, and may be of a type such as deformable, affine, rigid or perspective. Deformable transformations permit local deformations to the image, for example deforming a cube into a sphere. Affine transformations (in 3D) have twelve degrees of freedom, permitting translations, rotations, skewing and scaling in each of the x-, y- and z-directions. Rigid transformations provide only translations and rotations, for a total of six degrees of freedom in 3D images. Perspective transformations map between images of different dimensions, for example mapping a 3D image onto a 2D image or surface.
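By way of illustration only, the six degrees of freedom of a rigid 3D transformation can be sketched as a 4x4 homogeneous matrix built from three translations and three rotation angles. The function names, the use of radians and the rotation order (x, then y, then z) are assumptions made for this sketch and are not taken from any particular registration system. An affine transformation would instead leave all twelve entries of the upper 3x4 block free.

```python
import math


def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_matrix(tx, ty, tz, rx, ry, rz):
    """Build a 4x4 homogeneous rigid transform from six parameters.

    Angles are in radians. The rotation order (x, then y, then z) is an
    assumption of this sketch; other conventions are equally valid.
    """
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    rot_x = [[1, 0, 0, 0], [0, cx, -sx, 0], [0, sx, cx, 0], [0, 0, 0, 1]]
    rot_y = [[cy, 0, sy, 0], [0, 1, 0, 0], [-sy, 0, cy, 0], [0, 0, 0, 1]]
    rot_z = [[cz, -sz, 0, 0], [sz, cz, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    trans = [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]
    # Rotate first, then translate.
    return matmul(trans, matmul(rot_z, matmul(rot_y, rot_x)))


def apply_transform(matrix, point):
    """Apply a 4x4 homogeneous transform to an (x, y, z) point."""
    x, y, z = point
    v = (x, y, z, 1.0)
    return tuple(sum(matrix[i][k] * v[k] for k in range(4)) for i in range(3))
```

For example, rigid_matrix(1, 2, 3, 0, 0, math.pi / 2) rotates a point a quarter turn about the z-axis and then shifts it by (1, 2, 3).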
In addition, the image registration process typically allows a user to select matching features. The matching features are the image elements that are extracted for comparison. Generally these are subdivided into voxel-based or feature-based. Voxel-based matching uses the voxel (or pixel) gray-level intensity values for comparison. Feature-based matching uses a higher-level image processing technique to extract some element of the image, for example edges.
The image registration process also typically defines an evaluation measure to determine the closeness of the match between two images. Several strategies exist for this measure, and these depend on the image features being used for the match. While many features and measures exist, one of the most popular is a measure of mutual information (“MI”) to determine the closeness of fit between the gray-level intensities of two images. Some of the advantages of MI over other strategies are that it is robust, fast and can work with images that have different gray-level intensity mappings, such as those found in cross modality medical imaging. Cross modality medical imaging relates to processing medical images from multiple acquisition modes. For example, MI can be used for evaluation when comparing magnetic resonance images (MRI) to computed tomography (CT) images.
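As a minimal sketch of such a measure, MI can be estimated from a joint histogram of the gray levels at corresponding positions in the two images. The function name, the bin count and the assumed intensity range of 0 to 255 are illustrative choices only, and practical systems typically use more elaborate estimators.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8, max_value=256):
    """Estimate mutual information (in nats) between two intensity lists.

    `a` and `b` hold gray levels sampled at corresponding positions.
    The histogram binning is an assumption of this sketch.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    width = max_value / bins
    qa = [min(int(v / width), bins - 1) for v in a]
    qb = [min(int(v / width), bins - 1) for v in b]
    n = len(qa)
    joint = Counter(zip(qa, qb))  # joint histogram counts
    count_a = Counter(qa)         # marginal counts for image a
    count_b = Counter(qb)         # marginal counts for image b
    mi = 0.0
    for (i, j), c in joint.items():
        # p_ij * log(p_ij / (p_i * p_j)), written in terms of counts
        mi += (c / n) * math.log(c * n / (count_a[i] * count_b[j]))
    return mi
```

Because the estimate depends only on how consistently gray levels co-occur, an image and its intensity-inverted copy score as highly as two identical images, which is the property that makes MI attractive for cross modality comparison.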
Since it is prohibitively time intensive to examine all possible combinations of transformations between two image sets, it is desirable for an image registration system to intelligently constrain the number of transformations performed. An optimization strategy determines the next transformation to apply in order to better align the images during a subsequent iteration. A good optimization technique will result in quick movement towards an optimal alignment. Many methods exist, examples of which are Powell's method and steepest ascent, discussed in Maes, F., et al., “Multimodality image registration by maximization of mutual information”. IEEE Transactions on Medical Imaging, 1997. 16(2): p. 187–198 and Wells, W. M. I., et al., “Multi-modal volume registration by maximization of mutual information”. Medical Image Analysis, 1996. 1(1): p. 35–51, respectively, incorporated by reference herein.
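For illustration, a steepest ascent strategy can be sketched as repeatedly stepping the transformation parameters along a numerically estimated gradient of the evaluation measure. The step size, the finite-difference spacing, the fixed iteration count and the toy measure below are all assumptions of this sketch. It is not the specific procedure of either cited reference.

```python
def steepest_ascent(measure, start, step=0.1, eps=1e-4, iterations=200):
    """Maximize `measure` over a parameter list by finite-difference ascent.

    Runs for a fixed number of iterations (an assumption of this sketch).
    """
    params = list(start)
    for _ in range(iterations):
        grad = []
        for i in range(len(params)):
            hi = params[:]
            lo = params[:]
            hi[i] += eps
            lo[i] -= eps
            # Central difference estimate of the partial derivative.
            grad.append((measure(hi) - measure(lo)) / (2 * eps))
        # Move each parameter uphill along its gradient component.
        params = [p + step * g for p, g in zip(params, grad)]
    return params


def toy_measure(params):
    """Hypothetical stand-in for an evaluation measure, peaking at (3, -1)."""
    tx, ty = params
    return -((tx - 3.0) ** 2) - (ty + 1.0) ** 2
```

Calling steepest_ascent(toy_measure, [0.0, 0.0]) moves the two parameters towards the peak of the toy measure. A real evaluation measure would be far less smooth than this example.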
A successful image registration process also needs to determine when the alignment search has converged adequately enough to terminate the search. Existing systems often employ one of two strategies. The first strategy requires setting the number of iterations to a fixed value that is large enough to ensure convergence. Problems with this strategy are that setting too large a value results in slow convergence, while setting the value too low results in a loss of robustness, as some data sets may not converge. The second strategy examines one or more parameters, usually the evaluation measure, and determines when it has converged. This is the approach used by Powell's method, which terminates after the step size of the evaluation measure falls below some threshold (t). A problem with this strategy is that the evaluation measure may be “noisy” (a graph of the evaluation measure over time does not follow a smooth path), especially in the case of stochastic approximation of the mutual information. It then becomes difficult to determine convergence without a large windowed smoothing function. Another problem is that the evaluation measure often gets trapped in local optima, resulting in a false determination of convergence. For example, a graph of the evaluation measure may have one or more small areas with little change that may be misinterpreted by the system as indicative of convergence. Another method uses a measure that combines MI and gradient information. This method is described in detail in Pluim, J. P. W., J. B. A. Maintz, and M. A. Viergever, “Image registration by maximization of combined mutual information and gradient information”. IEEE Transactions on Medical Imaging, 2000. 19(8): p. 809–814, incorporated by reference herein in its entirety.
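The second strategy, and the effect of windowed smoothing on it, can be sketched as follows. The function names, the moving-average smoother and the example values are assumptions made for illustration only and do not reproduce the termination test of any cited method.

```python
def raw_change_converged(history, threshold):
    """Declare convergence when the latest change in the measure is small."""
    if len(history) < 2:
        return False
    return abs(history[-1] - history[-2]) < threshold


def moving_average(values, window):
    """Return the trailing moving average over `window` samples."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]


def smoothed_change_converged(history, threshold, window):
    """Apply the same test to a windowed average of the measure."""
    if len(history) < window + 1:
        return False
    smooth = moving_average(history, window)
    return abs(smooth[-1] - smooth[-2]) < threshold
```

With the hypothetical history [0.1, 0.2, 0.2, 0.5, 0.9], the raw test reports convergence after the third value even though the measure later keeps rising, whereas the smoothed test with a window of two does not. Smoothing only delays such false determinations; a sufficiently long flat region still satisfies either test.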
The prior art and conventional wisdom have failed to provide a method that is easy to utilize while facilitating diverse convergence criteria for myriad users. The prior art is further deficient in that it focuses on only a few of the components that can indicate convergence while omitting many parameters that may not have converged or converged sufficiently.
The prior art is further deficient in that it does not take into consideration the interplay between different parameters, and instead addresses the problem by avoiding it. For instance, the prior art may set a bound on the translation movement without quantifying, or attempting to quantify, the convergence criteria for rotation or skewing. Depending upon the initial selection, the other parameters, which at the outset were thought to be minor, can have a significant effect on convergence prediction and may even dominate it.