A variety of medical imaging techniques are known, including magnetic resonance (MR) imaging, X-ray computed tomography (CT), radionuclide imaging, optical imaging, and ultrasound (US). Other imaging techniques may be developed in the future. These imaging techniques may produce a two-dimensional (2D) array of pixels (a conventional image) or a three-dimensional (3D) array of voxels, which conceptually represent slices through a physical object. Each pixel or voxel is assigned a value or “intensity” related to one or more physical properties of tissue at a particular point, peculiar to the particular imaging method used. The term “image” as used herein encompasses both 2D and 3D data sets unless the context indicates otherwise.
In some situations, it is desirable to be able to perform multimodal image registration, i.e. aligning images of the same body region but obtained through different imaging techniques. This is often highly challenging due to the large differences in the intensity characteristics between images obtained using different imaging techniques. In addition, fundamental differences between the underlying physics and image formation processes peculiar to each imaging method may also give rise to modality-specific artefacts. A further problem is that for a deformable structure, which includes most of the soft tissue organs of the body, physical deformations and motion with respect to neighbouring structures may occur between imaging sessions. These effects further complicate the problem of image registration.
One well-known approach to image registration involves so-called intensity-based algorithms, such as those which seek to maximise information-theoretic similarity measures. These techniques implicitly assume a probabilistic relationship between the intensities in one image and those in the corresponding regions of another image for mapping one intensity map to another. However, this assumption is often not reliable in a situation where different imaging methods that exploit different physical properties are used to obtain an image of the same anatomical region.
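As an illustration of the information-theoretic similarity measures mentioned above, the following minimal sketch estimates the mutual information between the intensities of two already-overlaid images from their joint histogram. The function name, the bin count and the assumed intensity range are illustrative choices and are not taken from any of the cited methods.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8, max_value=256):
    """Estimate mutual information (in nats) between two intensity lists.

    `a` and `b` hold the intensities of corresponding pixels in two images.
    `bins` and `max_value` are illustrative assumptions about quantisation.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    width = max_value / bins
    # Quantise each intensity into one of `bins` histogram bins.
    qa = [min(int(v / width), bins - 1) for v in a]
    qb = [min(int(v / width), bins - 1) for v in b]
    n = len(a)
    joint = Counter(zip(qa, qb))  # joint histogram counts
    pa = Counter(qa)              # marginal counts for image a
    pb = Counter(qb)              # marginal counts for image b
    mi = 0.0
    for (i, j), count in joint.items():
        p_ij = count / n
        # p_ij / (p_i * p_j) simplifies to count * n / (pa[i] * pb[j]).
        mi += p_ij * math.log(count * n / (pa[i] * pb[j]))
    return mi
```

The measure is high whenever one intensity predicts the other, even if the contrast is inverted, which is why it is attractive for multimodal data. It is near zero when no such probabilistic relationship holds, which is the failure case described above.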
In an alternative approach, commonly referred to as feature-based registration, the input images are first reduced to simpler geometric representations (such as a set of points or surfaces) and these geometric representations are then registered with one another. This approach typically involves identifying corresponding features, such as anatomical landmark points, tissue boundaries, etc, in each image. The process of extracting features from image data, known as image segmentation, can be performed using segmentation software and may in some cases involve little or no user interaction. However, in many other cases, the segmentation must be performed manually by an expert observer. Therefore, the feature-based approach to registration is often impractical if computer-based automatic segmentation methods are unavailable or fail, or if manual segmentation of at least one of the images is prohibitively time-consuming and labour-intensive.
The reliance on feature-based image registration is a particular problem in time-critical applications, such as image-guided surgery, since images obtained during such a procedure are typically of much poorer quality than those obtained outside the surgical setting. These images are therefore very often difficult to segment automatically or within a clinically acceptable timescale (i.e. seconds to a few minutes).
Since ultrasound imaging is safe, non-invasive, inexpensive, portable and widely available in hospitals, it is used routinely to provide real-time surgical guidance during a wide range of medical procedures. However, there is currently a pressing clinical need for multimodal image registration methods that enable ultrasound images to be accurately registered with other types of image. Such registration would enable accurate guidance of many procedures by visually augmenting ultrasound images with anatomical and pathological information derived from diagnostic quality images (especially MR and X-ray CT images). This information includes the location of pathology (e.g. a cancerous tumour) or organs that are not visible in the ultrasound images obtained during a procedure (for example, because they are poorly visualised or lie outside the field-of-view of the image), or a representation of a treatment or biopsy sampling plan that has been defined using information derived from images acquired specifically for the purposes of disease diagnosis or surgical planning, combined with diagnostic information from other sources.
If multimodal image registration can be performed accurately, the location of a tumour identified in an MR image, for example, can be displayed superimposed on ultrasound images ordinarily obtained during a surgical procedure for the purposes of guiding surgical instruments. This aids the clinician by providing visual information on the location of the tumour relative to the current position of surgical instruments, so that tissue biopsy samples can be collected from precise locations to confirm a diagnosis, or an intervention to treat the tumour can be performed with sufficient accuracy that the tissue within a region that encloses the tumour plus a pre-defined surgical margin is destroyed or removed. However, if the diagnostic image information is not accurately aligned with intra-procedural images, errors may be introduced that limit the accuracy of the biopsy as a diagnostic test or that can severely limit clinical efficacy of the intervention. In practice, such errors include: inaccurate placement of biopsy needles, failure to remove an adequate margin of tissue surrounding a tumour such that malignant cancer cells are not completely eradicated from the organ, and unnecessary damage to healthy tissue with an elevated risk of side-effects related to the procedure in question.
Unfortunately, standard intensity-based multimodal registration algorithms are known to perform poorly with ultrasound images, largely due to high levels of noise, relatively poor soft-tissue contrast and artefacts typically present in clinical ultrasound images. Furthermore, image segmentation is challenging for the same reasons and therefore the use of many feature-based registration approaches is precluded for most clinical applications.
Several authors have investigated a hybrid registration technique, variously known as surface-to-image registration, feature-to-image registration, model-to-image registration, or model-to-pixel registration. In this approach, a geometric representation of the organs of interest is generated by segmenting a reference image to extract features, such as surface boundaries, tubular structures, etc, in the same way as traditional feature-based approaches. However, unlike the feature-based method, these features are matched directly to the pixel/voxel intensity values of a second image, which has not been segmented explicitly, but may have been processed in some way, for instance, to enhance certain features, such as boundaries. This process is normally achieved by minimising a mathematical cost function to determine a transformation that provides the best alignment between the features from the first image and the intensity values of the second image.
The most extensively investigated example of the above technique is the so-called active shape model developed by Cootes et al. 1995. In this method the geometric model is represented as a statistical shape model which deforms iteratively to fit to the boundary of an object in an unseen image. A closely related method is the so-called active appearance model, see Cootes et al. 1998 and Beichel et al. 2005. In this method, the statistical variation in the image intensity (or appearance) in the local region of the surface of a statistical shape model is incorporated into the model at the training phase. This information is then used to match the shape model to an object boundary in an unseen image by maximising a measure of the similarity between the local intensity characteristics in the image around points on the deforming boundary and the corresponding intensity variation learnt by the active appearance model. One such measure is the sum-of-squared differences, for which a smaller value indicates a closer match. Both active shape and active appearance models have been applied successfully to a wide range of image analysis problems in computer vision and medical imaging, particularly image classification, image segmentation, and image registration. However, both methods are known not to work well when the unseen image is corrupted in some way such that object boundaries are occluded or the intensity characteristics of the unseen image differ substantially from the images used to train the model. This situation is very common in medical image applications, particularly during image-guided interventions where (unseen) images obtained during an intervention are typically noisy, contain artefacts, and include medical instruments introduced into the patient.
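The local matching step described above can be sketched as follows. The sketch assumes, purely for illustration, that the intensities sampled along a line through a boundary point are held in a list and that the learnt appearance is a single mean profile. The function names are hypothetical. Because the sum-of-squared differences measures dissimilarity, the best match is the offset that minimises it.

```python
def ssd(a, b):
    """Sum-of-squared differences between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_profile_offset(samples, learnt_profile):
    """Slide the learnt profile along the sampled intensities.

    Returns (offset, cost) for the position with the lowest SSD.
    """
    k = len(learnt_profile)
    if k == 0 or len(samples) < k:
        raise ValueError("samples must be at least as long as the profile")
    costs = [ssd(samples[i:i + k], learnt_profile)
             for i in range(len(samples) - k + 1)]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]


# Intensities sampled along a line through a candidate boundary point,
# compared with a (hypothetical) learnt dark-to-bright profile.
offset, cost = best_profile_offset([10, 12, 11, 30, 80, 95, 96], [28, 82, 94])
```

If the unseen image is occluded, or its intensities differ substantially from the training data, no offset yields a low cost, which corresponds to the failure mode noted above.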
There are also many situations where, due to noise, artefacts and variability between patients, the variation in image intensity around points on the boundary of an object in a reasonably-sized set of training images is too wide for meaningful parametric statistical measures to be determined. In this case, the assumptions of the active appearance model method may break down.
Shao et al. 2006 describe one example of the above technique, which is used for aligning MR images of the pubic arch with US images obtained via a trans-rectal ultrasound (TRUS) probe. This technique involves manually identifying a bone surface in an MR image. A rigid transformation is then identified to align this surface with the US image, based on image properties such as regions of high intensity or the image intensity gradient.
Aylward et al. 2003 describe a model-to-image method for the registration and analysis of vascular images. The method includes using centre-line tracking to build a model of a vascular network from a first image, such as an MR image. This model is then subjected to a rigid transformation to align the model with a second image, such as an US image, on the assumption that centre-line points in the model correspond to bright lines in the image. Aylward et al. go on to investigate the impact of non-rigid deformations on this approach.
Wu et al. 2003 describe a model-to-pixel registration approach for prostate biopsy. The authors use a genetic algorithm (GA) that operates on a statistical model of the prostate boundary to evolve a population of 2D boundaries for prostate that are then matched to a gradient map from a US image. Each candidate (individual) in the GA corresponds to a specific rigid-body transformation and the better the match with the US gradient image, the higher the fitness of that individual. It is contemplated that the individuals could also include parameters to permit deformation (non-rigid transformation), or alternatively such deformation could be added as a final step onto the best-fit rigid registration.
King et al. 2001 describe the registration of preoperative MR or CT images with an intraoperative US image for liver treatment. A statistical shape model is derived by segmenting multiple MR scans and determining a mean surface shape and modes of variation. The modes of variation are then restricted to a single parameter representative of changes caused by the breathing cycle. This model was then registered to the US image by way of (i) a rigid transformation, and (ii) a non-rigid transformation representative of organ deformation due to breathing. A probabilistic (Bayesian) model is used to perform this registration based on summing the image intensity over the (transformed) model surface.
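A statistical shape model of the general kind described above (a mean shape plus modes of variation, here restricted to a single parameter) can be sketched as follows. This is not the implementation of King et al. Each training shape is assumed to be a flattened list of corresponding point coordinates, and the dominant mode is found by simple power iteration on the sample covariance. All function names are illustrative.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dominant_mode(shapes, iterations=100):
    """Return (mean shape, unit mode vector, variance along that mode)."""
    n, dim = len(shapes), len(shapes[0])
    if n < 2:
        raise ValueError("at least two training shapes are required")
    mean = [sum(s[k] for s in shapes) / n for k in range(dim)]
    devs = [[s[k] - mean[k] for k in range(dim)] for s in shapes]
    # Start from the largest deviation so the start vector lies in the data span.
    v = max(devs, key=lambda d: dot(d, d))
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        return mean, [0.0] * dim, 0.0  # all training shapes are identical
    v = [x / norm for x in v]
    for _ in range(iterations):
        # Multiply by the (unnormalised) covariance: w = sum_i d_i (d_i . v).
        w = [0.0] * dim
        for d in devs:
            proj = dot(d, v)
            for k in range(dim):
                w[k] += proj * d[k]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    variance = sum(dot(d, v) ** 2 for d in devs) / (n - 1)
    return mean, v, variance


def synthesise(mean, mode, variance, b):
    """Generate a shape `b` standard deviations along the mode."""
    scale = b * math.sqrt(variance)
    return [m + scale * u for m, u in zip(mean, mode)]
```

With a single parameter `b`, fitting the model to an image reduces to searching over the rigid pose and one scalar shape weight.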
Other approaches to US-based registration have been proposed, see especially Roche et al., 2001; Penney et al., 2004/2006; Zhang et al. 2007; and Wein et al. 2008. However, to date these have been demonstrated only for a few organs and for specialised applications, and rely on automatically converting at least one of the images into a form that is more amenable to performing a registration using established intensity-based methods. This conversion step is not trivial in many circumstances, and these alternative approaches have yet to be demonstrated for many medically significant applications, such as image-guided needle biopsy of the prostate gland and image-guided surgical interventions for the treatment of prostate cancer.
US 2003/015611 describes geometric models which are represented using medial atoms—a so-called “medial representation” or “m-rep”. A method is described for registering an m-rep to an image by numerically optimising a local grey level intensity-based similarity measure, computed in the region of the m-rep surface.
WO 2009/052497, also specific to m-reps, describes a method for non-rigidly registering an m-rep model of an organ, derived from one image, to a second image. As discussed above, a typical scenario is when the model is derived from an image used for planning a surgical intervention, whereas the second (target) image is acquired during that intervention and the organ of interest has deformed between the times when the images were acquired. Finite element modelling is used to predict soft-tissue deformation and, more specifically, to provide training data for a statistical shape model. The model-to-image method is based on active appearance modelling as outlined above. Principal component analysis is applied to represent the statistical variation in image intensity in the local region of a model boundary in a linear form and, as in classical active appearance models, this information is then used to fit the model surface to the target image. However, this approach assumes that the intensity variation at corresponding locations across different training images adopts a Gaussian distribution, which may not be the case, particularly for interventional images.
Various computational models of organ motion for medical image registration have been proposed. For example, WO 2003/107275 describes the use of physiological models of organ motion due to respiration and cardiac motion to predict deformation between organs in two images that are subsequently registered non-rigidly, with a focus on the problem of registering PET and CT images. The motion models considered are based on deforming non-uniform rational B-spline (NURBS) representations of organ surfaces and are not statistical in nature. The geometric model is created by segmenting both of the images to be registered, which is potentially problematic for surgical applications.
WO 2007/133932 discloses a method for the deformable registration of medical images for radiation therapy. Again, all input images must be segmented. In this approach, landmarks are identified in the images prior to registration (rather than performing a direct model-to-image registration).
A more general deformable image registration method is disclosed in WO 2008/041125, in which variations in the non-rigid behaviour of different parts of an image (for example, corresponding to different tissue types or mechanical discontinuities between tissue boundaries) may be accounted for by spatially varying the “flexibility” and/or non-Gaussian smoothing applied during registration.
Prostate cancer is a major international health problem, particularly affecting men in the Western World. Traditional treatment strategies involve either radical treatment of the whole gland—for example, by surgical excision or using radiotherapy—or pursuing an active surveillance/watchful waiting programme in which intervention is delayed in favour of monitoring the patient for signs of disease progression. Alternative minimally-invasive interventions for prostate cancer, such as brachytherapy, cryotherapy, high-intensity focused US, radiofrequency ablation, and photodynamic therapy are also now available, but the clinical efficacy of most of these treatment approaches has yet to be fully established through randomised controlled trials.
Up to 70% of patients treated for prostate cancer experience long term side-effects (principally sexual dysfunction and incontinence) caused by damaging the bladder, rectum, and/or the neurovascular bundles. Motivated by the potential for a reduced risk of side-effects compared with conventional treatments, there has recently been growing interest in techniques which enable the targeted treatment of prostate cancer in an effort to minimise damage to vulnerable structures, Ahmed et al. 2008. This has led to interest in alternative treatment strategies, such as ‘focal therapy’, in which small volumes of the prostate (rather than the whole gland) are treated. It is anticipated by its clinical proponents that this will lead to a significant reduction in side-effects without compromising the therapeutic benefits of the treatment. Treatment costs should also be reduced as treatment times and hospital stays are much shorter. However, such targeted treatment approaches rely on accurate 3D mapping of cancer based on histological analysis of tissue samples obtained using needle biopsy and MR imaging.
Trans-rectal ultrasound (TRUS) imaging remains the most accessible and practical means for guiding needle biopsy and therapeutic interventions for prostate treatment. However, conventional (so-called ‘B-mode’) TRUS imaging is two-dimensional and typically provides very limited information on the spatial location of tumours due to the poor contrast of tumours with respect to normal prostatic tissue. Although there is some evidence that the use of microbubble contrast agents can improve the specificity and sensitivity of tumour detection, this method is not widely used and performing accurate, targeted biopsy and therapy using TRUS guidance alone is difficult in practice, particularly for the inexperienced practitioner. An alternative approach is to use preoperative MR images, which are registered to the TRUS images during a procedure, in order to accurately target tumours. Indeed, recent advances in functional and structural MR imaging techniques for localising and characterising prostate cancer have led to sensitivities and specificities that are now sufficiently high to be clinically useful for targeting localised therapy, Kirkham et al. 2006. However, the ability to accurately fuse anatomical and pathological information on tumour location, derived from MR images or a previous biopsy procedure, with TRUS images obtained during a procedure remains a significant technical challenge, mainly due to the differences in intensity between MR and TRUS images, which frustrate standard registration methods, as well as the significant deformation that occurs between the imaging sessions.
Morgan et al. 2007 describe various techniques for the registration of pre-procedure MR images to intra-procedure US images, especially for guiding minimally-invasive prostate interventions. One technique is based on a form of feature registration, in which for both the MR and US image data, contours of the capsule surface of the prostate are manually drawn on a series of slices of the US image, and the apex and base points, which correspond to the entrance and exit of the urethra at the ends of the prostate gland, are manually identified. An image registration is then performed by finding a rigid transformation that minimises the cost of mapping from the apex points and the mid-band surface (as represented by a set of points on the surface) from one image to the apex points and mid-band surface of the other image.
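The idea of finding a rigid transformation that minimises a mapping cost between two point sets can be illustrated with the following simplified sketch. It assumes 2D points with known one-to-one correspondences and a sum-of-squared-distances cost, for which a closed-form solution exists. These are simplifying assumptions for illustration only and do not reproduce the cost function of Morgan et al.

```python
import math


def apply_rigid_2d(points, theta, t):
    """Rotate points by `theta` (radians) about the origin, then translate by `t`."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in points]


def fit_rigid_2d(src, dst):
    """Least-squares rotation and translation mapping `src` onto `dst`.

    Assumes src[i] corresponds to dst[i]. Returns (theta, (tx, ty)).
    """
    n = len(src)
    if n == 0 or n != len(dst):
        raise ValueError("point sets must be non-empty and of equal size")
    cxs = sum(p[0] for p in src) / n
    cys = sum(p[1] for p in src) / n
    cxd = sum(p[0] for p in dst) / n
    cyd = sum(p[1] for p in dst) / n
    s_dot = s_cross = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        ax, ay = xs - cxs, ys - cys  # centred source point
        bx, by = xd - cxd, yd - cyd  # centred target point
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)  # optimal rotation angle
    c, s = math.cos(theta), math.sin(theta)
    # The translation maps the rotated source centroid onto the target centroid.
    tx = cxd - (c * cxs - s * cys)
    ty = cyd - (s * cxs + c * cys)
    return theta, (tx, ty)
```

In practice the correspondences between surface points are not known in advance, so a scheme of this kind would typically be embedded in an iterative procedure that alternates between estimating correspondences and re-fitting the transformation.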
Because of the long time required for contouring the US image during a surgical procedure, Morgan et al. also utilise a gradient-based, feature-to-image registration procedure. Using this method, an MR image is first segmented to extract the capsule surface of the prostate gland. Registration is performed by aligning MR surface normal vectors with gradient vectors of the TRUS image, calculated using Gaussian derivative filters, such that a cost function is minimised. However, this approach was found to produce less accurate image registration, especially if the prostate gland has deformed significantly between the MR and US images. Much of this deformation is caused by the presence of the TRUS probe, which is always inserted into the rectum during US imaging, or an endorectal coil, which is sometimes used during MR imaging.
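A gradient-based feature-to-image cost of this general kind can be sketched as below. The sketch makes several simplifying assumptions that are not taken from Morgan et al.: the image is a 2D list of rows, gradients are estimated by central differences rather than Gaussian derivative filters, the surface normals are assumed to point from dark to bright tissue, and the only transformation searched is an integer horizontal shift. All names are illustrative.

```python
def gradient_at(image, x, y):
    """Central-difference intensity gradient at interior pixel (x, y)."""
    gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
    gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
    return gx, gy


def alignment_cost(image, points, normals, dx=0):
    """Negative sum of dot products between model normals and image gradients.

    `points` are model surface points, `normals` their unit normals, and `dx`
    a candidate horizontal shift of the model. Lower cost is better alignment.
    """
    total = 0.0
    for (x, y), (nx, ny) in zip(points, normals):
        gx, gy = gradient_at(image, x + dx, y)
        total += nx * gx + ny * gy
    return -total


def best_shift(image, points, normals, candidates):
    """Exhaustively pick the candidate shift with the lowest cost."""
    return min(candidates, key=lambda dx: alignment_cost(image, points, normals, dx))


# Synthetic image with a soft vertical edge centred on column 4.
row = [0, 0, 0, 0, 0.5, 1, 1, 1]
image = [row[:] for _ in range(5)]
points = [(1, 1), (1, 2), (1, 3)]   # model boundary, initially at column 1
normals = [(1.0, 0.0)] * 3          # normals pointing in +x
shift = best_shift(image, points, normals, range(0, 6))
```

Because the cost depends only on local gradients, a rigid search of this kind cannot account for deformation of the organ between the two images, which is consistent with the reduced accuracy noted above.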
WO 00/14668 describes the construction of a 3D probability map of prostate cancer location, based on an analysis of computer reconstructions of excised prostate gland specimens. One intended use of these models is to direct ultrasound-guided prostate biopsy to maximise the probability of detecting cancer. To achieve this, registration of a geometric model containing the probability map to ultrasound images acquired during biopsy is required. A feature-based registration method is proposed, which requires segmentation of the prostate gland in the target, i.e. ultrasound, image to provide a patient-specific target model to which the (generic) probabilistic model is then registered by fitting the model surfaces.
WO 2008/122056 discloses an image-based method for the delivery of photodynamic therapy (PDT) for the treatment of prostate cancer and uses deformable registration of two images to deliver, monitor, and evaluate PDT. The registration method involves non-rigidly registering organ surfaces, segmented from each image, and using a finite element model or thin-plate spline model to interpolate the tissue displacement inside the organ. In the case of the finite element model, the displacement of the surface is used to set the boundary conditions for a finite element simulation given assumed mechanical properties for tissue. Again, this approach requires prior segmentation of both input images.
U.S. Pat. No. 5,810,007 discloses a method for registering ultrasound and x-ray images of the prostate for radiation therapy. This method requires the implantation of spherical fiducial markers to act as landmarks, which are subsequently rigidly aligned.
In a recent paper, Xu et al. (2008) state: “Currently, there is no fully automatic algorithm that is sufficiently robust for MRI/TRUS [Transrectal Ultrasound] image registration of the prostate”.