The full scientific, medical and educational value of multidimensional, multimodality imaging remains largely unexplored, and its realization has been impeded primarily by inadequate capabilities for accurate and reproducible registration and segmentation of multiple 2-D or 3-D images. With regard to registration, 2-D tomographic images taken at different times cannot be guaranteed to represent the same spatial section of the patient, and the 2-D images themselves do not contain the information required to either measure or correct the misregistration. The use of 3-D images makes full six-degree-of-freedom registration possible not only for images taken at different times, but also for images from different modalities.
A general approach to registration of multiple images has three steps: 1) defining corresponding features between the different image data sets, 2) finding the matching transformation, and 3) transforming one (or more) of the images to bring it (them) into spatial registration with another. External and internal anatomical landmarks have been used by several researchers as matching features within multimodal images (Evans). Accurately locating these landmarks across modalities is an inherently manual process, presenting significant difficulty even to highly trained experts. Fiducial markers introduced at scanning time have been used for multimodal image registration with some success, but this technique requires an immobilizing frame that is not removed between scans, and is therefore not applicable to serial studies. Complex moments calculated for each image data set can be used as matching features, but exactly corresponding subvolumes must be defined. Matching of surfaces extracted from common objects (Pelizzari) shows greater potential for automated registration of serial as well as multimodal images, but many proposed techniques can accurately measure misregistration only in a limited set of geometrically tractable anatomical situations. Most suffer from the classical problems associated with a global minimum search, such as entrapment in local minima, and usually require manual intervention.
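The three steps above can be illustrated for the simplest case, in which the corresponding features are paired landmark points. The following is a minimal sketch, not the method of any of the cited authors: it assumes the landmarks have already been located and paired (step 1), estimates a rigid transformation by a least-squares fit based on the singular value decomposition (step 2), and applies it to the landmark coordinates (step 3). The function names `fit_rigid_transform` and `apply_rigid_transform` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_rigid_transform(src, dst):
    """Least-squares rigid fit: find R, t such that R @ src[i] + t ~ dst[i].

    src, dst: (N, 3) arrays of paired landmark coordinates (N >= 3,
    not all collinear). Pairing is assumed to be known in advance.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)

    # 3x3 cross-covariance of the centered landmark sets.
    H = (src - src_centroid).T @ (dst - dst_centroid)
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so that R is a proper rotation
    # (determinant +1) rather than a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_centroid - R @ src_centroid
    return R, t


def apply_rigid_transform(points, R, t):
    """Map (N, 3) points through the rigid transformation x -> R @ x + t."""
    return np.asarray(points, dtype=float) @ R.T + t


# Step 1 (assumed done): four paired landmarks in the first image.
landmarks_a = np.array([[0.0, 0.0, 0.0],
                        [10.0, 0.0, 0.0],
                        [0.0, 20.0, 0.0],
                        [0.0, 0.0, 30.0]])

# Synthetic second image: the same landmarks rotated 10 degrees about z
# and shifted, standing in for a second scan of the same rigid object.
angle = np.deg2rad(10.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle), np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([5.0, -3.0, 2.0])
landmarks_b = apply_rigid_transform(landmarks_a, R_true, t_true)

# Step 2: estimate the matching transformation from the landmark pairs.
R_est, t_est = fit_rigid_transform(landmarks_a, landmarks_b)

# Step 3: bring the first landmark set into registration with the second.
registered = apply_rigid_transform(landmarks_a, R_est, t_est)
residual = np.linalg.norm(registered - landmarks_b, axis=1).max()
```

With noise-free synthetic landmarks the residual is zero to within rounding error. With manually located landmarks it would reflect the localization error discussed above. A closed-form fit of this kind avoids an iterative search, but only because the correspondences are given; surface matching has no such pairing and must search for it.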
In searching for the matching transformation, rigid body motion is often assumed in order to simplify the process. A rigid body may be defined in classical mechanics as a system of mass points subject to the holonomic constraints that the distances between all pairs of points remain constant throughout the motion. Such a body has three translational and three rotational degrees of freedom and, once these six parameters are known, any rigid body motion is fully determined. In order to find the ideal transformation, the images must first be normalized to the same scale. If the voxel dimensions are not known, then the scale factor must be determined as part of the transformation process. Applying the best transformation to one image will often involve interpolation of data values. Generally, trilinear interpolation will give a satisfactory result. Non-linear gray level interpolation may be desired if the original image has a large inter-slice distance.
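A short sketch may make the six parameters and the interpolation step concrete. It is an illustration under stated assumptions rather than a definitive implementation: the rotation order (about x, then y, then z), the convention that positions outside the volume sample as zero, and the names `rigid_matrix`, `trilinear` and `resample` are all assumptions introduced here. Known voxel dimensions are used to convert voxel indices to physical coordinates, which plays the role of the scale normalization described above.

```python
import numpy as np


def rigid_matrix(rx, ry, rz, tx, ty, tz):
    """4x4 homogeneous matrix from three rotations (radians) and three
    translations. Rotation order x, then y, then z is an assumed convention."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    M = np.eye(4)
    M[:3, :3] = Rz @ Ry @ Rx
    M[:3, 3] = [tx, ty, tz]
    return M


def trilinear(volume, x, y, z):
    """Sample a 3-D array at fractional index (x, y, z) by trilinear
    interpolation. Positions outside the grid return 0.0 (assumed convention).
    Each dimension of the volume must have at least two samples."""
    nx, ny, nz = volume.shape
    if not (0 <= x <= nx - 1 and 0 <= y <= ny - 1 and 0 <= z <= nz - 1):
        return 0.0
    # Lower corner of the enclosing cell, clamped so the cell stays in bounds.
    x0 = min(int(np.floor(x)), nx - 2)
    y0 = min(int(np.floor(y)), ny - 2)
    z0 = min(int(np.floor(z)), nz - 2)
    fx, fy, fz = x - x0, y - y0, z - z0
    value = 0.0
    # Weighted sum over the eight corners of the cell.
    for dx, wx in ((0, 1 - fx), (1, fx)):
        for dy, wy in ((0, 1 - fy), (1, fy)):
            for dz, wz in ((0, 1 - fz), (1, fz)):
                value += wx * wy * wz * volume[x0 + dx, y0 + dy, z0 + dz]
    return float(value)


def resample(volume, matrix, voxel_size=(1.0, 1.0, 1.0)):
    """Apply a rigid transformation to a volume on its own grid.

    matrix maps physical coordinates of the input into the output, so each
    output voxel is filled by mapping its position back through the inverse
    and interpolating the input there."""
    vs = np.asarray(voxel_size, dtype=float)
    inverse = np.linalg.inv(matrix)
    out = np.zeros(volume.shape, dtype=float)
    for index in np.ndindex(*volume.shape):
        physical = np.append(np.array(index) * vs, 1.0)
        source = (inverse @ physical)[:3] / vs
        out[index] = trilinear(volume, *source)
    return out


# Usage: shift a single bright voxel by one voxel along the first axis.
volume = np.zeros((5, 5, 5))
volume[2, 2, 2] = 1.0
shifted = resample(volume, rigid_matrix(0, 0, 0, 1.0, 0, 0))
```

The explicit loop keeps the correspondence with the text visible but is slow for clinical volume sizes; a vectorized or library resampling routine would normally be used instead. The sketch does not cover the non-linear gray level interpolation mentioned for images with large inter-slice distances.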
The process of forming an image involves the mapping of an object, and/or some property of an object, into or onto what may be called "image space." This space is used to visualize the object and its properties, and may be used to quantitatively characterize its structure and/or its function. Imaging science may be defined as the study of these mappings and development of ways to better understand them, to improve them and to productively use them. An important challenge of imaging science is to use multi-spectral image measurements (i.e. multimodality images) to provide advanced capabilities for visualization and quantitative analysis of biomedical images in order to significantly improve faithful extraction of both the scientific and clinical information which they contain, and thereby increase their usefulness in basic science, medical diagnosis and clinical treatment.
The spatial registration of medical images obtained from several modalities, such as positron emission tomography (PET), single photon emission computed tomography (SPECT), nuclear magnetic resonance (MR) imaging, computer-assisted tomography (CT) and ultrasound (US), has become clinically important. The revolutionary capabilities provided by these 3-D and even 4-D medical imaging modalities now allow direct visualization and study of structure and function of internal organs in situ. However, the ability to extract objective and quantitatively accurate information from 3-D biomedical images has not kept pace with the ability to produce the images themselves. The challenge is to provide advanced interactive and quantitative methods to visualize, extract and measure the intrinsic and relevant information contained in the data produced by these imaging systems; that is, the true morphologic, pathologic, biologic, physiologic, and/or metabolic "meaning" of the numbers. Capabilities to accurately synthesize or "fuse" the complementary image data produced by these multiple modalities into composite image sets could greatly facilitate realization of this potential.