1. Field of the Invention
The invention is directed to an imaging apparatus and methods for performing assessment and monitoring with interpreted imaging. Embodiments of the invention are particularly useful in surgery, clinical procedures, tissue assessment, diagnostic procedures, health monitoring, and medical evaluations.
2. Description of the Background
Spectroscopy, whether it is visible, near infrared, infrared or Raman, is an enormously powerful tool for the analysis of biomedical samples. The medical community, however, has a definite preference for imaging methods, as exemplified by methods such as MRI and CT scanning as well as standard X-ray photography and ultrasound imaging. This is entirely understandable as many factors need to be taken into account for a physician to make a clinical diagnosis. Imaging methods can potentially provide far more information to a physician than their non-imaging counterparts. With this medical reality in mind, there has been considerable effort put into combining the power and versatility of imaging methods with the specificity of spectroscopic methods.
Near-infrared (near-IR) spectroscopy and spectroscopic imaging can measure the balance between oxygen delivery and tissue oxygen utilization by monitoring the hemoglobin oxygen saturation in tissues (Sowa, M. G. et al., 1998, Proc. SPIE 3252, pp. 199-207; Sowa, G. W. et al., 1999, Journal of Surgical Research, 86:62-29; Sowa, G. W. et al., 1999, Journal of Biomedical Optics, 4:474-481; Mansfield, J. R., et al., 2000, International Society of Optical Engineers, 3920:99-197). For in-vivo human studies, the forearm or leg has been the investigational site for many of the noninvasive near-IR studies. Non-imaging near-IR applications have examined the local response of tissue to manipulations of blood flow (De-Blasi, R. A. et al., 1992, Adv. Exp. Med. Biol, 317:771-777). Clinically, there are situations where the regional variations in oxygen saturation are of interest (Stranc, M. F. et al., 1998, British Journal of Plastic Surgery, 51:210-218). Near-IR imaging offers a means of accessing the spatial heterogeneity of the hemoglobin oxygen saturation response to tissue perfusion (Mansfield, J. R. et al., 1997, Analytical Chemistry, 69:3370-3374; Mansfield, J. R., et al., 1997, Computerized Medical Imaging and Graphics, 21:299-308; Salzer, R., et al., 2000, Fresenius Journal of Analytical Chemistry, 366:712-726; Shaw, R. A., et al., 2000, Journal of Molecular Structure (Theochem), 500:129-138; Shaw, R. A., et al., 2000, Journal of Inorganic Biochemistry, 79:285-293; Mansfield, J. R., et al., 1999, Proc. SPIE Int. Soc. Opt. Eng., 3597:222-233; Mansfield, J. R., et al., 1999, Applied Spectroscopy, 53:1323-1330; McIntosh, L. M., et al., 1999, Biospectroscopy, 5:265-275; Mansfield, R., et al., Vibrational Spectroscopy, 19:33-45; Payette, J. R., et al., 1999, American Clinical Laboratory, 18:4-6; Mansfield, J. R., et al., 1998, IEEE Transactions on Medical Imaging, 6:1011-1018).
Non-invasive monitoring of hemoglobin oxygenation exploits the differential absorption of HbO2 and Hb, along with the fact that near-IR radiation can penetrate relatively deeply into tissues. Pulse oximetry routinely supplies a noninvasive measure of arterial hemoglobin oxygenation based on the differential red-visible and near infrared absorption of Hb and HbO2. Visible/near-IR multispectral imaging permits the regional variations in tissue perfusion to be mapped on both macro and micro scales. Unlike infrared thermography, hyperspectral imaging alone does not map the thermal emission of the tissues. Instead, this imaging method relies on the differential absorption of light by a chromophore, such as Hb or HbO2, resulting in differences in the wavelength dependence of the tissue reflectance depending on the hemoglobin oxygen saturation of the tissue (Sowa, M. G., et al., 1997, Applied Spectroscopy, 51:143-152; Leventon, M., 2000, MIT Ph.D. Thesis).
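As an illustration of how differential absorption can yield an oxygen saturation value, the following minimal sketch assumes a simple Beer-Lambert mixing model with no scattering, which is a simplification not stated in the text above. The function name `oxygen_saturation` and the coefficient values are hypothetical placeholders chosen only to make the arithmetic easy to follow; they are not measured absorption coefficients.

```python
def oxygen_saturation(a1, a2, eps):
    """Estimate hemoglobin oxygen saturation from absorbances at two wavelengths.

    a1, a2: absorbances measured at wavelength 1 and wavelength 2.
    eps:    ((e_hb_1, e_hbo2_1), (e_hb_2, e_hbo2_2)), the absorption
            coefficients of Hb and HbO2 at each wavelength, with the optical
            path length folded in (assumed model, scattering ignored).
    """
    (e_hb_1, e_hbo2_1), (e_hb_2, e_hbo2_2) = eps
    det = e_hb_1 * e_hbo2_2 - e_hbo2_1 * e_hb_2
    if det == 0:
        raise ValueError("the two wavelengths do not discriminate Hb from HbO2")
    # Solve the 2x2 system  a_i = e_hb_i * c_hb + e_hbo2_i * c_hbo2  (Cramer's rule).
    c_hb = (a1 * e_hbo2_2 - e_hbo2_1 * a2) / det
    c_hbo2 = (e_hb_1 * a2 - a1 * e_hb_2) / det
    return c_hbo2 / (c_hb + c_hbo2)


# Placeholder coefficients, NOT physical values: Hb absorbs more strongly at
# wavelength 1, HbO2 more strongly at wavelength 2.
PLACEHOLDER_EPS = ((3.0, 1.0), (1.0, 2.0))

# Absorbances generated from c_hb = 0.25 and c_hbo2 = 0.75.
saturation = oxygen_saturation(1.5, 1.75, PLACEHOLDER_EPS)
```

In an imaging setting the same calculation would be repeated at every pixel, which is what turns a point measurement into a saturation map.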
Spectroscopic imaging methodologies and data are becoming increasingly common in analytical laboratories, whether it be magnetic resonance (MRI), mid-IR, Raman, fluorescence and optical microscopy, or near-IR/visible-based imaging. However, the volume of information contained in spectroscopic images can make standard data processing techniques cumbersome. Furthermore, there are few techniques that can demarcate which regions of a spectroscopic image contain similar spectra without a priori knowledge of either the spectral data or the sample's composition. The objective of analyzing spectroscopic images is not only to determine what the spectrum is at any particular pixel in the sample, but also to determine which regions of the sample contain similar spectra; i.e., what regions of the sample contain chemically related compounds. Multivariate analysis methodologies can be used to determine both the spectral and spatial characteristics of a sample within a spectroscopic imaging data set. These techniques can also be used to analyze variations in the temporal shape of a time series of images either derived or extracted from a time series of spectroscopic images.
There are few techniques that can demarcate which regions of a sample contain similar substances without a priori knowledge of the sample's composition. Spectroscopic imaging provides the specificity of spectroscopy while at the same time relaying spatial information by providing images of the sample that convey some chemical meaning. Usually the objective in analyzing heterogeneous systems is to identify not only the components present in the system, but also their spatial distribution. The true power of this technique relative to traditional imaging methods lies in its inherent multivariate nature. Spatial relationships among many parameters can be assessed simultaneously. Thus, the chemical heterogeneity or regional similarity within a sample is captured in a high dimensional representation, which can be projected onto a number of meaningful, low dimensional, easily interpretable representations. These typically comprise a set of composite images, each having a specific meaning.
While it is now clear that both spectroscopy and spectroscopic imaging can play roles in providing medically relevant information, the raw spectral or imaging measurement seldom reveals directly the property of clinical interest. For example, using spectroscopy alone, one cannot easily determine whether tissue is cancerous, what the blood glucose concentration is, or whether tissue perfusion is adequate. Instead, pattern recognition algorithms, clustering methods, regression and other theoretical methods provide the means to distill diagnostic information from the original analytical measurements.
There are, however, various methods for the collection of spectroscopic images. In all such cases, the result of a spectroscopic imaging experiment is something termed a spectral image cube, spectroscopic imaging data cube or just hypercube. This is a three dimensional array of data, consisting of two spatial dimensions (the imaging component) and one spectral dimension. It can be thought of as an array of spatially resolved individual spectra, with every pixel in the image consisting of an entire spectrum, or as a series of spectrally resolved images. In either representation, the 3D data cube can be treated as a single entity containing enormous amounts of spatial and spectral information about the sample from which it was acquired.
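The two equivalent readings of a hypercube described above can be sketched with a small nested-list array. The layout `cube[row][col][band]` and the helper names are illustrative assumptions, not a format prescribed by the text.

```python
def make_cube(rows, cols, bands):
    """Build a toy hypercube indexed as cube[row][col][band]."""
    return [[[float(r * 100 + c * 10 + b) for b in range(bands)]
             for c in range(cols)]
            for r in range(rows)]


def pixel_spectrum(cube, row, col):
    """View 1: the entire spectrum recorded at one spatial position."""
    return list(cube[row][col])


def band_image(cube, band):
    """View 2: the 2D image recorded at one spectral band."""
    return [[pixel[band] for pixel in row] for row in cube]


cube = make_cube(rows=2, cols=3, bands=4)
spectrum = pixel_spectrum(cube, 1, 2)   # 4 values, one per band
image = band_image(cube, 3)             # 2 x 3 values, one per pixel
```

Both views read the same stored values: `image[1][2]` and `spectrum[3]` refer to the same element of the cube.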
As an extension of the three dimensional array acquired in a spectroscopic imaging experiment, one can collect data cubes as a function of additional parameters such as time, temperature or pH. Numerous algorithms can be used to analyze these multi-dimensional data sets so that chemical and spectral variations can be studied as functions of these additional parameters. Taken together, these analyses can allow one to more fully understand the variations in the data. This can be done in a gated or sequential fashion.
Multi-modal image fusion, or image registration, is an important problem frequently addressed in medical image analysis. Registration is the process of aligning data that arise from different sources into one consistent coordinate frame. For example, various tissues appear more clearly in different types of imaging methods. Soft tissue, for example, is imaged well in MR scans, while bone is more easily discernible in CT scans. Blood vessels are often highlighted better in an MR angiogram than in a standard MR scan. Multiple scans of the same patient will generally be unregistered when acquired, as the patient may be in different positions in each scanner, and each scanner has its own coordinate system. In order to fuse the information from all scans into one coherent frame, the scans must be registered. The very reason why multiple scans are useful is what makes the registration process difficult. As each modality images tissue differently and has its own artifacts and noise characteristics, accurately modeling the intensity relationship between the scans, and subsequently aligning them, is difficult.
The registration of two images consists of finding the transformation that best maps one image into the other. If I1 and I2 are two images of the same tissue and T is the correct transformation, then the voxel I1(x) corresponds to the same position in the sample as the voxel I2(T(x)). In the simplest case, T is a rigid transformation consisting of three degrees of freedom of rotation and three degrees of freedom of translation. The need for rigid registration arises primarily from the patient being in different positions in the scanning devices used to image the anatomy. The information from all the images is best used when presented in one unified coordinate system. Without such image fusion, the clinician must mentally relate the information from the disparate coordinate frames.
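A rigid transformation of the kind described above can be sketched as a rotation followed by a translation. The parameterization by three axis angles composed as Rz·Ry·Rx is an assumed convention chosen for illustration; the text does not specify one, and the function names are hypothetical.

```python
import math


def rotation_matrix(rx, ry, rz):
    """3x3 rotation R = Rz(rz) * Ry(ry) * Rx(rx), angles in radians."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def rigid_transform(point, angles, translation):
    """Apply T(x) = R x + t: three rotational plus three translational DOF."""
    rot = rotation_matrix(*angles)
    return tuple(
        sum(rot[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


# Rotate 90 degrees about the z axis, then shift 5 units along z.
moved = rigid_transform((1.0, 0.0, 0.0), (0.0, 0.0, math.pi / 2), (0.0, 0.0, 5.0))
```

Because the transformation is rigid, distances between any two transformed points equal the distances between the original points; only the pose changes.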
One method of aligning the two images is to define an intermediate, patient-centered coordinate system, instead of trying to directly register the images to one another. An example of a patient-centered reference frame is the use of fiducial markers attached to a patient throughout the various image acquisitions. The fiducial markers define a coordinate system specific to the patient, independent of the scanner or choice of imaging modality. If the markers remain fixed and can be accurately localized in all the images, then the volumes can be registered by computing the best alignment of the corresponding fiducials (Horn, B. K. P., 1987, Journal of the Optical Society of America A, 4:629-642; Mandava, V. R., et al., Proc SPIE, 1992, 1652:271-282; Haralick, R. M., et al., 1993, Computer and Robot Vision). The main drawback of this method is that the markers must remain attached to the patient at the same positions throughout all image acquisitions. For applications such as change detection over months or years, this registration method is not suitable. Fiducial registration is typically used as ground-truth to evaluate the accuracy of other methods as careful placement and localization of the markers can provide very accurate alignment (West, J. et al., 1996, Proc SPIE, Newport Beach, Calif.).
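Computing the best alignment of corresponding fiducials can be sketched in closed form. The example below is a deliberately simplified 2D least-squares version, not the 3D quaternion method of the cited Horn reference; the function name and the marker coordinates are hypothetical.

```python
import math


def align_fiducials_2d(src, dst):
    """Least-squares rigid alignment (rotation + translation) of paired 2D points.

    Returns (theta, (tx, ty)) such that rotating each src point by theta and
    adding the translation approximates the matching dst point.
    """
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(p[0] for p in dst) / n
    dy = sum(p[1] for p in dst) / n
    cross = dot = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy          # source marker relative to centroid
        bx, by = qx - dx, qy - dy          # target marker relative to centroid
        cross += ax * by - ay * bx
        dot += ax * bx + ay * by
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The translation maps the rotated source centroid onto the target centroid.
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, (tx, ty)


# Hypothetical marker positions in scan 1, and the same markers in scan 2
# after a 30 degree rotation and a shift of (3, -1).
angle = math.radians(30.0)
markers_1 = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
markers_2 = [(math.cos(angle) * x - math.sin(angle) * y + 3.0,
              math.sin(angle) * x + math.cos(angle) * y - 1.0)
             for x, y in markers_1]
theta, (tx, ty) = align_fiducials_2d(markers_1, markers_2)
```

With noise-free markers the pose is recovered exactly; with localization error the same formula returns the least-squares compromise, which is why careful marker localization matters.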
When fiducial markers are not available to define the patient coordinate frame, corresponding anatomical feature points can be extracted from the images and used to compute the best alignment (Maintz, J. B. Antoine, et al., 1995 Computer Vision, Virtual Reality and Robotics in Medicine, pp. 219-228; Maguire, Jr., G., et al., 1991, IEEE Computer Graphics Applications, 11:20-29). This approach depends greatly on the ability to automatically and accurately extract reliable image features. In general, methods of feature extraction such as intensity thresholding or edge detection do not work well on medical scans, due to non-linear gain fields and highly textured structures. Even manual identification of corresponding 3D anatomical points can be unreliable. Without the ability to accurately localize corresponding features in the images, alignment in this manner is difficult.
Instead of localizing feature points in the images, richer structures such as object surfaces can be extracted and used as a basis of registration. A common method of registering MR and CT of the head involves extracting the skin (or skull) surfaces from both images, and aligning the 3D head models (Jiang, H., et al., 1992 Proc. SPIE, 1808:196-213; Lemoine, D. et al., 1994, Proc. SPIE, 2164:46-56). For PET/MR registration, the brain surface is typically used since the skull is not clearly visible in PET (Pelizzari, C., et al., J Comput Assist. Tomogr., 1989, 13:20-26). The 3D models are then rigidly registered using surface-based registration techniques (Ettinger, G., 1997, MIT Ph.D Thesis). The success of such methods relies on the structures being accurately and consistently segmented across modalities and the surfaces having rich enough structure to be unambiguously registered.
Voxel-based approaches to registration do not extract any features from the images, but use the intensities themselves to register the two images. Such approaches model the relationships between intensities of the two images when they are registered, and then search through the transformation space to find an alignment that best agrees with the model. Various intensity models are discussed, including correlation, mutual information, and joint intensity priors.
Correlation is a measure commonly used to compare two images or regions of images for computer vision problems such as alignment or matching. Given the intensity values of two image patches stacked in the vectors u and v, the normalized correlation measure is the dot product of unit vectors in the directions of u and v:
$$\frac{u \cdot v}{\|u\| \, \|v\|}$$
An advantage of correlation-based methods is that they can be computed quite efficiently using convolution operators. Correlation is applicable when one expects a linear relationship between the intensities in the two images. In computer vision problems, normalized correlation provides some amount of robustness to lighting variation over a measure such as sum of square differences (SSD), $\|u - v\|^2$. The primary reason for acquiring more than one medical scan of a patient stems from the fact that each scan provides different information to the clinician. Therefore, two images that have a simple linear intensity relationship may be straightforward to register, but do not provide any more information than one scan by itself. On the other hand, if the images are completely independent (i.e., no intensity relationship exists between them), then they cannot be registered using voxel-based methods. In general, there is some dependence between images of different modalities and each modality does provide additional information.
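The difference between normalized correlation and SSD under a lighting change can be sketched as follows. The patch values are made up for illustration, and the lighting change is modeled as a uniform gain, which is the case this form of normalized correlation is insensitive to.

```python
import math


def normalized_correlation(u, v):
    """Dot product of the unit vectors in the directions of u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ssd(u, v):
    """Sum of squared differences, ||u - v||^2."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


patch = [1.0, 2.0, 3.0, 4.0]
brighter = [2.0 * a for a in patch]   # same patch under doubled illumination

nc = normalized_correlation(patch, brighter)   # unaffected by the gain change
d = ssd(patch, brighter)                       # penalizes the gain change
```

Here the two patches show the same structure, so normalized correlation reports a perfect match while SSD reports a sizeable difference.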
One simplified model of the medical imaging process is that an internal image is a rendering function $R$ of underlying tissue properties, $P(x)$, over positions $x$. An image of modality A could be represented as a function $R_A(P)$ and a registered image of modality B of the same patient would be another function, say $R_B(P)$. Suppose a function $F$ could be computed relating the two rendering functions such that the following is true (with the possible addition of some Gaussian noise, $N$):
$$F(R_B(P)) = R_A(P) + N$$
The function $F$ would predict the intensity at a point in Image A given the intensity at the corresponding point in Image B. Such a function could be used to align a pair of images that are initially in different coordinate systems using SSD:
$$T^* = \arg\min_T \sum_x \left( F(R_B(P(T(x)))) - R_A(P(x)) \right)^2$$
where $T$ is the transformation between the two sets of image coordinates. Van den Elsen et al. compute such a mapping that makes a CT image appear more like an MR image, and then register the images using correlation (van den Elsen, P., et al., 1994, "Visualization in Biomedical Computing," 1994 Proc SPIE, 2359:227-237). In general, explicitly computing the function $F$ that relates two imaging modalities is difficult and under-constrained.
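The SSD criterion above can be sketched for the simplest possible case: one-dimensional images, integer translations as the only transformations, and an intensity mapping F that is assumed to be known. The mapping `square`, the image values, and the choice to average the cost over the overlap (so that short overlaps are not favored) are all illustrative assumptions.

```python
def ssd_register(image_a, image_b, intensity_map, shifts):
    """Return the integer shift t minimizing the mean of (F(B[x + t]) - A[x])^2.

    Only positions where x + t falls inside image_b are compared.
    """
    best_shift, best_cost = None, float("inf")
    for t in shifts:
        total, count = 0.0, 0
        for x, a in enumerate(image_a):
            if 0 <= x + t < len(image_b):
                total += (intensity_map(image_b[x + t]) - a) ** 2
                count += 1
        if count and total / count < best_cost:
            best_shift, best_cost = t, total / count
    return best_shift


def square(b):
    """Hypothetical intensity mapping F from modality B to modality A."""
    return b * b


image_a = [0, 0, 1, 4, 9, 4, 1, 0, 0, 0]
# Same structure seen in "modality B": different intensities, offset by 2 samples.
image_b = [0, 0, 0, 0, 1, 2, 3, 2, 1, 0, 0, 0]

best = ssd_register(image_a, image_b, square, range(-3, 4))
```

The search succeeds only because F is supplied; as noted above, estimating F for real modality pairs is the hard part.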
Maximization of mutual information (MI) is a general approach applicable to a wide range of multi-modality registration applications (Bell, A. J., et al., 1995 Advances in Neural Information Processing 7; Collignon, D., et al., 1995, First Conf. on Computer Vision, Virtual Reality and Robotics in Medicine Springer; Maes, F. et al, 1996, Mathematical Methods in Biomedical Image Analysis; Wells, W. M., et al., 1996, Medical Image Analysis, 1(1):35-51). One of the strengths of using mutual information is that MI does not use any prior information about the relationship between joint intensity distributions. While mutual information does not explicitly model the function F that relates the two imaging modalities, it assumes that when the images are aligned, each image should explain the other better than when the images are not aligned.
Given two random variables U and V, mutual information is defined as (Bell, 1995):
$$MI(U,V) = H(U) + H(V) - H(U,V)$$
where $H(U)$ and $H(V)$ are the entropies of the two variables, and $H(U,V)$ is the joint entropy. The entropy of a discrete random variable is defined as:
$$H(U) = -\sum_u P_U(u) \log P_U(u)$$
where $P_U(u)$ is the probability mass function associated with $U$. Similarly, the expression for joint entropy operates over the joint distribution:
$$H(U,V) = -\sum_u \sum_v P_{U,V}(u,v) \log P_{U,V}(u,v)$$
When $U$ and $V$ are independent, $H(U,V) = H(U) + H(V)$, which implies the mutual information is zero. When there is a one-to-one functional relationship between $U$ and $V$ (i.e., they are completely dependent), the mutual information is maximized as:
$$MI(U,V) = H(U) = H(V) = H(U,V)$$
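The two limiting cases just described can be checked numerically. The sketch below estimates the entropies from empirical frequencies of paired samples, which is one simple estimator among several; the sample values are made up.

```python
import math
from collections import Counter


def entropy(samples):
    """H = -sum p log p over the empirical distribution of the samples."""
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in Counter(samples).values())


def mutual_information(u, v):
    """MI(U,V) = H(U) + H(V) - H(U,V), estimated from paired samples."""
    return entropy(u) + entropy(v) - entropy(list(zip(u, v)))


u = [0, 0, 1, 1, 2, 2, 3, 3]
dependent = [10 * x for x in u]            # one-to-one function of u
independent = [0, 1, 0, 1, 0, 1, 0, 1]     # every (u, v) pair equally likely

mi_dependent = mutual_information(u, dependent)       # equals H(U)
mi_independent = mutual_information(u, independent)   # equals zero
```

Note that the one-to-one mapping need not be linear or even monotonic for the mutual information to reach its maximum, which is what makes the measure attractive across modalities.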
To operate on images over a transformation, we consider the two images, $I_1(x)$ and $I_2(x)$, to be random variables under a spatial parameterization, $x$. We seek to find the value of the transformation $T$ that maximizes the mutual information (Wells, 1996):
$$T^* = \arg\max_T MI(I_1(x), I_2(T(x)))$$
$$T^* = \arg\max_T \left[ H(I_1(x)) + H(I_2(T(x))) - H(I_1(x), I_2(T(x))) \right]$$
The entropies of the two images encourage transformations that project $I_1$ onto complex parts of $I_2$. The third term, the (negative) joint entropy of $I_1$ and $I_2$, takes on large values when one image explains the other well. Derivatives of the entropies with respect to the pose parameters can be calculated and used to perform stochastic gradient ascent (Wells, 1996). West et al. compare many multi-modal registration techniques and find mutual information to be one of the most accurate across all pairs of modalities (West, 1996).
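Maximizing mutual information over transformations can be sketched on one-dimensional images with integer shifts. For brevity this example replaces the stochastic gradient ascent mentioned above with an exhaustive search over a small set of candidate shifts, and it estimates entropies from raw intensity counts; both are simplifying assumptions, and the image values and the `remap` table are hypothetical.

```python
import math
from collections import Counter


def entropy(samples):
    """Empirical entropy of a list of hashable samples."""
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in Counter(samples).values())


def mutual_information(u, v):
    """MI(U,V) = H(U) + H(V) - H(U,V) from paired samples."""
    return entropy(u) + entropy(v) - entropy(list(zip(u, v)))


def register_by_mi(image_1, image_2, shifts):
    """Return the integer shift t maximizing MI(I1(x), I2(x + t))."""
    best_shift, best_mi = None, -1.0
    for t in shifts:
        pairs = [(a, image_2[x + t]) for x, a in enumerate(image_1)
                 if 0 <= x + t < len(image_2)]
        if not pairs:
            continue
        mi = mutual_information([p[0] for p in pairs], [p[1] for p in pairs])
        if mi > best_mi:
            best_shift, best_mi = t, mi
    return best_shift


# Image 2 shows the same structure as image 1 through a non-monotonic
# intensity mapping, offset by 3 samples and padded with background (4).
image_1 = [0, 1, 2, 3, 0, 2, 1, 3, 3, 0, 1, 2]
remap = {0: 7, 1: 5, 2: 9, 3: 1}
image_2 = [4, 4, 4] + [remap[a] for a in image_1] + [4, 4, 4]

best = register_by_mi(image_1, image_2, range(0, 7))
```

No intensity model is supplied here: the correct shift is found purely because, at that shift, each intensity in one image predicts the intensity in the other.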
Leventon et al. introduced an approach to multi-modal registration using statistical models derived from a training set of images (Leventon, M., et al., 1998, Medical Image Computing and Computer-assisted Intervention). The method involves building a prior model of the intensity relationship between the two scans being registered. It requires a pair of registered training images of the same modalities as those to be registered in order to build the joint intensity model. To align a novel pair of images, one computes the likelihood of the two images given a certain pose under the model, by sampling the intensities at corresponding points. This current hypothesis can be improved by ascending the log likelihood function. In essence, one computes a probabilistic estimate of the function F (that relates the two imaging modalities) based on intensity co-occurrence. To align the novel images, the pose is found that maximizes the likelihood that those images arose from the same relation F.
Building a joint-intensity model does require having access to a registered pair of images of the same modality and approximately the same coverage as the novel pair to be registered. Mutual information approaches do not need to draw upon previously registered scans. However, when this information is available, the prior joint intensity model provides the registration algorithm with additional guidance which results in convergence on the correct alignment more quickly, more reliably and from more remote initial starting points.
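The joint-intensity idea described above can be sketched as two steps: count intensity co-occurrences in a registered training pair, then score candidate poses of a novel pair by their log likelihood under those counts. This is a minimal illustration and not the cited authors' implementation: it uses a raw co-occurrence table with an ad hoc probability floor for unseen pairs, one-dimensional images, and an exhaustive search over integer shifts. All names and values are hypothetical.

```python
import math
from collections import Counter


def joint_intensity_model(train_a, train_b, floor=1e-6):
    """Estimate P(a, b) from a registered training pair by co-occurrence counts.

    Unseen pairs receive the small probability `floor` (an ad hoc assumption)
    so that the log likelihood stays finite.
    """
    counts = Counter(zip(train_a, train_b))
    total = sum(counts.values())
    return lambda a, b: max(counts[(a, b)] / total, floor)


def log_likelihood(model, image_a, image_b, shift):
    """Mean log P(A[x], B[x + shift]) over the overlapping samples."""
    terms = [math.log(model(a, image_b[x + shift]))
             for x, a in enumerate(image_a)
             if 0 <= x + shift < len(image_b)]
    return sum(terms) / len(terms) if terms else float("-inf")


def register_with_prior(model, image_a, image_b, shifts):
    """Return the shift with the highest log likelihood under the prior model."""
    return max(shifts, key=lambda t: log_likelihood(model, image_a, image_b, t))


# Registered training pair of the two modalities.
train_a = [0, 0, 1, 1, 2, 2, 3, 3]
train_b = [7, 7, 5, 5, 9, 9, 1, 1]
model = joint_intensity_model(train_a, train_b)

# Novel pair of the same modalities, misaligned by 2 samples.
novel_a = [0, 1, 2, 3, 3, 2, 1, 0]
novel_b = [7, 7, 7, 5, 9, 1, 1, 9, 5, 7, 7, 7]

best = register_with_prior(model, novel_a, novel_b, range(0, 5))
```

Compared with the mutual information approach, the score here is informed by the training pair, so a pose that pairs intensities never seen together during training is penalized directly.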
The present invention overcomes problems and disadvantages associated with current strategies and designs and provides methods and apparatus for imaging using real-time or near real-time assessment and monitoring. Embodiments of the device are useful in a plurality of settings including surgery, clinical procedures, tissue assessment, diagnostic procedures, forensics, health monitoring and medical evaluations.
One embodiment of the invention is directed to an imaging apparatus that integrates spatial, spectral and temporal features, and optionally other physiologic or relevant data, such as room temperature or ambient light, in a spectral and temporal multimodal imaging system for the evaluation of biological systems and stimuli, and that fuses one or more thermal images or other imaging modalities with a hyperspectral data cube for assessment of biological processes. The integrated features may comprise two or more of visible or infrared hyperspectral images, visible or infrared brightfield images, thermal images, fluorescence images, Raman images and/or other relevant imaging modalities. The imaging apparatus may further comprise a specific UV, visible and/or infrared light source, and means for collecting two or more of visible or infrared hyperspectral images, visible or infrared brightfield images, thermal images, fluorescence images, Raman images, or standard video images.
Another embodiment of the invention is directed to methods for detecting a diseased condition comprising acquiring thermal images from a target, acquiring visible or infrared hyperspectral images from the same target, and fusing the thermal images and visible or infrared hyperspectral images to analyze spatial distributions and/or feature determination of the target. Thermal images or hyperspectral images of the target and/or other data can be interlaced with a time dependent reference to determine changes which could influence and be correlated with results from other imaging modalities. Wavelengths can be selected to maximize diagnostic information for a specific tissue state or anticipated end diagnostic goal. The selection step involves applying multivariate image and spectral processing algorithms to extract information from the plurality of images and spectra for real-time or near real-time assessment. Multiple hyperspectral collection devices in a variety of wavelength regimes could be used simultaneously or sequentially or on an as needed basis. For instance, a visible hyperspectral imager could be combined with a near infrared hyperspectral imager (plus or minus a broad band thermal camera) to provide combined information from both wavelength regions.
In this way, one can analyze tissue health mapping; skin sebum level mapping; skin dryness, skin texture, skin feel or skin color mapping; skin damage detection and mapping (UV damage, frostbite, burns, cuts, abrasions), impact of cosmetics or other substances applied to the skin, bruise age, force of impact, peripheral vascular disease diagnosis, extent, determination or regionalization of ischemia, varicose veins or hemorrhage detection, local detection and mapping, systemic infection detection, differentiation between viral, bacterial and fungal, and more specific identification, such as between gram negative and gram positive bacterial infection, venous occlusion, increase in total hemoglobin, hematocrit, and change in deoxyhemoglobin/oxyhemoglobin ratio, differentiate between ischemia and hypoxia, burn depth and wound healing evaluation, non-invasive diagnosis of shock by imaging uninjured skin, hemorrhagic shock, septic shock, burn shock, changes in a dynamic system as a function of time or other parameter, vascular occlusion, vaso-dilation and vaso-constriction changes related to the presence of cancer in primary tissue or lymph nodes, either surface or subsurface, changes related to a specific chemical, mechanical, thermal, pharmacological or physiological stimulus. Different levels of microvascular constriction and relaxation lead to different ratios of oxyhemoglobin/deoxyhemoglobin, to tissue perfusion, tissue abnormality, disease state or diagnostic condition, total hematocrit, differentiate differences in reperfusion state following occlusion where oxygenation levels may remain low although there is good perfusion.