The field of the invention is nuclear magnetic resonance imaging methods and systems. More particularly, the invention relates to a system and method for tracking physical changes in a series of medical images of a given patient over time.
When a substance such as human tissue is subjected to a uniform magnetic field (polarizing field B0), the individual magnetic moments of the spins in the tissue attempt to align with this polarizing field, but precess about it in random order at their characteristic Larmor frequency. If the substance, or tissue, is subjected to a magnetic field (excitation field B1) that is in the x-y plane and that is near the Larmor frequency, the net aligned moment, Mz, may be rotated, or “tipped”, into the x-y plane to produce a net transverse magnetic moment Mt. A signal is emitted by the excited spins after the excitation signal B1 is terminated, and this signal may be received and processed to form an image.
When utilizing these signals to produce images, magnetic field gradients (Gx, Gy and Gz) are employed. Typically, the region to be imaged is scanned by a sequence of measurement cycles in which these gradients vary according to the particular localization method being used. The resulting set of received nuclear magnetic resonance (NMR) signals is digitized and processed to reconstruct the image using one of many well-known reconstruction techniques.
A magnetic resonance imaging (MRI) system may be used to acquire many types of images from a particular anatomical structure, for example, the brain. Such images employ contrast mechanisms that enable different brain tissues and lesions to be identified. The usual practice is to acquire a set of MRI images and then manually segment different tissues and lesions and provide sample points of the different tissues for use in subsequent processing. Only limited attempts have been made to implement automated sample point generation algorithms. Some classification algorithms provide very rudimentary automated sample point generation, with little use of domain-specific knowledge (e.g., fuzzy k-means), where the user supplies the number of pure tissues and the algorithm attempts to locate the cluster centroids by examining only the image intensities.
Atlas-based segmentation is one method for the automatic generation of samples, and there have been examples of such implementations in the literature (e.g., Linguraru et al., 2006). However, by definition these atlas-based algorithms make assumptions regarding anatomy, which may not hold in the presence of pathology and surgical interventions. One important motivation for the performance of anatomical magnetic resonance imaging and image analysis is the examination of pathology (e.g., tumors, multiple sclerosis, etc.). In such cases, pathology may be quite widespread, and may present in a large variety of ways, causing deviations from “normal” anatomy. Furthermore, resections may be performed in such cases as well, causing the particular patient's brain to deviate even further from the atlas-based norms.
Classification of lesions, per se, is not new, and many solutions to classify multi-spectral data exist, both within and outside of the domain of medical imaging. Some classification methods are supervised and require that the user supply sample data. Some examples of supervised classification algorithms are thresholding, Euclidean distance, Mahalanobis distance, k-nearest neighbor, Bayesian, and neural network. Other classification algorithms are unsupervised and require no sample data. Some examples of unsupervised classification algorithms include the chain method and k-means. It should be noted, however, that unsupervised classification algorithms typically require that some rudimentary data be supplied by the user. For example, the chain method requires that a threshold be supplied, and k-means requires that the actual number of clusters be supplied. Some classification algorithms categorize each data point, that is, each voxel in the case of image data, into one of a number of discrete categories. Other classification algorithms allow partial membership in multiple categories. Still other classification algorithms allow partial membership in multiple classes initially, but then defuzzify the membership data into discrete categories in a later step.
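By way of illustration only, the distinction drawn above between partial membership and subsequent defuzzification may be sketched as follows. The sketch assumes a supervised, Euclidean distance classifier with user-supplied sample points; the inverse-distance membership function, the function names, the class labels, and the two-channel intensity values are hypothetical choices made for the example and are not taken from any particular prior method.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def class_means(samples):
    """Mean feature vector per class, from user-supplied sample points."""
    means = {}
    for label, vectors in samples.items():
        n = len(vectors)
        dims = len(vectors[0])
        means[label] = [sum(v[i] for v in vectors) / n for i in range(dims)]
    return means

def fuzzy_memberships(voxel, means, eps=1e-9):
    """Partial membership in every class (assumed inverse-distance weighting).

    The memberships are normalized so that they sum to 1.
    """
    weights = {label: 1.0 / (euclidean(voxel, m) + eps)
               for label, m in means.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}

def defuzzify(memberships):
    """Collapse partial memberships into a single discrete category."""
    return max(memberships, key=memberships.get)

# Hypothetical two-channel intensities for two tissue classes.
samples = {
    "white_matter": [[100.0, 40.0], [104.0, 44.0]],
    "lesion": [[150.0, 90.0], [154.0, 94.0]],
}
means = class_means(samples)
voxel = [110.0, 50.0]  # mostly white matter, with some lesion character
memberships = fuzzy_memberships(voxel, means)
label = defuzzify(memberships)
```

In this sketch the voxel retains a nonzero lesion membership until `defuzzify` is called, at which point the partial character is discarded in favor of a single label.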
A large fraction of existing classification algorithms are devoted to correctly assigning voxels into discrete categories. In fact, many authors write explicitly that the purpose of classification is to assign multispectral data points into discrete categories, and some then add as an afterthought that the assignment may also be made fuzzy. Furthermore, even the algorithms that purport to possess fuzzy classification capability, such as Mahalanobis distance based classifiers, still have a stated purpose of achieving the highest accuracy with voxels that do in fact belong purely to one category or another. These algorithms acknowledge that in their intended domain, the overwhelming majority of voxels will, in fact, belong purely to one category or another and, therefore, in order to achieve the highest average accuracy, these methods use membership functions that achieve the highest accuracy for voxels that are in fact purely one category. However, this is achieved at the expense of accuracy for voxels that do in fact possess mixed membership.
In the case of lesions of the white matter of the brain, the voxels that, in fact, possess partial membership are the voxels that are of greatest interest. There are a variety of reasons for an algorithm to be developed that is focused on accurately quantifying partial membership and partial character, rather than categorizing voxels into discrete categories. For example, in the case of pathology (lesion and enhancing lesion), voxels are rarely fully abnormal. Rather, in such cases, abnormality exists in degree. It is subtly abnormal voxels for which computational assistance might find a high degree of usefulness, because these subtly abnormal voxels are relatively difficult to see with the unaided eye. Furthermore, accurate quantification of these partial volume and partial membership characteristics is important, because it has been shown that, at least in tumors, the rate of growth is suggestive of prognosis. Thus, particularly for small lesions, the ability to accurately quantify these effects places a distinct theoretical lower bound on the size of the lesions from which prognosis may be quantified.
One group of authors created a classification procedure specifically geared for recognizing lesions that acknowledged the importance of preserving partial volume effects via a feature extraction step involving a linear combination of the original images (Hamid Soltanian-Zadeh, 1998). These authors did not, however, design their algorithm with the real behavior of lesion intensities in feature space in mind. Rather, the algorithm, which used an Eigenimage feature extraction step, was more tuned to the behavior of noise and contrast than to the lesions themselves. The algorithm required that the images demonstrate frank lesion and that manual sample points be supplied, which required an investment of time by the user and introduced variability into the process. Furthermore, the authors used thresholding as their noise reduction scheme.
It has been recognized for a considerable period of time that it is challenging for a human observer to compare images side by side that are nearly the same, in order to identify the differences. This problem exists in a large number of fields of imaging, both within and outside of medicine. Some examples of non-medical change detection topics include remote sensing, astronomy, surveillance, geology, reconnaissance photography, and the like. The development of computer algorithms to compare serial images has been a fruitful field of study in a variety of areas. Probably the most work has been done in the field of remote sensing, where satellite imagery of the earth, using various kinds of sensors, has been used in an attempt to detect various kinds of changes. In this sub-discipline of change detection, as in others, imagery is to be compared between image sets acquired at two time-points. The various sub-fields of change detection have important points in common. First, the motivation for change detection is generally the same in all cases. Side-by-side presentation of images for the purposes of identifying small differences is a sub-optimal method of display, because it requires the viewer to sequentially examine each area and each feature to be compared. Second, imaging systems, which include sensors, data storage facilities, and the like, are becoming more and more prevalent and more and more capable of storing vast amounts of data. This phenomenon has been called “information overload” and there is little doubt that this trend will continue. With vast amounts of data being collected, it will be unreasonable or impossible for a human to make every imaginable comparison, but it is not at all unreasonable for this task to be performed by computer.
There are a variety of reasons that the comparison of serial images is challenging. One reason that the problem of change detection is difficult is that side-by-side presentation of images is poorly matched to our visual systems. That is, the particular workings of our visual systems mean that observers must sequentially examine each feature of interest, and explicitly compare them. This is time consuming, particularly as the data sets and features of interest become large, and it additionally biases us against detecting changes that we do not expect, but which nonetheless may still be relevant and important. Another reason that change detection is challenging is that the disease-related changes of interest are likely to be confounded with acquisition-related changes, which are not of interest. Still another reason is that comparisons that might be desired are more complicated than simple comparison of the intensities in the images. With greater and greater complexity of the questions to be asked and the comparisons to be made, the task becomes more and more difficult for an unaided observer. Yet another reason the task of comparison of images is difficult is, as mentioned above, information overload. Information overload is a problem that will almost certainly increase as the sophistication of imaging hardware improves.
In its simplest form, change detection may be performed as subtraction of two serial images as shown in FIG. 1. This approach has been used with great effectiveness. More sophisticated approaches may be imagined and have been used, where some processing is performed on the images, to produce derivative data that is then compared as shown in FIG. 2. Spatial registration is a simple example of such processing, which is commonly applied in remote sensing, and medical image analysis. A large variety of image processing steps exist, and can be conceived of. These may require multiple steps in a sequence, for example, inhomogeneity correction followed by registration, followed by some form of image understanding, for example, classification in remote sensing or medical imaging, followed by comparison of the classified images. In the case of surveillance, where it could, for example, be desired to observe when a new person has entered the visual field of the camera, face recognition may be one of the steps. Typically, the more sophisticated processing steps apply domain specific knowledge. The domain specific knowledge, which is used to improve the performance of change detection algorithms and to attune the algorithm to the detection of specific kinds of changes of interest, may be very sophisticated and varies greatly from application to application.
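By way of illustration only, the simplest form of change detection described above, namely subtraction of two serial images, may be sketched as follows. The sketch assumes that the two images have already been spatially registered and have identical dimensions; the function name and the small intensity arrays are hypothetical.

```python
def subtract_images(baseline, follow_up):
    """Voxel-wise difference (follow_up - baseline) of two registered 2D images.

    Images are lists of rows. Positive values mark intensity increases at the
    later time-point and negative values mark decreases.
    """
    if len(baseline) != len(follow_up) or any(
        len(row_b) != len(row_f) for row_b, row_f in zip(baseline, follow_up)
    ):
        raise ValueError("images must have the same shape")
    return [
        [f - b for b, f in zip(row_b, row_f)]
        for row_b, row_f in zip(baseline, follow_up)
    ]

# Hypothetical 3x3 intensity patches from two time-points.
time_point_1 = [
    [10, 10, 10],
    [10, 12, 10],
    [10, 10, 10],
]
time_point_2 = [
    [10, 10, 10],
    [10, 20, 11],
    [10, 10, 10],
]
difference = subtract_images(time_point_1, time_point_2)
```

Any preprocessing step, for example inhomogeneity correction or registration, would be applied to both images before the subtraction in such a sketch.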
However, the behavior of the sensors, the meaning of intensities, the subject being imaged, and the like vary greatly from domain to domain and, thus, while papers in the field of remote sensing may compare log-ratio images, these may not make any sense at all in the context of comparison of serial MRI brain studies. Likewise, even within brain MRI, the subject of change detection is made broad by the different kinds of domain-specific knowledge that may be applied, and the kind of derivative information that may be generated and compared from one acquisition to another. One set of analyses might be used to detect and characterize changes in white matter lesions, while a completely different set of analyses may be used to detect and characterize changes due to atrophy, for example, identification of discrete boundaries using finite element models with sub-voxel resolution, and the comparison of the position of these boundaries from one acquisition to another. In fact, there exists a large range of possible formulations of these post-acquisition processing steps and inter time-point comparisons.
Within the context of change detection in brain MRI of white matter pathology, there are a variety of causes of inter-observer variability. One is that there is presently no objective definition of stability or progression. Additionally, changes that are relatively subtle, either in extent or degree, may be difficult to see. This is especially true when the images are displayed side-by-side, not registered, with intensity inhomogeneities and the like. Manually identifying changes in such a context requires the observer to mentally deconfound the data, which is challenging for a human observer. It can be imagined that a computer might be very well suited to helping with this task, since, theoretically, computers have unlimited memory, which is in stark contrast with human visual memory and short-term memory, both of which are limited. Thus, computers can apply a theoretically unlimited number of intermediate processing steps. Changes might be spread across multiple pulse sequences, and they might be indefinite in one slice, but more definite in consideration of multiple slices. This, however, requires assimilation and integration of a number of slices, before reaching a decision. Such a process is not natively simple for the human visual system, at least when the images are displayed as slices. However, such a process is much simpler for a computer.
There are other “expectation-oriented” reasons that change detection is challenging. Changes that occur in unexpected locations, or that are unexpected in terms of their character, may be missed. There are a variety of other issues that might cause a neuroradiologist to miss changes, such as “satisfaction of search,” where a radiologist stops looking when they find imaging features that explain the symptoms motivating the scan in the first place; however, there may still be other important findings in the images. Sometimes, “information overload” makes change detection difficult, due simply to the sheer volume of data presented. This problem will almost certainly increase as scanners produce greater and greater quantities of data. In a general sense, computers excel at methodically wading through large amounts of data, looking for needles in haystacks, integrating large amounts of data at once, applying serial processing steps, analyzing data mathematically, and bringing noteworthy observations to the attention of the neuroradiologist.
The detection of regions of signal embedded in zero-mean background noise is a recurring problem in various fields of medical imaging, including classification, fMRI, and change detection. The detection of such signals is also an important problem outside of medical imaging in such fields as remote sensing, surveillance, astronomy, and geology, to name a few. A variety of approaches have been taken to identify regions of signal indicating desired objects embedded in regions of noise in imagery data. By far the most common method that has been used is simple thresholding, where pixels or voxels possessing image intensities lower than some threshold value are set to 0, while image intensities larger than the threshold value are retained. That is, pixels or voxels with image intensities below the threshold are considered not to belong to the object, and those above the threshold are considered to belong to the object. Thresholding is limited in that it does not allow noise to be rejected while low intensity object voxels are retained. One attempt to address this limitation is commonly used in fMRI circles. It involves the use of a “smearing” filter on the image data prior to thresholding. Essentially, this approach has two effects. First, it smears the voxel intensities of regions containing actual objects into their neighbors. If the regions of actual objects contain some high intensity voxels, these high intensities can reinforce their low intensity neighbors, which then become included in the region retained by thresholding. The second effect of these smearing filters is for regions not containing actual objects, which presumably contain zero-mean noise, to diminish in intensity. In many cases, this increases the degree to which noisy regions are rejected by the process of thresholding. Although popular in some circles, this approach is in many ways intuitively unsatisfactory.
This is primarily because it works most effectively when the actual object possesses high intensity, which is certainly not always the case; rather, it is very frequently desired to detect subtle objects. This is also because the method virtually guarantees the erroneous detection of a penumbra of noise voxels around an intense object. Furthermore, such a method can result in false negatives. That is, if dark voxels around a moderately bright object are smeared into the moderately bright object, the intensities of the voxels making up the moderately bright object may drop below the threshold.
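By way of illustration only, the two failure modes described above, namely a penumbra of falsely detected voxels around an intense object and the loss of a moderately bright object, may be sketched in one dimension as follows. The sketch assumes a simple moving-average filter as the smearing filter, which is a simplification chosen for the example, and it treats an intensity equal to the threshold as retained; the function names and intensity values are hypothetical.

```python
def threshold(signal, t):
    """Set intensities below t to 0 and retain the rest."""
    return [v if v >= t else 0.0 for v in signal]

def box_smooth(signal, radius=1):
    """Moving-average 'smearing' filter (assumed stand-in for the blur)."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out

def detected(signal, t, smooth=False):
    """Indices retained by thresholding, optionally after smearing."""
    s = box_smooth(signal) if smooth else signal
    return [i for i, v in enumerate(threshold(s, t)) if v != 0.0]

# An intense object occupying indices 3 to 5 on a zero background.
bright = [0.0, 0.0, 0.0, 9.0, 9.0, 9.0, 0.0, 0.0, 0.0]
# A moderately bright single-voxel object at index 3.
faint = [0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0]

bright_plain = detected(bright, 2.0)                # object only
bright_smeared = detected(bright, 2.0, smooth=True)  # object plus penumbra
faint_plain = detected(faint, 2.0)                  # object found
faint_smeared = detected(faint, 2.0, smooth=True)    # object lost
```

In this sketch the smeared intense object gains one falsely detected voxel on each side, while the smeared moderately bright object falls below the threshold entirely.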
Another filtering approach that has been used involves the application of anisotropic diffusion filters to effect the blurring. Anisotropic diffusion filters are somewhat like low pass filters, except that the blurring is reduced in regions where the gradient of the voxel intensities is high, for example, edges. In cases where objects possess edges, for example, discrete anatomical structures, such an approach might be helpful. However, there are cases where the objects of interest do not possess definite edges at their periphery or where the objects themselves are heterogeneous internally. In the case of objects with very low mean intensities, with mean intensities close to zero and intensity distributions close to or within the level of noise, any gradients surrounding the objects will almost surely exist at a very low level. In each of these cases, an anisotropic diffusion filter would demonstrate limited effectiveness. On the positive side, an anisotropic diffusion filter might be expected to help to reduce the zero-mean noise of the background, and in that respect it might aid in the use of simple thresholding to accomplish noise rejection. However, significantly reducing the background noise using an anisotropic diffusion filter may require the application of multiple iterations of the filter, which causes a significant penumbra of falsely detected voxels around the actual objects. Furthermore, anisotropic diffusion filters are highly parameter dependent, which is inconsistent with applications that desire automation and robustness. Some groups have used edge-finding algorithms themselves with no blurring to identify objects; however, as with anisotropic diffusion filters, these methods do not work when the objects do not possess external edges.
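By way of illustration only, the edge-dependent blurring described above may be sketched in one dimension as follows. The sketch assumes a Perona-Malik style formulation with an exponential conductance function, which is one common form of anisotropic diffusion and is not necessarily the form used in any particular prior method; the function name, the parameters `kappa`, `step`, and `iterations`, and the intensity values are hypothetical.

```python
import math

def diffuse_1d(signal, iterations=10, kappa=5.0, step=0.2):
    """Edge-preserving smoothing of a 1D signal (assumed Perona-Malik form).

    The conductance exp(-(gradient / kappa)^2) is near 1 for small gradients,
    so noise is blurred, and near 0 for large gradients, so edges are kept.
    """
    s = list(signal)
    last = len(s) - 1
    for _ in range(iterations):
        nxt = s[:]
        for i in range(len(s)):
            # Differences to each neighbor; zero flux at the signal ends.
            left = s[i - 1] - s[i] if i > 0 else 0.0
            right = s[i + 1] - s[i] if i < last else 0.0
            g_left = math.exp(-((left / kappa) ** 2))
            g_right = math.exp(-((right / kappa) ** 2))
            nxt[i] = s[i] + step * (g_left * left + g_right * right)
        s = nxt
    return s

# Low-amplitude noise on either side of a strong edge between indices 4 and 5.
noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 50.0, 51.0, 50.0, 51.0, 50.0]
smoothed = diffuse_1d(noisy)
left_side = smoothed[:5]
noise_range_after = max(left_side) - min(left_side)
edge_height_after = smoothed[5] - smoothed[4]
```

The result in such a sketch depends on the choice of `kappa`, `step`, and `iterations`, which reflects the parameter dependence noted above; an object whose boundary gradient is comparable to the noise would be blurred along with the background.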
Fuzzy connectedness is a method that has been used to identify objects in images. In practical terms, fuzzy connectedness requires sample points, which is also inconsistent with systems that desire automation. Fuzzy connectedness still hinges on the memberships of single voxels, each of which is the weakest link in the chain, in the terms of the creators of the method. The goal is to identify regions of voxels potentially possessing intensities within the level of the noise floor. By emphasizing the use of chains of connectedness, fuzzy connectedness does not simultaneously emphasize the relative positions of the neighbors with respect to trial voxels and relative to one another. This means that fuzzy connectedness does not make use of the a priori knowledge that one trait of real-world objects is spatial continuity. For example, real-world regions or objects in brain MRI images tend to be fat, at least compared to a long, thin, winding chain, and not full of holes. Fuzzy connectedness also does not provide the ability to weigh the size of a region against the memberships of its constituent voxels.
Another prior method, in which voxels are accepted as belonging to objects and not to background noise, determines not only whether the voxels exceed a threshold, but also whether some number of their neighbors exceed the threshold. This method is interesting inasmuch as it is an initial attempt to balance spatial extent against magnitude of voxel value, in order to establish whether a voxel belongs to a real underlying object or whether its intensity is due only to noise. Another method uses fixed dimension kernels, for example, 3×3×3 voxels, and applies a test of significance to help reduce false positives. This method is interesting for the same reasons as the prior method, but this method additionally enforces a higher degree of spatial continuity. The restricted shape and extent of the regions used by this method, however, make it less than ideal. Another method, demonstrated in the context of surveillance, divides the overall image area into fixed test areas, for example, 4×3 voxels in extent, and applies formal statistical tests to compare the intensity distributions contained by the regions at two time-points, in order to establish whether the contents of each region are the same or different from one time point to the next. This is an interesting approach for its use of formal statistical tests to decide whether a change has occurred. As above, however, the use of test areas of fixed position and size is unnecessarily limiting.
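By way of illustration only, the first method described above, in which a voxel is accepted only if it and some number of its neighbors exceed a threshold, may be sketched in two dimensions as follows. The sketch assumes an 8-connected neighborhood and a strict comparison against the threshold; the function name, the parameter `min_neighbors`, and the intensity values are hypothetical.

```python
def neighbor_supported(image, t, min_neighbors):
    """Mask of voxels above t that have at least min_neighbors
    8-connected neighbors also above t."""
    rows, cols = len(image), len(image[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= t:
                continue
            count = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # skip the voxel itself
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and image[rr][cc] > t:
                        count += 1
            mask[r][c] = count >= min_neighbors
    return mask

# A faint 2x2 object at rows 1-2, columns 1-2, plus an isolated noise spike
# at row 4, column 4 that is brighter than the object.
image = [
    [0, 0, 0, 0, 0],
    [0, 3, 3, 0, 0],
    [0, 3, 3, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 5],
]
mask = neighbor_supported(image, t=2, min_neighbors=2)
accepted = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
```

In this sketch the faint object is retained because its voxels support one another, while the brighter isolated spike is rejected, which shows spatial extent being balanced against the magnitude of the voxel value.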
Therefore, it would be desirable to have a system and method for accurately and consistently analyzing medical images that does not fall prey to the above-discussed drawbacks.