Often, two or more different data sets of information about an entity are available in two or more different modalities. A classic example occurs in the field of medical imaging. For instance, an MRI (magnetic resonance imaging) scan of an anatomical feature of a patient as well as a CT (computed tomography) scan of the same feature may be available. One of these images, most likely the MRI scan in this example, may have a higher resolution than the other image. Nevertheless, the CT image may show certain features (e.g., bone) with much better clarity than the MRI image, whereas the MRI image may show a different feature (e.g., soft tissue) with much better clarity than the CT image. Often, this is exactly why two images of the same object are obtained in different modalities.
The image(s) in one of the modalities may have a higher resolution than the image(s) in the other modality.
Another common situation in which two images (or data sets) describing the same entity are available occurs in geographic mapping. Often a relatively low-resolution terrain map of a geographic area is available as well as a higher-resolution aerial photograph of the same geographic area. For example, ten-meter resolution terrain elevation data and corresponding one-meter resolution orthophotos are available online for most of the United States from the USGS's Seamless Data Distribution System.
As yet another example, visible and thermal IR images of the same scene may have different effective resolutions because the sensors have different resolutions or because one camera is focused on a narrow part of the scene.
Several techniques are known for computationally enhancing the resolution of a given image in a given modality in order to improve the clarity of the image. One of the simpler techniques is interpolation or extrapolation. In such a technique, some form of averaging of the values (e.g., intensities) of adjacent pixels typically is performed in order to fill in (or add) pixels between those adjacent pixels. In image processing, a standard interpolation technique is bilinear interpolation. This technique uses a weighted average of the four existing pixel values to the upper left, lower left, upper right, and lower right of the location that is to be filled in. One way to think of interpolation is that it fills in values by assuming smoothness.
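By way of illustration only, the weighted average used in bilinear interpolation may be sketched as follows. The function names bilinear_sample and upsample_bilinear, the representation of the image as a list of rows, and the clamping at the image border are assumptions made for this sketch and are not taken from any particular implementation.

```python
def bilinear_sample(image, x, y):
    """Interpolate a value at fractional column x and row y.

    `image` is a list of rows of numeric pixel values (an assumed,
    illustrative representation).
    """
    height, width = len(image), len(image[0])
    # Clamp the query location to the image bounds (illustrative choice).
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    # Indices of the four surrounding existing pixels.
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    # Fractional distances from the upper-left pixel.
    fx, fy = x - x0, y - y0
    # Weighted average along the upper row, then the lower row.
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    # Weighted average between the two rows.
    return (1 - fy) * top + fy * bottom


def upsample_bilinear(image, factor):
    """Insert (factor - 1) interpolated pixels between adjacent pixels."""
    height, width = len(image), len(image[0])
    new_h = (height - 1) * factor + 1
    new_w = (width - 1) * factor + 1
    return [
        [bilinear_sample(image, c / factor, r / factor) for c in range(new_w)]
        for r in range(new_h)
    ]


if __name__ == "__main__":
    small = [[0, 10], [20, 30]]
    for row in upsample_bilinear(small, 2):
        print(row)
```

In this sketch, every filled-in value lies between the values of its four neighbors, which reflects the smoothness assumption noted above: no detail finer than the original pixel spacing is created.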
Another technique for enhancing the resolution of an image is superresolution. Usually in superresolution, multiple images of the same object in the same modality are obtained. The images might be obtained from slightly or vastly different perspectives. The images are registered with respect to each other, i.e., the translation and rotation parameters that properly align overlapping views of an object are determined so as to reconstruct from these partial views an integrated overall representation of the object. Then, the sub-pixel motions between the images, as estimated by the registration, are used to construct a single image that has a higher resolution than any of the individual images.
This process has been used with multiple images from the Hubble telescope to increase the telescope's effective resolution. A superresolution approach that uses multiple images of terrain to construct a three-dimensional surface with higher resolution than would be possible using only stereo vision with two images is described in P. Cheeseman, B. Kanefsky, R. Kraft, J. Stutz, and R. Hanson. Super-resolved surface reconstruction from multiple images. Technical Report FIA-94-12, NASA Ames Research Center, December 1994.
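By way of illustration only, the manner in which registered sub-pixel shifts may be combined into a higher-resolution result can be sketched with a simple "shift-and-add" scheme on a one-dimensional signal. This is a deliberately simplified stand-in and is not the method of the report cited above. The function name shift_and_add is hypothetical, and the sketch assumes that registration has already been performed, that each shift is a whole number of high-resolution pixels, and that each low-resolution sample is a point sample with no blur.

```python
def shift_and_add(frames, offsets, factor):
    """Combine registered low-resolution frames on a finer grid.

    frames  -- list of equal-length lists of low-resolution samples
    offsets -- offsets[i] is the shift of frames[i], in high-resolution
               pixels, as estimated by a prior registration step (assumed)
    factor  -- number of high-resolution pixels per low-resolution pixel
    """
    length = len(frames[0]) * factor
    total = [0.0] * length
    count = [0] * length
    for frame, offset in zip(frames, offsets):
        for i, value in enumerate(frame):
            # Place each sample at its registered high-resolution position.
            pos = i * factor + offset
            if 0 <= pos < length:
                total[pos] += value
                count[pos] += 1
    # Average overlapping samples; positions never observed stay None.
    return [t / c if c else None for t, c in zip(total, count)]


if __name__ == "__main__":
    fine = [0, 1, 2, 3, 4, 5, 6, 7]   # the unknown high-resolution signal
    frame_a = fine[0::2]              # low-resolution view, no shift
    frame_b = fine[1::2]              # low-resolution view, shifted by one
    print(shift_and_add([frame_a, frame_b], [0, 1], 2))
```

In this sketch, two frames that each sample the signal at half resolution recover the full-resolution signal because their sampling positions interleave. Practical superresolution methods additionally must account for sensor blur, noise, and registration error.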
Another technique is texture synthesis. In texture synthesis, a statistical model of an image texture is created from existing images that demonstrate the texture. Then, that texture model is used in a new image to fill holes with a similar, but unique, instance of that texture. For example, if it is necessary to remove a person from a field of grass in an image, one might model the grass texture from the image and then write over the person in the image with newly synthesized “grass” that is visually consistent with the other grass in the image. See, e.g., S. Zhu, Y. Wu, and D. B. Mumford. FRAME: Filters, random field and maximum entropy: towards a unified theory for texture modeling. International Journal of Computer Vision, 27(2):1-20, March/April 1998.
Surface hole filling is another technique that can be used in connection with 3-D imaging in which holes in a 3-D surface are filled in with a surface patch that is consistent with the hole border's geometry. See, e.g., J. Davis, S. E. Marschner, M. Garr, and M. Levoy. Filling holes in complex surfaces using volumetric diffusion. In Proceedings of the First International Symposium on 3D Data Processing, Visualization, and Transmission, June 2002.
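By way of illustration only, the idea of filling a hole so that the fill is consistent with the geometry of its border can be sketched on a two-dimensional grid of height values. This is a simplified analog offered under stated assumptions and is not the volumetric diffusion method of the cited paper. The function name fill_holes_by_diffusion, the use of None to mark missing values, and the fixed iteration count are all hypothetical choices made for this sketch.

```python
def fill_holes_by_diffusion(heights, iterations=500):
    """Fill missing height values by repeated neighbor averaging.

    heights -- list of rows; None marks a missing value (assumed encoding)
    """
    rows, cols = len(heights), len(heights[0])
    hole = [[heights[r][c] is None for c in range(cols)] for r in range(rows)]
    # Start each hole cell at the mean of the known values.
    known = [v for row in heights for v in row if v is not None]
    mean = sum(known) / len(known)
    grid = [[mean if v is None else float(v) for v in row] for row in heights]
    for _ in range(iterations):
        new = [row[:] for row in grid]
        for r in range(rows):
            for c in range(cols):
                if not hole[r][c]:
                    continue  # known values are never modified
                # Average the 4-connected neighbors that lie inside the grid.
                neighbors = [
                    grid[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]
                new[r][c] = sum(neighbors) / len(neighbors)
        grid = new
    return grid


if __name__ == "__main__":
    # A tilted plane (height = row + column) with its center value missing.
    surface = [[0, 1, 2], [1, None, 3], [2, 3, 4]]
    print(fill_holes_by_diffusion(surface)[1][1])
```

In this sketch, the known border values act as fixed boundary conditions, and the repeated averaging causes the filled region to settle to a smooth patch that meets the border.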
It also is known to register images from two different modalities with each other. For instance, this is known in the field of surgical navigation in which real-time infrared images of the surgical field containing a patient are obtained and registered to previously obtained images, such as CT images of the patient. This, for instance, is often used to help guide a surgical instrument in real-time towards an anatomical feature, such as a tumor.
Furthermore, a technique known as shape-from-shading is a process of recovering 3-D shape information from a single image using a known mapping from image intensities to possible object surface slopes. Shape-from-shading was originally developed by B. K. P. Horn and colleagues and is described in his textbook (B. K. P. Horn, Robot Vision, MIT Press, Cambridge, Mass., 1991). While shape-from-shading does infer a data set of one modality (depths) from a data set of a second modality (image intensities), it does not exploit values of the inferred modality that already exist, such as sparse values in a low-resolution image.
None of the aforementioned image processing techniques involves enhancing the resolution of an image in one modality based on a data set in another modality.
Accordingly, it is an object of the present invention to provide a new and improved image processing technique.
It is another object of the present invention to provide a method and apparatus for enhancing the resolution of a data set in a first modality based on information in a data set in a second modality.