In the early 1970s, traditional projected X-ray pictures were supplemented by computerized tomography (CT), which digitally reconstructs a set of slices (as if from a microtome) from multi-directional views. This led to much research into methods of presenting such information visually in 3D, whether by volume rendering the data directly, using various algorithms, or by surface extraction, which represents such shapes as a liver or a tumor by a geometric structure composed of polygons or triangles that on many computers can be colored, lit and displayed more quickly. Soon after the appearance of CT, other 3D imaging modalities were developed, such as magnetic resonance (MR) imaging, positron emission tomography (PET), single-photon emission computed tomography (SPECT) and ultrasound (US), which first presented single slices under user control of position and angle and was later integrated into the creation of 3D data sets. Beyond the realm of medical applications, other modalities such as seismography and electromagnetic geological sensing are also important sources of volume data, as are data acquisition technologies in numerous other fields.
Different modalities yield different information. For example, in medical diagnostics, CT shows bone more clearly than does MR, but MR can identify tumor tissue. This led to a desire for multi-modal imaging so that, for example, a surgeon can see where a tumor is relative to features on the bone that can guide her in orienting herself and navigating through the patient's body. The first requirement for such multi-modality is registration, which brings the 3D images into alignment. Registration is a non-trivial problem: not only does each modality report in a different system of (x, y, z) coordinates, depending on the sensing device and its location, but each often warps the data differently, so that straight lines in one data set often correspond to curves in another. Warping is intrinsic to some modalities. For example, the ideal linearly-varying magnetic field strength for magnetic resonance imaging is mathematically impossible in a finite device. To a considerable extent such warping is automatically compensated within the scanner in the controlled environment of medical devices, but this is less true for seismic scans. In medical applications a perfect match may not exist, for example if one data set is from a reference brain and the other is from a patient whose brain is distorted by a tumor; one then has an optimal matching problem. For simplicity, situations are addressed herein in which the registration problem for two or more 3D data sets is assumed to have been solved.
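As a minimal sketch of the simplest case of registration, the following Python fragment maps a point from one modality's (x, y, z) coordinates into another's by a rigid transform (a rotation followed by a translation). The function name `apply_rigid`, the rotation matrix and the shift are hypothetical values chosen for illustration only; they are not taken from any particular scanner, and the warping discussed above would require a non-linear mapping that this sketch does not attempt.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def apply_rigid(point: Point,
                rotation: Sequence[Sequence[float]],
                translation: Point) -> Point:
    """Map a point into another coordinate system: rotate, then translate."""
    x, y, z = point
    return tuple(
        rotation[i][0] * x + rotation[i][1] * y + rotation[i][2] * z
        + translation[i]
        for i in range(3)
    )


# Assumed transform for illustration: a 90-degree rotation about the z axis
# followed by a shift. A real registration would estimate these values.
ROT_Z_90 = [
    [0.0, -1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]
SHIFT = (10.0, 0.0, -5.0)

mr_point = (1.0, 2.0, 3.0)                       # a location in MR coordinates
ct_point = apply_rigid(mr_point, ROT_Z_90, SHIFT)  # same location in CT coordinates
```

Once such a mapping is known for every point, data associated with the same transformed coordinates can be taken to describe the same physical location, which is the situation assumed herein.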
An additional hurdle is that of image fusion, which displays the combined multi-modal information to a user. Whereas registration has an observer-independent criterion of correctness, i.e., data associated with the same coordinates should describe properties at the same physical location in the patient, image fusion raises questions of perceptual psychology. There may be a large variety of data to display, including not only the numbers in the original scan but also derived quantities such as porosity, as well as 3D objects of other kinds, such as extracted surfaces, center curves of blood vessels, possible implants, etc. Hence the term data modalities, as distinct from sensing modalities, is utilized herein to include such additional objects of attention. For a surface constructed, for example, as the boundary of a particular type of tissue, registration is not a concern inasmuch as the surface was constructed with reference to the coordinates of the volume data set from which it was extracted. Nonetheless, the different information conveyed by a surface rendering of one data set and a volume rendering of another requires coordination in the display as well as in a user's mind. Comparing the amount of information to be displayed with the number of display channels available at each point (usually red, green, blue and transparency) shows that it is simply impractical to display all data at every point.
A shader is a rule assigning display properties to particular values of a data set. It determines attributes such as color, transparency, simulated light reflection properties, etc. at each point, and is normally chosen so as to make the differences within and between data sets perceptible, such as the position, size and shape of a particular region of interest, e.g., a region occupied by cancer cells or by petroleum shale. Many alternative shader choices may be useful for the same data set. For example, a rule such as “make everything except bone transparent and hence invisible” results in the display of the skull of a scanned patient as if it were removed from the surrounding flesh, while a rule such as “show all points with values high enough to be tissue and not empty air” displays the skin surface as the boundary of an opaquely rendered region. In addition, because hair data often average to a density value which is hard to distinguish from noise, such displays commonly make a patient appear bald. FIG. 1 illustrates that for a complex 3D object various different renderings are possible, each with advantages and disadvantages. In the example of FIG. 1, two different renderings 101 and 102 may reveal different features: rendering 101 shows a surface or solid structure 110 with a faintly visible core 111, while rendering 102 shows a visible core 120 with other parts of the model suppressed. Thus, the choice of shader is another source of co-registered views, requiring no input from multiple sensing modalities, whose mutual relations may be important to a user.
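The two example rules quoted above can be sketched as functions from a density value to a red, green, blue and transparency (alpha) tuple. This is only an illustrative sketch: the function names, the threshold values `BONE_THRESHOLD` and `AIR_THRESHOLD`, the colors and the sample densities are all assumptions made for the example and are not specified herein.

```python
from typing import Tuple

RGBA = Tuple[float, float, float, float]  # red, green, blue, opacity in [0, 1]

# Assumed thresholds for illustration only; real values depend on the scanner
# and on the units in which it reports density.
BONE_THRESHOLD = 300.0   # at or above this, a sample is treated as bone
AIR_THRESHOLD = -500.0   # at or below this, a sample is treated as empty air

TRANSPARENT: RGBA = (0.0, 0.0, 0.0, 0.0)


def bone_only_shader(density: float) -> RGBA:
    """'Make everything except bone transparent and hence invisible.'"""
    if density >= BONE_THRESHOLD:
        return (1.0, 1.0, 0.9, 1.0)  # opaque off-white
    return TRANSPARENT


def skin_surface_shader(density: float) -> RGBA:
    """'Show all points with values high enough to be tissue and not empty air.'"""
    if density > AIR_THRESHOLD:
        return (0.9, 0.7, 0.6, 1.0)  # opaque skin tone
    return TRANSPARENT


# Illustrative samples: empty air, soft tissue, bone.
samples = [-1000.0, 40.0, 700.0]
bone_view = [bone_only_shader(d) for d in samples]
skin_view = [skin_surface_shader(d) for d in samples]
```

Applied to the same three samples, the first rule leaves only the bone sample opaque, while the second renders both the soft tissue and the bone opaquely, so that the two shaders yield two different co-registered views of one data set.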
One possible multimodal display method is to allow different data modalities to control different color components of the displayed image, in what is termed a “false color” display. For instance, CT-reported density (high in bone) might control the red component, while MR values (with settings adjusted to tumor sensitivity) might control the green. However, it is a fact of psychophysics that red and green light combine to create a distinct sensation of yellow in a viewer's perception, as if there were a light source at a single yellow frequency intermediate between red and green. Notwithstanding that fact, users do not directly perceive yellow as “red and green combined.” Therefore, an association of yellow with “cancerous bone” must be learned in a particular application, rather than arising naturally from the user's knowledge that “red is bone” and “green is cancer”. False color thus places new and different training demands on a user for each application, rather than being a solution which is transferable across various 3D data display environments. Further, since human eyes are limited to three classes of color receptors, false color cannot fully represent four simultaneous data modalities. It is therefore necessary in many situations to switch between displays of different data modalities, while retaining, to the extent possible, the context provided by the mutual registration of all the modalities. This need is addressed by the present invention.
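The false color scheme described above, in which CT density controls red and MR values control green, can be sketched as follows. The function names and the value ranges used for normalization are assumptions made for illustration, and the sketch is not a description of the display method of the present invention.

```python
from typing import Tuple

RGB = Tuple[float, float, float]


def normalize(value: float, lo: float, hi: float) -> float:
    """Scale value from the range [lo, hi] to [0, 1], clamping at the ends."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    t = (value - lo) / (hi - lo)
    return min(1.0, max(0.0, t))


def false_color(ct_density: float,
                mr_signal: float,
                ct_range: Tuple[float, float] = (0.0, 1000.0),  # assumed range
                mr_range: Tuple[float, float] = (0.0, 1.0)      # assumed range
                ) -> RGB:
    """Let CT density drive the red channel and the MR value drive green."""
    red = normalize(ct_density, *ct_range)
    green = normalize(mr_signal, *mr_range)
    return (red, green, 0.0)  # blue is unused in this two-modality sketch


bone_only = false_color(1000.0, 0.0)       # pure red
tumor_only = false_color(0.0, 1.0)         # pure green
bone_and_tumor = false_color(1000.0, 1.0)  # red plus green, seen as yellow
```

The third case shows the difficulty noted above: a point that is high in both modalities is displayed with full red and full green, which a viewer perceives as yellow and must learn to read as “cancerous bone”.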