1. The Field of the Invention.
This invention relates generally to the organization and understanding of data generated in multi-dimensional space. More specifically, it relates to the visualization of multi-dimensional data points plotted in a parallel coordinate system. This enables an observer to more readily identify relationships within that data by relative movement through and around those data points with the assistance of a computer and its output display.
2. The State of the Art.
Problem solving in multi-dimensional space is an increasingly important field of research. For example, there are presently at least 45 university, 22 government, 11 commercial, and 5 military-based research centers devoted to studying and developing techniques and technologies for visualizing data plotted in multi-dimensional space. As would be expected, most of these efforts are directed at three-dimensional (3D) visualization of essentially 3D or two-dimensional (2D) data sets. These efforts include 3D plots, graphs and projections of data, as well as 3D animation. Some of these efforts include multi-dimensional data sets and 3D representations of the data. A few are even known to utilize the concept of parallel coordinate space in a limited number of dimensions.
It required an advance in science and art as great as the Renaissance to bring about the development of a true three dimensional visual perspective in art. Likewise, the reality of Euclidean geometry was upended with the development of Riemannian geometry which allowed for curved space and higher dimensional geometries. However, although Riemannian geometry ushered in the concept of the fourth dimension, it was not easily accepted because four-dimensional geometry could not be seen or visualized. It is noteworthy that Michio Kaku, an expert in the field of higher dimensional space and author of HYPERSPACE: A SCIENTIFIC ODYSSEY THROUGH PARALLEL UNIVERSES, TIME WARPS, AND THE 10TH DIMENSION, stated in 1994 that "[h]igher dimensional spaces are impossible to visualize; so it is futile even to try."
That the experts in the field of multi-dimensional space are in agreement that multi-dimensional space cannot be visualized is not surprising. It is well known that we are unable to see more than three mutually perpendicular coordinates at the same time. Presently, the best that we can do is to utilize the techniques pioneered by Edwin A. Abbott and mathematicians Charles Hinton and later Thomas Banchoff. They developed the concept of using insights gained in one dimension to understand the next. In this context, the development of the hypercube and tesseract enables us to see projections in three-dimensional space of four-dimensional orthogonal space objects. However, direct observations of the four-dimensional objects are still beyond our ability to visualize when using orthogonal representations. Efforts to overcome this inability have therefore focused on the development of computer graphics engines which can simulate higher dimensional geometry. See generally Beyond the Third Dimension: Geometry, Computer Graphics and Higher Dimensions, Thomas F. Banchoff, Scientific American Library, New York, 1996.
A recently developed technique for visualizing n-dimensional spaces utilizes a parallel coordinate system. FIG. 1 shows the conventional three-dimensional orthogonal (Cartesian) coordinate system and a point P1 plotted with respect to all three axes. FIG. 2 shows the same point in a parallel coordinate system where any number of dimensions (which is to be referred to hereinafter as "n" dimensions) are presently shown as separate and parallel axes perpendicular to a base axis in conventional 2D display or diagrammatic space.
In parallel coordinate space, the point P1 in conventional 3D coordinate space of FIG. 1 maps into a set of points (x1, y1, z1) as shown in FIG. 2. FIG. 2 is very typical of the Display Space representation used by Inselberg in that the set of points is now represented as a jagged line drawn so as to connect them. Disadvantageously, this diagrammatic convention adds jagged line clutter such as shown in FIG. 4. Moving visually with respect to the data space minimizes this clutter. The lines connecting the data points are nevertheless "artifact": they are not data, but have been added to facilitate visual correlation between the ordered data points.
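By way of illustration only, the mapping of a single orthogonal-space point into a connected set of parallel-coordinate points can be sketched as follows. The function names, the unit axis spacing and the sample point are assumptions introduced for this sketch and do not appear in the cited work.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[float, float]  # (horizontal position of the axis, value along the axis)


def to_parallel_polyline(point: Sequence[float], axis_spacing: float = 1.0) -> List[Vertex]:
    """Map one n-dimensional point to the vertices of its parallel-coordinate polyline.

    Axis i is assumed to be a vertical line at horizontal position i * axis_spacing;
    the i-th coordinate value is plotted as a distance along that axis.
    """
    return [(i * axis_spacing, float(value)) for i, value in enumerate(point)]


def connecting_segments(vertices: Sequence[Vertex]) -> List[Tuple[Vertex, Vertex]]:
    """Return the line segments joining successive vertices (the display artifact)."""
    return list(zip(vertices, vertices[1:]))


# Hypothetical sample point P1 = (x1, y1, z1).
p1 = (2.0, 5.0, 3.0)
vertices = to_parallel_polyline(p1)
segments = connecting_segments(vertices)
print(vertices)
print(len(segments))
```

The vertices carry the data; the segments are purely a display choice and may be omitted without losing any information about the point.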
Understanding the concept of artifact is central to understanding the benefits of the present invention. Therefore, separating what is shown in Display Space from what is shown in Data Space is important in visualization methodologies. Use of extra lines such as those showing correlation between particular data points (as done by Inselberg et al.), distance between coordinate axes, orientation (angle from or between) coordinate axes, color of those coordinate axes are all vital parameters for use (or for not being used) in Display Space. Additionally, each of these parameters in Display Space can be made a function of one or more of the coordinate values (or other values such as probability) in the Data Space.
In order to better visualize artifact, specific significance in the Display Space methodology used in the present invention is accorded to the introduction of a "vertical" displacement of the values for each coordinate, shown in perspective view in FIG. 12A. This is in distinction from the displacement "along" the coordinate of the Data Space values as used by Inselberg et al. for the same data as shown in FIG. 12B. Inselberg's convention for showing coordinate values xi, for example, is to have each displacement xi shown as a distance along the coordinate rather than as a displacement "above" the coordinate as shown in FIG. 12A. In this convention, d, k and h are all constants for any given display instance.
In Display Space as shown in FIG. 12A, the coordinate values (x1, x2, x3 . . . ; y1, y2, y3 . . . ; z1, z2, z3 . . . ; w1, w2, w3 . . . ; etc.) are shown in perspective and displaced vertically with respect to each coordinate axis by the distance h, which is the actual (or normalized) height corresponding to each value xi, yi, etc. Each data point is separated from the previous data point by a common value d along the coordinate axes x, y, z, w, etc.
This Display Space convention is quite different from that employed by Inselberg et al., in which the data points xi, yi, zi, wi, etc. are plotted on their coordinate axes at displacements along the axes that are proportional to their values xi, yi, zi, wi, etc., as shown in FIG. 12B.
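The difference between the two Display Space conventions can be sketched, by way of example only, as two layout functions. The sketch assumes that k denotes the separation between adjacent parallel axes, which the text above does not state explicitly. The function names, the height_scale parameter and the sample records are likewise hypothetical.

```python
from typing import List, Sequence, Tuple

Point2 = Tuple[float, float]
Point3 = Tuple[float, float, float]


def vertical_displacement_layout(
    records: Sequence[Sequence[float]],
    d: float = 1.0,
    k: float = 1.0,
    height_scale: float = 1.0,
) -> List[List[Point3]]:
    """Layout in the manner of FIG. 12A (assumed interpretation).

    Record i is stepped a distance i * d along every axis, axis j is offset
    by j * k from the first axis (assumption), and the value itself becomes
    the height h above the axis.
    """
    return [
        [(i * d, j * k, value * height_scale) for j, value in enumerate(record)]
        for i, record in enumerate(records)
    ]


def along_axis_layout(records: Sequence[Sequence[float]], k: float = 1.0) -> List[List[Point2]]:
    """Layout in the manner of FIG. 12B: each value is a distance along its own axis."""
    return [[(j * k, float(value)) for j, value in enumerate(record)] for record in records]


# Two hypothetical records, each with coordinates (x, y).
records = [(1.0, 4.0), (2.0, 3.0)]
vertical = vertical_displacement_layout(records)
along = along_axis_layout(records)
print(vertical[1])
print(along[1])
```

In the first layout the record index i fixes the position along every axis, so values belonging to the same record line up without connecting lines. In the second layout that correspondence is lost unless the connecting lines are drawn.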
In addition, Inselberg et al. connect the data points (x1, y1, z1, w1, etc.), (x2, y2, z2, w2, etc.), (x3, y3, z3, w3, etc.), etc. with a display space line for each point in the parallel coordinates. Showing this correlation between points 1, 2, 3, . . . by use of lines connecting these points is sometimes useful, but it is usually confusing in both Inselberg's static views of his parallel coordinate spaces and in our dynamic views of parallel coordinate spaces. When using Inselberg's convention, artifact lines connecting the related points (x1, y1, z1, w1, etc.) are necessary in order to show which of the data points are those corresponding to a given single point in orthogonal space. In contrast, with the convention used in the present invention, this relationship is self-evident to the viewer without the artifact of extra lines drawn in the diagram. Consequently, showing these data connecting lines, showing the coordinate axes themselves, or showing lines along the axes connecting the data points of each coordinate becomes a Display Space choice that can and should be made uniquely for each case study.
Therefore, it should be understood that connecting lines and the coordinate axes are display artifact. Artifact is any graphical construction used in Display Space that is not data point identification. Examples of display artifact or display space artifact are lines used to interconnect data points along the direction of coordinates, lines used to interconnect the several different values for a given data point (xi, yi, zi, wi, etc.) along the coordinates, the coordinate axes themselves, the elevation corresponding to the value (magnitude) of each point for each coordinate, the separation of these points along the direction of the coordinate axes, the separation of the coordinate axes one from another, and the brightness, the color or the occultation frequency of any of these prior items of display artifact, including the "dot" representing the data point itself.
This extensive list of examples indicates that such artifact needs to be used with discretion. This is because the vital information is the Data Space points themselves, which can be shown alone in Display Space or in conjunction with any artifact of choice, such as those already mentioned.
It is also possible to assign values to artifact. For example, the distances d in FIG. 12A could be assigned as a function of a dimension w, the orientation (angle) of the coordinate axis of x could be a function of dimension u, the separation between the parallel coordinate axes for x, y, z, w, etc. could be a function of a dimension v, or the color of selected data points could be a function of a probability of those selected data points meeting some selected value or criterion from the Data Space.
It should now be apparent that the distances h, k and d as shown in FIG. 12A are also artifact in Display Space. Any of these artifacts, including all the types of artifact described previously, can also be made a function of any or some of the data (from Data Space). Of particular importance is color. Color assignments to data points or to interconnecting line artifact can be made a function of any of the data (coordinate, dimension or parameter) values in Display Space. Such assignments can significantly enhance (or detract from) visualization cognition. Visualization cognition can also be enhanced by occulting selected data points in Display Space according to selected criteria or data from within the Data Space behavior of the system being displayed.
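Making artifact a function of data can be sketched, for illustration only, with two small functions: one derives a color from a data value such as a probability, and the other derives the spacing d from a dimension w. The blue-to-red ramp, the linear spacing rule and every name and constant below are assumptions of this sketch rather than part of the described method.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[float, float, float]


def value_to_rgb(value: float, lo: float, hi: float) -> RGB:
    """Blend linearly from blue (at lo) to red (at hi); out-of-range values are clamped."""
    if hi == lo:
        t = 0.0
    else:
        t = min(1.0, max(0.0, (value - lo) / (hi - lo)))
    return (t, 0.0, 1.0 - t)


def spacing_from_dimension(
    w_values: Sequence[float], base: float = 1.0, gain: float = 0.5
) -> List[float]:
    """Return positions along an axis where the gap before point i is base + gain * w_i."""
    positions: List[float] = []
    position = 0.0
    for w in w_values:
        position += base + gain * w
        positions.append(position)
    return positions


# Hypothetical probabilities and w values for three data points.
probabilities = [0.0, 0.5, 1.0]
colors = [value_to_rgb(p, 0.0, 1.0) for p in probabilities]
positions = spacing_from_dimension([0.0, 2.0, 4.0])
print(colors)
print(positions)
```

Any other monotonic mapping could be substituted; the point of the sketch is only that the artifact parameter is computed from Data Space values rather than fixed.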
FIG. 2 also shows that simply by adding more parallel coordinate axes, the parallel coordinate system can be extended so as to map many dimensions as represented by the axes labels (x, y, z, . . . w). Thus, a "w" dimensional point PW1 from orthogonal space (but not shown in FIG. 1) is mapped into parallel coordinate space by the line connecting the set of points (x1, y1, z1, . . . w1) as shown in FIG. 2. The lines connecting these wi points in parallel coordinate representations are drawn, as in Inselberg's display convention, to maintain visual correlation between successive points i.
An important distinction needs to be made regarding the display of multi-dimensional information. Multi-dimensional informational data spaces are to be assumed to be displayed visually in conventional (also referred to as orthogonal) three dimensional perspective display space. Accordingly, any diagrammatic methodology such as a computer output display device can be used to represent, in the display space, the data spaces of interest.
Several different techniques are utilized to represent multi-dimensional data in data space. For example, Cartesian methods are known to those skilled in the art, as are the hypercube and tesseract extensions thereof. There are also parallel coordinate methods, and newer methods such as arbitrary coordinate mappings. FIG. 3 is an illustration of data space mapped in an arbitrary coordinate mapping. FIG. 3 shows that in the data space of interest, the axes which are shown as being parallel in parallel coordinate methods can be arranged in arbitrary positions relative to each other. Furthermore, the arrangement of axes can be accomplished in conventional two dimensional or three dimensional display space. FIG. 3, for example, shows that the axes (also known as coordinates or dimensions) can be arranged with any desired orientation among them. For example, note that the x' axis in FIGS. 2 and 3 is the same x axis of the orthogonal coordinate system of FIG. 1.
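An arbitrary coordinate mapping of this kind can be sketched, by way of example only, by giving each axis its own origin and orientation in two dimensional display space. The Axis class, its fields and the sample axes are hypothetical and are not taken from FIG. 3.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point2 = Tuple[float, float]


@dataclass(frozen=True)
class Axis:
    """One coordinate axis placed arbitrarily in 2D display space (assumed model)."""

    origin: Point2
    angle_deg: float  # orientation measured counterclockwise from the horizontal

    def place(self, value: float) -> Point2:
        """Return the display position of a value measured along this axis."""
        angle = math.radians(self.angle_deg)
        return (
            self.origin[0] + value * math.cos(angle),
            self.origin[1] + value * math.sin(angle),
        )


def map_point(point: Sequence[float], axes: Sequence[Axis]) -> List[Point2]:
    """Place each coordinate of an n-dimensional point on its own axis."""
    if len(point) != len(axes):
        raise ValueError("one axis is required per coordinate")
    return [axis.place(value) for axis, value in zip(axes, point)]


# Three hypothetical axes: one horizontal, one vertical, one at 45 degrees.
axes = [Axis((0.0, 0.0), 0.0), Axis((3.0, 0.0), 90.0), Axis((6.0, 1.0), 45.0)]
placed = map_point((2.0, 5.0, 3.0), axes)
print(placed)
```

Setting every angle to 90 degrees and spacing the origins evenly along the horizontal recovers the ordinary parallel coordinate arrangement as a special case.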
The purpose of these apparently arbitrary orientations could be to highlight or emphasize particular relationships of interest which occur within the data itself. In certain instances, three dimensional orthogonal space orientations of the several coordinates of interest could be utilized to represent the data space.
In econometrics and other business or sociological applications, for example, the many dimensions or variables of interest may not be known to be orthogonal (mathematically independent). Thus, preservation of geometric properties among the mappings shown in FIGS. 1, 2 and 3 is not mathematically relevant. However, when particular relationships or structures are discovered in the data of interest, it could be that such relationships are clarified or made more evident by particular (and no longer arbitrary) orientations of the coordinates representing those variables of interest.
An alternative method of clarifying, emphasizing or discovering important relationships in the data space is to alter a sequence of the variables. In a similar manner, the orientation or separation of the coordinate axes can also be changed for the same purpose.
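Altering the sequence of the variables amounts to permuting the order in which the coordinate axes are laid out. A minimal sketch follows; the function name, variable names and sample records are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def reorder_variables(
    records: Sequence[Sequence[float]],
    names: Sequence[str],
    new_order: Sequence[str],
) -> List[Tuple[float, ...]]:
    """Return the records with their coordinates rearranged into new_order."""
    if sorted(names) != sorted(new_order):
        raise ValueError("new_order must be a permutation of names")
    indices = [list(names).index(name) for name in new_order]
    return [tuple(record[i] for i in indices) for record in records]


# Hypothetical records over the variables x, y, z, w.
names = ["x", "y", "z", "w"]
records = [(1.0, 2.0, 3.0, 4.0), (5.0, 6.0, 7.0, 8.0)]
reordered = reorder_variables(records, names, ["w", "x", "z", "y"])
print(reordered)
```

The data are unchanged by the permutation; only the adjacency of the axes in Display Space differs, which may bring a relationship between two particular variables side by side.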
Those skilled in the art of alternative mappings of data space also recognize that if there are relationships within the data in one of the mappings, there will also be some kind of relationship shown in each of the other mappings. The visual appearance of those relationships will usually be quite different when different mappings are used. Nevertheless, relationships should be recognizable despite the changes. Furthermore, depending upon the relationships of interest, different mappings, orientations and sequences of coordinates can also be utilized to facilitate discovery, recognition or understanding of these relationships.
In display space, practical implementations of these data mappings on two dimensional static or printed diagrams or displays, or non-animated three dimensional models or computer displays, are hampered by the limitations of the display methodologies. FIG. 4 shows, on a two dimensional piece of paper, a large number of data points plotted using parallel coordinates. This figure is taken from an article by Alfred Inselberg and Bernard Dimsdale entitled "Parallel Coordinates, A Tool For Visualizing Multivariate Relations." Multi-dimensional data in parallel coordinates is typically as visually complicated as shown in FIG. 4 when printed or displayed in two dimensional display space. This limitation of two and three dimensional representations of parallel coordinate space is well illustrated in Inselberg's publications. In his public presentations, Inselberg has stated that comprehension, when there are more than six or seven dimensions, gets very difficult even with (his forms of displaying) parallel coordinates. It is arguable that the mapping of FIG. 4 has exceeded the human comprehension threshold.
It would therefore be an improvement over the state of the art to provide a new intellectual structure for visualizing data plotted in a parallel coordinate system representative of multi-dimensional data. The new intellectual structure should help an observer recognize relationships or structure when they exist within the data, regardless of the number of dimensions in which the data exists.