1. Field of the Invention
The present invention relates to the display and analysis of quantitative data. More particularly, the invention relates to a method and computer program for simultaneously displaying very large sets of quantitative data without first standardizing or conditioning the data, while still permitting viewers to easily distinguish individual variables and values in the display, and allowing wide control over a color mapping process.
2. Description of the Prior Art
The ever-increasing use of computers has expanded the amount of quantitative data available for analysis. For example, both professional and amateur stock traders now have access to tremendous amounts of data that can be used to track and analyze stocks, and health care providers can monitor a plurality of vital statistics within an intensive care environment. As computers become more ubiquitous, the amount of quantitative data available for analysis is expected to continue to grow at an extremely rapid rate.
Quantitative data is often most useful when it is displayed so that viewers can see or visualize trends or patterns in the data. For example, stock traders often desire to view time series of several stock prices and volumes and to compare the performance of the stocks, or compare entire sectors of stocks made up of hundreds of time series.
Many different types of graphs and data mapping techniques can be used to display quantitative data for analysis purposes, including line graphs, bar graphs, area graphs, surface graphs, two, three and four-dimensional contour graphs, bubble graphs, column graphs, heatmaps, treemaps, etc. Many of these techniques can also be combined with color mapping procedures wherein data values within the graph or data map are indicated by means of a color display or color "values". Color can be used to indicate a value and also to enhance certain characteristics of the data, or to indicate priorities or alerts. Color mapping is used, for example, in line graphs, contour maps, heatmaps, treemaps, and in imaging applications including medical imaging, radar, and other sensor data display.
Unfortunately, these prior art graphs, data displaying, and color mapping techniques suffer from several limitations to their utility. One limitation is that prior art graphs are generally limited in the number of variables which they can display simultaneously while still allowing differentiation of individual variables and/or values within the graph. For example, a line graph with six or more time series or variables typically looks cluttered and the data therein becomes intertwined, making differentiation between individual time series variables difficult if not impossible. One solution is to stack multiple graphs to view simultaneously, but this solution is limited in the number of graphs that can be displayed and it can be difficult to make comparisons across the graphs.
Another limitation of prior art graphing and data displaying techniques is that, for a number of series to be graphed effectively, the data needs to be in a relatively narrow or common range. This problem occurs both in line graphs and in color contour mapping. On a line graph, when the data is not in a common range, it must be transformed, normalized, or standardized to a common scale, or a variable must be selected which is in a common range. Otherwise, some of the time series may be difficult to distinguish and may appear no different from zero. For example, if ten stock prices, a market index, and a market volume are to be graphed together and the stock prices have values ranging between $4 and $250, the market index has values ranging between 5,000 and 10,000, and market volume has values between 500 million and 2 billion, then on a common graph scale the time series with the lower values become indistinguishable from zero. One solution to this problem is to provide separate Y-axes; however, this solution is limited to graphs containing only a few variables. In many cases the observer desires to see as many variables simultaneously as possible. Another solution is to mathematically transform or normalize the data to a common range, for example, by taking the logarithm of the data, standardizing the data, or mapping the raw data to a relative index based on a reference point or to some other variable with a common range, such as percentage change. This solution is limited because observers may have difficulty inverting the transformation to determine the actual value of the raw data, which may be of interest, and because the need to choose variables with a common range severely limits the choice of data. In addition, transforming, conditioning, or normalizing data for graphing or other displays can be laborious and often needs to be done on a case-by-case basis, with the approach depending on the type and particulars of the data.
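The scale problem and the logarithmic workaround described above can be illustrated with a minimal sketch. The series names and the choice of a base-10 logarithm are assumptions made for illustration; only the value ranges come from the example in the text.

```python
import math

# Upper ends of the ranges given in the text; the dictionary keys are hypothetical names.
series_max = {
    "stock_price": 250.0,
    "market_index": 10000.0,
    "market_volume": 2.0e9,
}

# On one shared linear Y-axis, the axis top is set by the largest series.
axis_top = max(series_max.values())

# Fraction of the axis height that each series can reach at its maximum.
linear_fraction = {name: v / axis_top for name, v in series_max.items()}

# One prior art workaround: plot log10 of the data so all series share a narrow range.
log_value = {name: math.log10(v) for name, v in series_max.items()}

# The observer must mentally invert the transform (10 ** y) to recover the raw value.
recovered = {name: 10 ** y for name, y in log_value.items()}

for name in series_max:
    print(name, linear_fraction[name], round(log_value[name], 3))
```

On the linear axis the stock price never rises above a tiny fraction of the axis height, while on the logarithmic axis the plotted number no longer reads directly as a price, which is the inversion difficulty noted above.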
A related limitation, analogous to time series being indistinguishable from zero on a line graph, arises in treemaps. Treemaps are displayed with an "area-coded" variable which determines the size of a rectangle displayed on the screen, and small area-coded variables are difficult to distinguish or find on the treemap. Treemaps can be rotated to view from different angles, but the process does not guarantee that small area-coded variables will be found, and the correct angle for viewing is uncertain.
A similar limitation related to scale exists in color mapping of data using contour color mapping approaches as found in two, three, and four-dimensional contour maps, in heatmap and treemap applications, and in imaging. Contour color mapping uses the entire data space or data matrix (image) as the basis for the color process. Unless the data is in a common range, it may happen that only the most extreme colors in a color spectrum, made up of a number of colors ordered from high to low, will be used. In the example of stock prices, market index, and market volume mapped on the same contour graph or on a heatmap or treemap, the volume will use only the highest color and the small stock prices will use only the lowest color. The color mapping process loses all its detail. The prior art solutions to this problem are generally the same as those used on line graphs: mathematical transformations to a common range are used, or the choice of variables is limited to those in a common range. These approaches suffer from the same limitations as the line graph solutions: it is difficult for the observer to invert the transformed value to relate it to the raw data value, and restricting the choice of variables to those in a common range severely restricts the utility of the approach.
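As a rough illustration of this loss of detail, the following sketch bins every value into one of ten color indices using a single minimum and maximum taken over the whole data matrix. The function name, the equal-width linear binning, the ten-color spectrum, and the sample values are all assumptions chosen for illustration, not a description of any particular prior art product.

```python
def global_color_bins(series_by_name, n_colors=10):
    """Assign a color index to each value using one min/max for the whole data matrix."""
    all_values = [v for values in series_by_name.values() for v in values]
    lo, hi = min(all_values), max(all_values)
    span = hi - lo
    bins = {}
    for name, values in series_by_name.items():
        bins[name] = [
            # Equal-width bins; the top value is clamped into the last bin.
            min(int((v - lo) / span * n_colors), n_colors - 1) if span else 0
            for v in values
        ]
    return bins


# Hypothetical sample points inside the ranges given in the text.
data = {
    "stock_price": [4.0, 120.0, 250.0],
    "market_index": [5000.0, 7500.0, 10000.0],
    "market_volume": [7.0e8, 1.3e9, 2.0e9],
}

bins = global_color_bins(data)
for name, indices in bins.items():
    print(name, indices)
```

With these sample values, every stock price and every index value falls into the lowest color index, so their movements are invisible in the color display.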
Another limitation related to scale occurs with long and trending time series. When a narrow area (domain of time) of a trending series is viewed on a line graph with the scale or Y-axis set for the full data set, the area viewed appears flat. A similar problem occurs in color mapping: when a small domain of the time series is viewed with the color process based on the full time series domain, the part viewed utilizes only a narrow band or a few colors of the color spectrum, the analog of appearing flat on a line graph.
A related limitation is the effect of outlier data points on color mapping. An outlier is a value within a data set that is significantly different from the range of the rest of the data, e.g., beyond plus or minus 3 standard deviations. Outliers affect the color mapping by "absorbing" many colors. That is, the outlier will be assigned the highest (or lowest) color in an ordered color spectrum, many colors between the outlier and the rest of the data will go unused, and relatively few colors will remain to differentiate the range where most of the data is located. One solution is to provide methods to remove the outliers. Another solution is to index the outliers under a mathematical transformation. However, these solutions are limited in their scope and capability.
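The way a single outlier "absorbs" colors can be seen in a small sketch. It assumes a simple linear, equal-width number-to-color mapping over ten colors; the function name and the synthetic data are hypothetical and serve only to illustrate the effect.

```python
def color_indices(values, n_colors):
    """Linear min/max mapping of each value to a color index in 0..n_colors-1."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0 for _ in values]
    return [min(int((v - lo) / span * n_colors), n_colors - 1) for v in values]


# Synthetic bulk data: 100 evenly spread values from 0 to 99.
bulk = [float(v) for v in range(100)]
# The same data with one extreme value appended.
with_outlier = bulk + [10000.0]

idx_clean = color_indices(bulk, 10)
idx_out = color_indices(with_outlier, 10)

# Number of distinct colors available to the bulk of the data in each case.
colors_clean = len(set(idx_clean))
colors_bulk_with_outlier = len(set(idx_out[:-1]))

print(colors_clean, colors_bulk_with_outlier, idx_out[-1])
```

Without the outlier the bulk of the data is spread across all ten colors. With the outlier present, the bulk collapses into a single color, the outlier takes the top color, and the colors in between go unused.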
Another limitation of prior art graphing and data displaying techniques is that they cannot effectively display data sets that are highly variable, or spiky, with many extremes. The many data points representing the extreme highs and lows can be difficult to visualize and can hide the remaining middle of the data space. For example, in three and four dimensional surface and contour graphs, many extreme highs and lows create a surface with a large number of peaks and valleys (spikes) at the extremes. The extremes themselves can be difficult to see and the middle of the data space can be hidden from view by the surface effects. One solution is to rotate the graph to view it from many angles, but finding the correct angle is uncertain and even when rotated many of the spikes are difficult to distinguish and the middle values are still difficult to view. The data can be normalized to a common range, however that suffers the same limitations as previously described.
Another limitation of color mapping processes as used in heatmap and treemap applications is that they graph only a single time point or observation for each time series. This approach makes it impossible to see local trends. One solution has been to show, for example, the change in a stock price from some specified point to another point in time. However, this solution is limited and fails to show the local detail. The observer often desires to see many time series as well as many time points and trends simultaneously and to see them for many differently scaled data sets.
Another limitation in prior art color mapping has been the limited controls available to affect the number-to-color mapping process. Prior art graphs and data maps provide controls to change the color of a line or to switch between one color spectrum and another. However, color mapping of data has several sub-processes at work, including the process of color spectrum construction, the definition of the data set or area of the data set used as the basis for color mapping, and the number-to-color mapping function. Each sub-process can provide controls which, when combined, make a huge range of effects possible with great utility in the display of data.
Another limitation of prior art graphing and data displaying techniques is that the images created are generally limited and static. The data in the graph or map is selected, any sorting or arrangement is made, and the image is then constructed. Data may be provided on an updating basis with varying alerts or values indicated, but the variables on the graph or map are set and limited. An area of the graph may be selected to view, changes may be made to the axes (for example, show on a logarithmic scale), and choices may be made from predefined color spectrums where color is part of the display process, but other than these few choices, the image is fixed. The program user or data observer cannot "fly" over data to search for and visualize patterns and trends in virtually unlimited data spaces dynamically changing by characteristics of the data and by observer controls.
The present invention solves the above-described problems and provides a distinct advance in the art of quantitative display techniques. More particularly, the present invention provides a method and computer program for simultaneously displaying large and even unlimited data spaces of quantitative data without suffering from the limitations described above. This is achieved by combining methods to construct data grids of a virtual data space, with methods and controls for a number-to-color mapping process, with methods and controls for "movement", and with a range of mathematical operations.
The method and computer program of the present invention permit the simultaneous display of very large and even unlimited databases of time series or other ordered data sets so that a viewer can simultaneously visualize trends and patterns in the data. The computer program and method achieve the foregoing while still permitting viewers to easily distinguish individual variables and values in the data space and while simultaneously displaying many variables as well as many data points over time.
The method and computer program also permit the simultaneous display of data sets with widely different ranges and a high degree of variability without first conditioning, transforming, normalizing, or standardizing the data sets. This permits observers to readily compare different data sets without the need for significant pre-conditioning of the data or selection of variables in a common and narrow range.
The method and computer program also permit an observer to interactively and dynamically reorder, sort, categorize, and transform a displayed data space or parts of it. The invention permits the observer to apply any of a variety of mathematical operations to create new data spaces or to modify the display of the current data space.
The method and computer program also permit an observer wide control over a number-to-color mapping process (associating a particular color with a particular numeric value). This is accomplished with controls to construct or select color spectrums, controls to change the spectrum display in ways that emphasize or hide certain data or certain categories of data as well as reveal patterns in the data, controls to select subsets of the data space to color map independently, controls to set the particular number-to-color mapping function, and controls to select a retrospective or an animated real-time color mapping mode.
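One way to picture color mapping a subset of the data space independently is sketched below, where each series is treated as its own subset and mapped with its own minimum and maximum. This is only an illustrative assumption: the function name, the linear equal-width mapping, the ten-color spectrum, and the sample values are hypothetical and are not taken from the detailed description.

```python
def per_series_color_bins(series_by_name, n_colors=10):
    """Map each series to color indices using that series' own min/max as the basis."""
    bins = {}
    for name, values in series_by_name.items():
        lo, hi = min(values), max(values)
        span = hi - lo
        bins[name] = [
            # Equal-width bins within this series only; the top value is clamped.
            min(int((v - lo) / span * n_colors), n_colors - 1) if span else 0
            for v in values
        ]
    return bins


# Hypothetical sample points with widely different ranges; raw values are left untouched.
raw = {
    "stock_price": [4.0, 120.0, 250.0],
    "market_index": [5000.0, 7500.0, 10000.0],
    "market_volume": [5.0e8, 1.25e9, 2.0e9],
}

independent_bins = per_series_color_bins(raw)
for name, indices in independent_bins.items():
    print(name, indices)
```

Because only the color basis changes, each series can span the full spectrum while the underlying raw values remain available to the observer without any inverse transformation.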
The method and computer program also permit an observer to view a display of quantitative data within a "data space" and to fly over and through a data space built with perspectives and three-dimensional effects. The result is to move the observer from the mindset of "viewing a graph" to one of "flying through a data space". Movement can be vertical (to see more or less data or surface area on the screen), or horizontal (through time and across variables), and the motion can be combined. Movement controls and flying beyond the edges of the screen allow virtually unlimited data spaces to be accessed and visualized.
These and other important aspects of the present invention are described more fully in the detailed description below.