This invention relates to the field of computer user interfaces and, more specifically, to a user interface providing information visualization with an interactive control that allows a user to alter the visualization as desired.
In this age of information, vast amounts of data are continuously being gathered in many different industries. There is a seemingly endless number of data sources, including mobile phone calls, airplane take-offs and landings, computer help desk calls, downloading of movies or music, and many others. It is desirable to analyze the data so that relationships may be determined, resources may be more effectively allocated, and customers and users can be better served.
Determining a relationship within a given data set is generally a difficult task, especially when the data is distributed in a nonuniform or nonlinear way. Many existing interactive systems for helping people understand or manipulate data employ linear interpolation techniques, which make the implicit assumption that the data are uniformly distributed within their range from the minimum data value to the maximum data value. However, one should not assume real-world data sets to be distributed uniformly across their range. In fact, it is rare for data that is sampled from real-world problems and applications to be uniformly distributed.
Linear interpolation techniques are well known to be susceptible to a variety of quantization errors for data sets that are not distributed uniformly. Two specific examples of this problem occur in interactive controls for specifying a single data value and in graphs, charts, and other visualization techniques in which data values are displayed using a range of colors or gray values.
As a first specific example, a typical task when understanding or manipulating data is to specify a single data value from within the range of values in a data set, such as when setting a threshold value for a filter. User interfaces that allow people to specify a single data value typically employ interactive controls such as slider bars or dials. Such interactive controls typically use a standard linear mapping between the slider's marker position and data values within the data range.
Linear mappings are most effective when the data are distributed uniformly, because then equal movements of the slider bar specify equal-sized portions of the data range. Linear mappings are less effective when the data set is distributed nonuniformly. In the worst case, the slider bar may be particularly ineffective when most of the data is concentrated at one end of the range, because adjusting it a single pixel at that end may skip most of the data values, while adjusting it several pixels at the other end may select only a few data values.
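By way of illustration only, the worst case described above can be sketched in a few lines of Python. The function names, the 100-pixel track width, and the synthetic data set (990 values packed into the interval from 0 to 1, plus 10 values spread out to 1000) are all assumptions made for this sketch and are not taken from any particular system.

```python
def linear_slider_value(pixel, track_width, data_min, data_max):
    """Map a slider pixel position (0..track_width) linearly onto the data range."""
    fraction = pixel / track_width
    return data_min + fraction * (data_max - data_min)


def count_at_or_below(data, threshold):
    """Number of data values a threshold filter would select."""
    return sum(1 for value in data if value <= threshold)


# Hypothetical skewed data set: 990 values in [0, 1) and 10 values up to 1000.
data = [i / 990 for i in range(990)] + [100.0 * k for k in range(1, 11)]
track_width = 100  # assumed slider length in pixels

# Number of values selected at each pixel position under the linear mapping.
selected = [
    count_at_or_below(
        data, linear_slider_value(pixel, track_width, min(data), max(data))
    )
    for pixel in range(track_width + 1)
]

first_pixel_jump = selected[1] - selected[0]        # values skipped by one pixel
rest_of_track = selected[track_width] - selected[1]  # values left for 99 pixels
```

In this sketch, moving the marker a single pixel from the low end of the track passes over 989 of the 1000 values, while the remaining 99 pixels together distinguish only 10 values.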
For specific types of distributions, there may be a mapping that approximates a solution, such as a log mapping for data values that are distributed exponentially. It would be better, however, to have a “nonparametric” slider bar that can analyze the data and automatically calculate an optimum mapping that would work for any distribution (e.g., log, exponential, bimodal, uniform, normal, and so forth).
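One possible form of such a nonparametric mapping, offered here only as a sketch, ties the marker position to the rank of a value in the sorted data rather than to its magnitude, so that no distribution is assumed. The function name, the track width, and the synthetic data set below are hypothetical and chosen for illustration.

```python
def quantile_slider_value(sorted_data, pixel, track_width):
    """Map a slider pixel position to the data value holding the matching rank.

    Equal marker movements then step over roughly equal numbers of data
    values, whatever the shape of the distribution.
    """
    last_index = len(sorted_data) - 1
    index = pixel * last_index // track_width  # integer rank for this pixel
    return sorted_data[index]


# Hypothetical skewed data set: 990 values in [0, 1) and 10 values up to 1000.
data = sorted([i / 990 for i in range(990)] + [100.0 * k for k in range(1, 11)])
track_width = 100  # assumed slider length in pixels

# Number of values at or below the threshold chosen at each pixel position.
selected = [
    sum(1 for value in data if value <= quantile_slider_value(data, pixel, track_width))
    for pixel in range(track_width + 1)
]

# How many additional values each one-pixel movement selects.
steps = [later - earlier for earlier, later in zip(selected, selected[1:])]
```

With this rank-based mapping, every one-pixel movement in the sketch selects 9 or 10 additional values, regardless of where on the track the marker sits. The sketch assumes the data values are distinct; repeated values would make some steps larger.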
As a second specific example, a similar problem occurs for data visualizations that attempt to map a range of colors to a range of data values: it is difficult to use colors effectively unless the data values are distributed uniformly within their range. As with the interactive control problem, it would be better to have a data visualization component that can analyze the data and automatically calculate an optimum mapping that would work for any data distribution while still retaining the general shape of the original distribution.
Therefore, it is desirable to provide interactive applications that allow people to understand and manipulate real-world data sets from a wide variety of contexts. An improved data visualization technique with a user interface having interactive controls for altering the visualization as the user desires is needed. The technique will apply nonparametric techniques such as histogram equalization, so that no assumptions are made about the actual data distribution.
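As a rough illustration of how histogram equalization avoids assumptions about the distribution, the following sketch assigns gray levels from the empirical cumulative distribution of the data (equivalent to a histogram with one bin per distinct value) and compares the result with a plain linear assignment. The function names, the 256 gray levels, and the synthetic data set are assumptions of this sketch rather than details of any described embodiment.

```python
from bisect import bisect_right


def linear_gray_levels(data, levels=256):
    """Assign gray levels in proportion to each value's position in the range."""
    lo, hi = min(data), max(data)
    if hi == lo:
        return [0] * len(data)
    return [int((value - lo) / (hi - lo) * (levels - 1)) for value in data]


def equalized_gray_levels(data, levels=256):
    """Assign gray levels in proportion to each value's cumulative rank."""
    ordered = sorted(data)
    count = len(ordered)
    if count < 2:
        return [0] * count
    result = []
    for value in data:
        rank = bisect_right(ordered, value)  # number of values <= value
        result.append((rank - 1) * (levels - 1) // (count - 1))
    return result


# Hypothetical skewed data set: 990 values in [0, 1) and 10 values up to 1000.
data = [i / 990 for i in range(990)] + [100.0 * k for k in range(1, 11)]

linear = linear_gray_levels(data)
equalized = equalized_gray_levels(data)

darkest_linear = linear.count(0)            # values collapsed onto gray level 0
levels_used_linear = len(set(linear))       # distinct gray levels in use
levels_used_equalized = len(set(equalized))
```

In the sketch, the linear assignment collapses 990 of the 1000 values onto a single gray level, whereas the equalized assignment spreads the same values across the full set of available levels.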