Traditional scatter plots have been widely used to display the correlation or association between two variables. A scatter plot is a chart that uses Cartesian coordinates (e.g., x-axis and y-axis coordinates) to display values for the two variables. The data displayed in the scatter plot is a collection of points, each having one coordinate on the horizontal axis and one on the vertical axis. An example of a scatter plot is depicted in FIG. 1, where the horizontal axis corresponds to CPU busy (a percentage value) and the vertical axis corresponds to Queue length. In the example, CPU busy represents the percentage of time that the CPU is busy, while Queue length represents the length of a queue of jobs waiting for execution by the CPU.
Various data points are plotted in the scatter plot of FIG. 1, where each data point corresponds to a particular pair of a CPU busy value and a Queue length value. A first section 100 of the scatter plot represents the correlation between Queue length values and CPU busy values from 0 to about 70%. As indicated in the example of FIG. 1, the Queue length values for data records in the first section 100 are relatively low (50 or below). On the other hand, a second section 102 of the scatter plot depicted in FIG. 1 shows higher values of Queue length associated with higher CPU busy values. The section 102 is considered to contain “exceptional” points, which are data points that represent excessive CPU busy values (e.g., ≥98%) and large Queue lengths (e.g., ≥300).
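As an illustration only, the notion of an “exceptional” point described above can be sketched as a simple threshold test. The function name and parameter names below are hypothetical and not part of the description; the default thresholds are merely the example values given above (CPU busy of 98% or more and Queue length of 300 or more).

```python
def is_exceptional(cpu_busy, queue_length,
                   cpu_threshold=98.0, queue_threshold=300):
    """Return True when a data point meets both example thresholds.

    The thresholds are assumptions taken from the example values in the
    text (CPU busy >= 98%, Queue length >= 300); other values may apply.
    """
    return cpu_busy >= cpu_threshold and queue_length >= queue_threshold


# Hypothetical (CPU busy %, Queue length) pairs, for illustration only.
sample_points = [(50.0, 20), (99.0, 320), (99.0, 100)]
flags = [is_exceptional(cpu, queue) for cpu, queue in sample_points]
print(flags)
```

In this sketch, only a point that satisfies both conditions is treated as exceptional; a point with a high CPU busy value but a short queue is not.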
From the scatter plot of FIG. 1, a viewer may assume that there are not many data points in the section 100 of the scatter plot. Such an assumption may be incorrect: there may in fact be a large number of data points in the section 100, but their presence may be obscured by overlapping (overlay) of data points (e.g., many data points sharing the same or very similar Queue length and CPU busy values). As a result, a traditional scatter plot may show just a relatively small number of distinct data points, even though a much larger number of data points are hidden from the viewer by overlapping. Such overlapping of data points can hide the true extent of the relationship between the variables in a traditional scatter plot.
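The overlap problem described above can be made concrete with a minimal sketch that compares the total number of data points with the number of distinct plotted positions. The function name and the sample records are hypothetical and chosen only for illustration; the sketch counts exactly coinciding points and does not account for points that are merely very close to one another.

```python
from collections import Counter


def overlap_summary(points):
    """Return (total, distinct, largest_stack) for a list of (x, y) pairs.

    total         : number of data points supplied
    distinct      : number of different (x, y) positions, i.e. the number
                    of marks a viewer could actually distinguish
    largest_stack : the most data points drawn at any single position
    """
    counts = Counter(points)
    return len(points), len(counts), max(counts.values(), default=0)


# Hypothetical (CPU busy %, Queue length) records; values are illustrative.
records = [(10, 5)] * 400 + [(35, 20)] * 250 + [(99, 320), (98, 305)]
total, distinct, largest_stack = overlap_summary(records)
print(total, distinct, largest_stack)
```

With these assumed records, several hundred data points collapse onto only a few visible positions, which is the effect that may lead a viewer to underestimate the number of data points in a section of the scatter plot.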