1. Field of the Invention
This invention relates generally to the visual display of data and, more particularly, to filtering for data visualization techniques.
2. Description of the Related Art
In an increasingly competitive world, enterprises constantly need business intelligence that empowers their decision makers to act on information and thereby gives the organization's products and services an added competitive edge. Businesses succeed or fail based on their ability to, for example, accurately quantify how many leads become orders, identify their most profitable customers, forecast manufacturing capabilities, manage reliable supply chains, and create sales projections.
However, obtaining information on which decision makers can act presents several practical challenges. One such challenge is the massive amount of data available to the enterprise in today's Information Age. Conversion of data to information which can be readily understood is a significant obstacle. Additionally, enterprises today have data spread over multiple data sources ranging from legacy systems to relational databases and text files. Even if these problems are surmounted, publishing information in a secure and reliable manner remains another concern for enterprises.
Reporting systems with data visualization functionality can give users the capability to convert diverse data into information that can be easily visualized and interpreted, so that users can exploit the information and learn more about the business. Visualization components can emphasize high-level patterns and trends in large and complex datasets. One way of presenting vast amounts of data as comprehensible information is to represent the data in a treemap format. A treemap is a visual representation of a dataset, which is typically hierarchical in nature.
A treemap generally includes a collection of two-dimensional cells of rectangular shape, each of which represents one or more data entries of the dataset. The cells of a treemap have characteristics, such as area, color, and texture, that represent the data. The cell characteristics may also be known as graphical attributes. If the dataset is in the form of a table in a database, the rows of the table may be represented by treemap cells and the columns of the table may represent various data dimensions. A data dimension is a set of related data values such as the values in a column of a database table or correlated fields in an XML file that are marked with a common tag. The data dimensions may be mapped to different cell characteristics of the treemap visualization. Thus, a viewer of the treemap can gain insight into data by examining a grouping of cells and cell characteristics.
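The mapping of table rows to cells and of data dimensions to cell characteristics can be illustrated with a minimal sketch. The table contents, the column names (page, visits, load_time), the Cell class, and the rows_to_cells function are hypothetical and are introduced only for illustration; they are not taken from any particular treemap component.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One treemap cell; each field holds the value behind one cell characteristic."""
    label: str
    area_value: float   # data value that drives the cell's area
    color_value: float  # data value that drives the cell's color


# Hypothetical table: each row becomes one cell, each column is a data dimension.
rows = [
    {"page": "/home", "visits": 1200, "load_time": 0.8},
    {"page": "/search", "visits": 300, "load_time": 2.5},
    {"page": "/checkout", "visits": 150, "load_time": 1.9},
]


def rows_to_cells(rows, label_dim, area_dim, color_dim):
    """Map the chosen data dimensions onto cell characteristics, one cell per row."""
    return [
        Cell(str(row[label_dim]), float(row[area_dim]), float(row[color_dim]))
        for row in rows
    ]


cells = rows_to_cells(rows, label_dim="page", area_dim="visits", color_dim="load_time")
```

In this sketch, choosing a different column for area_dim or color_dim changes which data dimension a viewer reads from the size or color of each cell.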
One barrier to the wide use of data visualizations is the limited set of available features that make the visualized information more meaningful to users. For example, current treemap solutions do not provide ways to vary the aggregation function used for generating the data visualization. End users may have certain expectations about how the areas of the lowest-level groups are calculated, and these expectations may have an effect on the utility of the treemap. For example, when the data values mapped to the innermost rectangles are average data values, such as average page load time, end users may expect the relative areas of the lowest-level groups to also be averages. Current versions of treemap components do not address this issue, but instead have a fixed method for determining the areas of the lowest-level groups, which is typically implicit in the graph's definition and construction. Typical fixed methods are summation (setting the relative areas of the groups to the summation of the values within each group) and count (setting the relative areas of the groups to the total number of values within each group). It would be useful to vary the aggregate function that is used to represent groups at different hierarchical levels of a hierarchical data visualization.
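The difference between the summation, count, and average methods can be seen in a small sketch. The group names, the page load times, and the relative_areas function are hypothetical and serve only to illustrate how the choice of aggregate function changes the relative areas of the groups.

```python
from statistics import mean

# Hypothetical page load times (in seconds), grouped by site section.
groups = {
    "home": [1.0, 3.0],
    "search": [2.0, 2.0, 2.0, 2.0],
}

# Candidate aggregate functions for sizing a group.
AGGREGATES = {"sum": sum, "count": len, "average": mean}


def relative_areas(groups, aggregate):
    """Return each group's share of the total area under the chosen aggregate."""
    fn = AGGREGATES[aggregate]
    totals = {name: fn(values) for name, values in groups.items()}
    grand_total = sum(totals.values())
    return {name: total / grand_total for name, total in totals.items()}


by_sum = relative_areas(groups, "sum")
by_count = relative_areas(groups, "count")
by_average = relative_areas(groups, "average")
```

With this data, summation and count both give the "search" group twice the area of the "home" group, while the average gives the two groups equal areas, which is what a user viewing average page load times might expect.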
Another barrier to the use of data visualizations is that typical solutions provide default visible depth levels which cannot be modified by users. In order to change the currently viewed hierarchy level, other visualization techniques provide a drilling option, which shows a lower depth level for a selected cell. A sliding window which indicates the number of depth levels that are currently visible may be shown when drilling down. However, the only depth levels that are shown are those that are in the current representation. Thus, users can easily get lost because nothing indicates how the current view corresponds to the entire hierarchical dataset.
Moreover, visualization techniques tend to emphasize a small number of primary or first-order effects, making it difficult to appreciate secondary or second-order effects. For example, a plot of a dataset with values that are distributed non-uniformly will invariably emphasize the most unusual data values, the outliers. Almost any plot of the dataset {1,2,3,4,5,1000000} will reveal that one value is unusual, but it may make it difficult to appreciate the linear relationship of the similar values. Filters are used to isolate certain ranges of the data values to be displayed in the data visualization. Generally, prior art methods filter based on user-selected ranges. However, a user who has quickly isolated the cells on the treemap that illustrate the first-order effects cannot easily effectuate filtering of those cells using such ranges. Moreover, filtering based on ranges may have the added disadvantage of simultaneously hiding multiple data values at different depth levels, causing dramatic changes to the appearance of the data visualization. In addition, it may be difficult to model the data values that contribute to the first-order effect with a filter that is set up in advance.
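The masking effect of an outlier, and the range-based filtering used by prior art methods, can be illustrated with a short sketch built on the example dataset above. The function names range_filter and normalized, and the choice of scaling each value against the largest displayed value, are assumptions made for illustration only.

```python
values = [1, 2, 3, 4, 5, 1000000]


def range_filter(data, low, high):
    """Keep only the values inside a user-selected range (inclusive)."""
    return [v for v in data if low <= v <= high]


def normalized(data):
    """Scale values to the largest one, as a plot axis or cell area would."""
    largest = max(data)
    return [v / largest for v in data]


# Unfiltered: the outlier dominates and the other five values collapse near zero.
unfiltered = normalized(values)

# Range-filtered: the outlier is hidden and the linear relationship reappears.
filtered = normalized(range_filter(values, 1, 5))
```

In this sketch the five similar values each occupy less than one hundred-thousandth of the scale until the outlier is filtered out. The range (1 to 5) had to be chosen by the user in advance, which is the limitation described above.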
Further, current solutions are incapable of linking selected portions of the graphical visualization to related information without serious drawbacks. Current data visualization techniques include actions that drill in to expose details of a selected cell. These drill-in techniques have the disadvantage that they must be pre-programmed into the component's code. Moreover, the drill-in action is typically limited to actions that can be accomplished by the component itself. Essentially, the drill-in function is restricted to initiating actions that have been explicitly anticipated by the authors of the visualization component.