1. Field of the Invention
This invention relates to the field of visualization of data and relationships between the data, especially in the context of SAR (Structure Activity Relationship) tables.
2. Description of the Related Art
A SAR (Structure Activity Relationship) table is a well-known, established concept within the cheminformatics community. A SAR table according to the prior art displays the relationship between chemical structure and activity for a set of chemical compounds in the form of a table of rows and columns. One column contains the chemical structures, while the other columns show other compound properties, or descriptors. The descriptors, typically various biological activity values, are usually numbers, but text information can also be of interest.
A unique compound identifier is also usually included in the SAR table. FIG. 1 illustrates a simplified view of an MDL® ISIS for Microsoft® Excel® spreadsheet, in which, by way of example, benzene, bromobenzene, chlorobenzene, fluorobenzene, benzoic acid, and ethyl benzoate are shown as having the identifiers 2-7, respectively, in column A.
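By way of illustration only, the row-and-column layout described above can be sketched as a simple data structure. The sketch below is written in Python and is not taken from any of the programs named here; the class name SarRow, its field names, and the use of SMILES strings to encode the structure column are all assumptions made for this example. The descriptor columns are left empty because no activity values are given for these compounds.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class SarRow:
    """One row of a hypothetical SAR table (names are illustrative)."""
    compound_id: int   # unique compound identifier (column A in FIG. 1)
    name: str          # compound name
    smiles: str        # assumption: structure column encoded as a SMILES string
    # Descriptor columns: usually numbers, sometimes text. Left empty here.
    descriptors: Dict[str, Union[float, str]] = field(default_factory=dict)


# The six example compounds, in the order listed above.
NAMES_AND_SMILES = [
    ("benzene", "c1ccccc1"),
    ("bromobenzene", "Brc1ccccc1"),
    ("chlorobenzene", "Clc1ccccc1"),
    ("fluorobenzene", "Fc1ccccc1"),
    ("benzoic acid", "OC(=O)c1ccccc1"),
    ("ethyl benzoate", "CCOC(=O)c1ccccc1"),
]

# Identifiers 2-7 are assigned in order, as in the example above.
sar_table = [
    SarRow(compound_id=i, name=name, smiles=smiles)
    for i, (name, smiles) in enumerate(NAMES_AND_SMILES, start=2)
]

for row in sar_table:
    print(row.compound_id, row.name, row.smiles)
```

A text encoding such as SMILES is only one possible way to store the structure column; a table of the kind shown in FIG. 1 presents each structure as a drawing.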
Several known commercial computer programs feature more or less sophisticated SAR table functionality. Examples of such functionality are included in the ISIS for Microsoft Excel and Accelrys® DIVA® programs.
Dynamic Filtering
The concept of dynamic filtering of data sets is not specific to the cheminformatics area; rather, it is a general-purpose technique that is applicable to many different areas of research and decision-making processes. Dynamic filtering using a set of graphical query devices was first introduced in the products of Spotfire AB of Göteborg, Sweden (also, Spotfire, Inc., of Somerville, Mass.) and is disclosed in U.S. Pat. No. 6,014,661 (Ahlberg, Jan. 11, 2000), which is incorporated here by reference.
In the Spotfire® DecisionSite® software product, which incorporates the technology disclosed in U.S. Pat. No. 6,014,661, query devices tied to columns in the data set and different visualizations of the data allow users to dynamically filter their data sets based on any available property, and hence interactively visualize the data. As the user adjusts graphical query devices such as rangesliders and alphasliders, DecisionSite changes the visualization of the data accordingly. DecisionSite also includes several other automatic features, such as initial selection of suitable query devices and determination of ranges, that aid the user not only to visualize the data but also to mine it. When properly used, this technique constitutes a powerful tool that forms the basis for sophisticated data exploration and decision-making applications.
FIG. 2 illustrates one example of how different query devices (a set of check boxes 201 and rangesliders 202, 203, for example) in Spotfire DecisionSite can be used to dynamically filter data points of specific interest to someone working with microarray data. In the illustrated case, only check boxes YC and YD are checked, indicating that only genes on yeast chromosomes C and D whose activity (here, protein production level), as measured by standard deviation, exceeds a certain threshold value (0.2) are to be included in the visualization. As the user drags the rangeslider 203 (shown set at 0.2) for the standard deviation column (StdDev) further to the right, only genes with increasingly higher activity will remain visualized.
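The filtering logic of the FIG. 2 example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DecisionSite implementation: the names GeneRow and dynamic_filter are hypothetical, and the four data rows are invented solely to exercise the filter. The set of checked chromosomes stands in for check boxes 201, and the minimum standard deviation stands in for the setting of rangeslider 203.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class GeneRow:
    """One data point in a hypothetical microarray data set."""
    gene: str
    chromosome: str   # e.g. "YC" for yeast chromosome C
    std_dev: float    # value of the StdDev column


def dynamic_filter(rows: Iterable[GeneRow],
                   checked_chromosomes: Set[str],
                   min_std_dev: float) -> List[GeneRow]:
    """Keep rows whose chromosome is checked and whose StdDev exceeds the threshold."""
    return [
        r for r in rows
        if r.chromosome in checked_chromosomes and r.std_dev > min_std_dev
    ]


# Invented rows, for illustration only (not real measurements).
rows = [
    GeneRow("gene_1", "YA", 0.35),
    GeneRow("gene_2", "YC", 0.10),
    GeneRow("gene_3", "YC", 0.25),
    GeneRow("gene_4", "YD", 0.40),
]

# Check boxes YC and YD checked, rangeslider set at 0.2.
visible = dynamic_filter(rows, {"YC", "YD"}, 0.2)
print([r.gene for r in visible])

# Dragging the rangeslider further to the right leaves fewer rows visible.
tighter = dynamic_filter(rows, {"YC", "YD"}, 0.3)
print([r.gene for r in tighter])
```

In an interactive tool the filter would be re-evaluated and the visualization redrawn each time a query device is adjusted; the two calls above simply mimic two successive slider positions.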
Spotfire DecisionSite also includes the ability to display a data set as a table of rows and columns. Such table visualizations can be dynamically filtered just like all other Spotfire visualization types. Table visualizations can include graphics, which allows the basic principle of dynamic filtering to be extended to data types with much greater complexity than numbers and text strings.
One problem with prior art visualization tools, however, is that even those with graphics support cannot dynamically filter and visualize a SAR table, or other visualization in which data such as chemical compounds is commonly represented and best interpreted by some graphical structure. It may be difficult or impossible for a user to readily see that different compounds all include a benzene ring, for example, based on displayed sets of formulae and numbers alone. What is needed is a dynamic visualization technique that overcomes this weakness, especially in the context of a SAR table.