The advent of new experimental technologies that support molecular biology research has resulted in an explosion of data and a rapidly increasing diversity of biological measurement data types. Examples of such biological measurement types include gene expression from DNA microarray or Taqman experiments, protein identification from mass spectrometry or gel electrophoresis, cell localization information from flow cytometry, phenotype information from clinical data or knockout experiments, genotype information from association studies and DNA microarray experiments, Comparative Genomic Hybridization (CGH) data, array-based CGH (aCGH) data, etc. This data is rapidly changing. New technologies frequently generate new types of data.
As array-based CGH technology develops, studies using this technology will include ever-increasing numbers of arrays from which data is generated. There is a need to visualize such data in the context of a whole study to facilitate visual exploratory analysis. Other fields may have the same or similar needs, which may be met by a solution for visualizing large data sets that may individually be represented in line graph form.
Current techniques for visualizing data typically do not scale well to large numbers of arrays or, when a technique is scalable to display data from a large number of arrays, do not show sufficient detail regarding an individual array. For example, standard heat map-type visualizations may be employed to represent aCGH data; see Pollack et al., “Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors”, PNAS, Oct. 1, 2002, vol. 99, no. 20, 12963-12968, which is incorporated herein, in its entirety, by reference thereto. While such representations are generally scalable to large numbers of arrays/experiments, it is difficult to explore the details underlying the heat maps. Other software products that share the same limitations include “dchip” (see http://www.dchip.org), BioConductor (see http://www.bioconductor.org), and GeneSpring (see http://www.silicongenetics.com).
Visualization software and systems that are adapted specifically to CGH visualizations tend to show data superimposed on chromosome ideograms; see, for example, currently pending application Ser. No. 10/817,244 filed Apr. 3, 2004 and titled “Visualizing Expression Data on Chromosomal Graphic Schemes” and co-pending application Ser. No. 10/964,524 filed Oct. 12, 2004 and titled “Systems and Methods for Statistically Analyzing Apparent CGH Data Anomalies and Plotting Same”, both of which are hereby incorporated herein, in their entireties, by reference thereto. While this is a natural context for CGH studies, such representations do not scale well for visualizing hundreds of experiments simultaneously on a display.
Visualization software and systems for displaying sparse data contained within very large datasets are described in co-pending application Ser. No. 10/918,897 filed Aug. 13, 2004 and titled “System and Methods for Navigating and Visualizing Multi-Dimensional Data”, which is incorporated herein, in its entirety, by reference thereto.
There is a continuing need for methods, tools and systems that facilitate the visualization of larger collections of data, such as data that may be represented as groups of line graphs, in a compact graphical form, and for manipulating such visualizations for viewing and analysis by a user.