The present invention relates to the field of computer-assisted analysis of biological information. In particular, the present invention relates to a method and system for management of a database containing biological response signal data and for presentation of useful analytical displays of information therefrom.
The analysis of complex systems such as biological organisms is aided by the use of relational database systems for storing and retrieving large amounts of biological data. The combination of high-speed wide area networks and the Internet with the client/server model of relational database management systems is particularly well suited to allowing researchers to access and meaningfully analyze large amounts of biological data, given the appropriate hardware and software computing tools.
Computerized analysis tools are particularly useful in experimental environments involving biological response signals. By way of nonlimiting example, biological response signal data can be obtained and/or gathered using biological response signal matrices, that is, physical matrices of biological material that transmit machine-readable signals corresponding to biological content or activity at each site in the matrix. In these systems, responses to biological or environmental stimuli may be measured and analyzed in a large-scale fashion through computer-based scanning of the machine-readable signals, e.g. photons or electrical signals, into numerical matrices, and through the storage of the numerical data into relational databases.
As a further nonlimiting example, biological response signal data can be obtained and/or gathered using serial analysis of gene expression (SAGE) or other technologies for measuring gene/protein expression levels that may not use a matrix or microarray but otherwise produce measurable signals. Generally speaking, biological response signals may be measured after a perturbation of a biological sample including, for example, the exposure of a biological sample to a drug candidate, the introduction of an exogenous gene into a biological sample, the deletion of a gene from the biological sample, or changes in the culture conditions of the biological sample.
A useful outcome of the scientific experimentation being performed involves the understanding of the relationships between genes and perturbations, an understanding that promotes other useful outcomes such as the invention of new drugs or other therapies. Often, relationships between perturbation and gene expression levels shed light on known or unknown biological pathways. There is an ongoing need in the art to generate better and more useful ways for computers to assist in analyzing the large volume of biological response data that can exist for even the simplest biological organisms.
A system, method, and computer program product are provided for improved computer-aided analysis of biological data derived from machine readable outputs of experiments performed on a plurality of biological samples. Responsive to search and execution commands from the user, a plurality of biological viewer windows are spawned on a user computer display. The user may then select a source dataset from one of the biological viewers and execute a project selection command, wherein the source dataset is then projected onto the other biological viewers. The projections are characterized by a highlighted display of biological data points in the destination biological viewers corresponding to items in the source dataset. The selected data is highlighted in the destination biological viewers using contrast or color differentiation from other destination window data.
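By way of nonlimiting illustration, the projection of a source dataset onto destination viewers may be sketched as follows. The `Viewer` class, the `project_selection` function, and the gene identifiers are hypothetical names introduced only for this sketch; highlighting is modeled as a set of identifiers, and actual rendering by contrast or color is omitted.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class Viewer:
    """Hypothetical stand-in for one biological viewer window."""
    name: str
    gene_ids: List[str]                      # data points shown in this viewer
    highlighted: Set[str] = field(default_factory=set)

    def project(self, source_selection: Set[str]) -> None:
        # Highlight only the points that also occur in the source selection.
        self.highlighted = source_selection & set(self.gene_ids)


def project_selection(source_selection: Iterable[str],
                      destinations: List[Viewer]) -> None:
    """Project the selected source dataset onto every destination viewer."""
    selection = set(source_selection)
    for viewer in destinations:
        viewer.project(selection)


# Made-up identifiers, for illustration only.
scatter = Viewer("scatter plot", ["gene_a", "gene_b", "gene_c"])
table = Viewer("expression table", ["gene_b", "gene_d"])

project_selection({"gene_b", "gene_c"}, [scatter, table])
print(sorted(scatter.highlighted))
print(sorted(table.highlighted))
```

In this sketch each destination viewer keeps its own highlighted set, so a selected item that is absent from a given viewer is simply not highlighted there.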
In another preferred embodiment, the user may spawn a hierarchical cluster tree biological viewer that displays genes or experiments grouped based on similarity of behavior, wherein the hierarchical cluster tree is displayed in a hyperbolic display fashion. In one form, the hierarchical cluster tree may be, for example, a gene coregulation tree. When displayed in a hyperbolic display fashion, convenient viewing of the hierarchical cluster tree is enabled, whereby the user may move around the tree and zoom in and out of various areas of the tree without losing perspective of their current location relative to the “root” of the tree.
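As a nonlimiting illustration of grouping genes by similarity of behavior, the following sketch builds a small cluster tree. It assumes average-linkage agglomerative clustering with a distance of one minus the Pearson correlation; the text above does not specify a clustering method, so both choices, the function names, and the expression profiles are assumptions made for this example. The hyperbolic display itself is not shown.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_tree(profiles):
    """Average-linkage agglomerative clustering; returns a nested-tuple tree."""
    # Each cluster is (subtree, list of member names).
    clusters = [(name, [name]) for name in profiles]

    def distance(c1, c2):
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        total = sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Merge the two closest clusters into one new internal node.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]))
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Made-up expression profiles, for illustration only.
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.1, 6.0, 8.2],
    "gene_c": [4.0, 3.0, 2.0, 1.0],
}
tree = cluster_tree(profiles)
print(tree)
```

Here `gene_a` and `gene_b` rise together and are merged first, while the oppositely behaving `gene_c` joins only at the root of the tree.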
In another preferred embodiment, biological menu and submenu items that are displayed to the user during searches, projections, and the like are not stored in the user computer, but rather are stored in a central biological response database. Biological menus and submenus are generated at startup based on queries to the central biological response database, allowing for increased flexibility, changeability, and customization of the biological menus and submenus.
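By way of nonlimiting illustration, generating menus at startup from database queries may be sketched as follows. An in-memory SQLite database stands in for the central biological response database, and the `menu_item` table, its columns, the `load_menus` function, and the menu labels are all hypothetical.

```python
import sqlite3

# In-memory database standing in for the central biological response database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE menu_item (menu TEXT, label TEXT, position INTEGER)"
)
conn.executemany(
    "INSERT INTO menu_item VALUES (?, ?, ?)",
    [
        ("Search", "By gene", 1),
        ("Search", "By experiment", 2),
        ("Project", "Onto all viewers", 1),
    ],
)
conn.commit()


def load_menus(connection):
    """Build the menu structure from a query instead of client-side constants."""
    menus = {}
    rows = connection.execute(
        "SELECT menu, label FROM menu_item ORDER BY menu, position"
    )
    for menu, label in rows:
        menus.setdefault(menu, []).append(label)
    return menus


menus = load_menus(conn)
print(menus)
```

Because the client holds no menu definitions of its own in this sketch, adding a row to the table changes what every client shows at its next startup.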
In another preferred embodiment, correlation data between expression array experiments is precomputed when the experiments are added to the central biological response database. This eliminates the need for real time computation of correlation coefficients or other similarity scores by the user computer, resulting in considerable time savings when the user requests correlation data among selected sets of experiments.
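As a nonlimiting illustration, precomputing correlation data when an experiment is added may be sketched as follows. The sketch assumes Pearson correlation as the similarity score and an in-memory SQLite database in place of the central biological response database; the table layout, function names, and experiment values are hypothetical.

```python
import math
import sqlite3


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE experiment (name TEXT PRIMARY KEY, profile TEXT)")
conn.execute("CREATE TABLE correlation (exp_a TEXT, exp_b TEXT, r REAL)")


def add_experiment(connection, name, values):
    """Store an experiment and precompute its correlation with all others."""
    existing = connection.execute(
        "SELECT name, profile FROM experiment"
    ).fetchall()
    connection.execute(
        "INSERT INTO experiment VALUES (?, ?)",
        (name, ",".join(str(v) for v in values)),
    )
    for other_name, other_profile in existing:
        other_values = [float(v) for v in other_profile.split(",")]
        connection.execute(
            "INSERT INTO correlation VALUES (?, ?, ?)",
            (name, other_name, pearson(values, other_values)),
        )
    connection.commit()


def lookup_correlation(connection, a, b):
    """Read a precomputed score; no correlation is computed at query time."""
    row = connection.execute(
        "SELECT r FROM correlation "
        "WHERE (exp_a = ? AND exp_b = ?) OR (exp_a = ? AND exp_b = ?)",
        (a, b, b, a),
    ).fetchone()
    return row[0] if row else None


# Made-up experiment profiles, for illustration only.
add_experiment(conn, "exp1", [1.0, 2.0, 3.0])
add_experiment(conn, "exp2", [2.0, 4.0, 6.0])
add_experiment(conn, "exp3", [3.0, 2.0, 1.0])
print(lookup_correlation(conn, "exp1", "exp2"))
```

The cost of computing each pairwise score is paid once at insertion in this sketch, so a later request from the user computer reduces to a table lookup.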