1. Field of the Invention
The present invention relates to a bioinformatics approach and, more particularly, to a software method for the conversion, storage and querying of the data of cellular biological assays on the basis of experimental design that allows storage and retrieval of data concerning changes in in-vitro cellular functions associated with stimuli such as cytokines, hormones, chemokines, transfected genes, infectious agents, and drugs.
2. Description of the Background
To date, paper publications are the principal form in which scientific information is exchanged. However, there is a significant opportunity in the field of bioinformatics in determining how to store and retrieve information electronically so that future discoveries can be made. Keyword search engines like PubMed® allow users to find articles based on Boolean combinations of MeSH headings, author, or keyword string searches. More recently, interfaces such as Entrez® allow users to cross-reference manuscripts with GenBank sequence entries. Still, current functional bioinformatics approaches are handicapped by the inability to store functional data at all, or by a scattering of data across heterogeneous databases that are difficult to link and query. Specifically, the above-described and other known approaches do not support queries linking a test cell population's expression of proteins and other traits to the experimental conditions in which they were measured. For example, keyword searches do not enable a user to clearly specify the context in which a cytokine is used. Thus, the query "IFN-gamma up regulated" may retrieve the genes which IFN-gamma up regulates, or it may retrieve conditions in which IFN-gamma is itself increased. Consequently, as biology moves into the post-genome era there is a need to develop better systems for the storage, retrieval and interpretation of biological information.
U.S. Pat. No. 5,804,436 to Okun et al. shows an Apparatus and Method for Real-time Measurement of Cellular Response in which a homogeneous suspension of living cells is combined with a concentration of a test compound. The cellular response of the living cells is measured in real time as the cells in the test mixture are flowing through a detection zone. The apparatus may be used in automated screening of libraries of compounds, and is capable of real-time variation of concentrations of test and standard compounds and generation of dose/response profiles. This implies some data entry, storage and retrieval. However, the mechanics for the storage, retrieval and interpretation of biological information are not taught or suggested, and it is not clear whether or how a test cell population's expression of mRNAs, proteins and other traits can be linked to the experimental conditions in which they were measured.
It would be greatly advantageous to provide a method for data entry, storage and retrieval that supports queries linking a test cell population's expression of genes, proteins and other traits to the experimental conditions in which they were measured, as well as to provide a framework for other, more complex information operations.
Accordingly, it is an object of the present invention to provide a software method for the conversion, storage and querying of cellular biological assay data on the basis of experimental design, inclusive of queries concerning changes in in-vitro cellular functions associated with stimuli such as cytokines, hormones, chemokines and drugs, transfected genes, infectious agents, or physical perturbations such as temperature or ionizing radiation.
It is another object to support a broad range of data including protein or mRNA expressions, as well as functional cellular data such as apoptosis or adherence.
It is still another object to provide the ability to store heterogeneous data using a single data model in order to minimize difficulties associated with searching multiple databases.
It is still another object to provide for the storage of heterogeneous cell lines.
It is a further object to allow the measurements, cells and conditions to be coded by user-determined ontologies.
According to the present invention, these and other objects are accomplished by providing a system for the conversion, storage and querying of the data of cellular biological assays on the basis of experimental design. The method employed by the system includes the maintenance of a library of data entry forms inclusive of materials and methods forms for prompting a user to enter data characterizing all agents (e.g., culture conditions) applied to a test cell population (inclusive of both an experimental group and a control group), experimental design forms for prompting a user to enter data characterizing the experimental design (inclusive of all test agents, control agents and additional agents), and experimental results forms for prompting the user to enter data characterizing an experimental effect of a specific agent on an experimental group as compared to the control group of a test cell population.
Preferably, the materials and methods data entry forms include an agent library form, a test cell library form, a gene/protein library form, a references library form, and a measurement methods library form. The information gained through the above forms is filtered into and combined with further information collected via the experimental design library which includes a general experimental design form, and an additional agents form. Finally, the experimental results library includes an experimental data form that allows a comprehensive description of the experimental results.
The collected data is stored in respective data storage records inclusive of a first data storage record incorporating characteristics of the materials and methods as entered via the corresponding library of forms (Agent, Test Cells, Target Genes, References (if specified), and Measurement Method). The data also includes a second data storage record incorporating characteristics of the experimental design as entered via the corresponding library of forms (inclusive of all test agents, control agents and additional agents). In addition, a third data storage record is accumulated and this defines the experimental results that quantify the effect of a specific agent on an experimental group as compared to the control group of a test cell population.
The three data storage records are inter-related by one or more shared fields.
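By way of illustration only, the three inter-related data storage records might be sketched as three relational tables joined on shared key fields. The sketch below uses Python's built-in sqlite3 module rather than the relational database software named in this description, and every table name, column name and sample value in it (for example `materials_methods`, `design_id`, `agent_A`) is a hypothetical placeholder, not a field defined by the invention.

```python
import sqlite3

# Hypothetical schema: one table per data storage record.
# Shared fields (item_id, design_id) inter-relate the three records.
SCHEMA = """
CREATE TABLE materials_methods (
    item_id INTEGER PRIMARY KEY,
    kind    TEXT NOT NULL,
    name    TEXT NOT NULL
);
CREATE TABLE experimental_design (
    design_id        INTEGER PRIMARY KEY,
    test_cells_id    INTEGER NOT NULL REFERENCES materials_methods(item_id),
    test_agent_id    INTEGER NOT NULL REFERENCES materials_methods(item_id),
    control_agent_id INTEGER REFERENCES materials_methods(item_id)
);
CREATE TABLE experimental_results (
    result_id      INTEGER PRIMARY KEY,
    design_id      INTEGER NOT NULL REFERENCES experimental_design(design_id),
    target_gene_id INTEGER NOT NULL REFERENCES materials_methods(item_id),
    method_id      INTEGER NOT NULL REFERENCES materials_methods(item_id),
    effect         TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database holding one made-up experiment."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    # First record: materials and methods (placeholder entries only).
    conn.executemany(
        "INSERT INTO materials_methods VALUES (?, ?, ?)",
        [
            (1, "agent", "agent_A"),
            (2, "agent", "vehicle_control"),
            (3, "test_cells", "cell_line_X"),
            (4, "target_gene", "gene_G"),
            (5, "measurement_method", "method_M"),
        ],
    )
    # Second record: experimental design (test cells, test agent, control agent).
    conn.execute("INSERT INTO experimental_design VALUES (1, 3, 1, 2)")
    # Third record: effect on the experimental group relative to the control group.
    conn.execute("INSERT INTO experimental_results VALUES (1, 1, 4, 5, 'increased')")
    return conn


def linked_results(conn):
    """Follow the shared fields from each result back to its design and materials."""
    return conn.execute(
        """
        SELECT agent.name, cells.name, gene.name, r.effect
        FROM experimental_results AS r
        JOIN experimental_design AS d ON d.design_id = r.design_id
        JOIN materials_methods AS agent ON agent.item_id = d.test_agent_id
        JOIN materials_methods AS cells ON cells.item_id = d.test_cells_id
        JOIN materials_methods AS gene ON gene.item_id = r.target_gene_id
        ORDER BY r.result_id
        """
    ).fetchall()
```

Calling `linked_results(build_demo_db())` returns a single row pairing the placeholder agent, test cells and target gene with the recorded effect, showing how a result can be traced back to the conditions under which it was measured.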
In addition to the data entry forms, a library of query forms is maintained for allowing a user to submit queries about the experimental effect of any agent on the test cell population. Separate query forms may be maintained for allowing a user to enter queries related to a cellular biological assay, for allowing a user to enter queries related to genes that said assays are related to, and for allowing a user to enter queries related to combinations of agents used in said assay.
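As a rough sketch of what such query forms could resolve to, the example below contrasts a query by agent with a query by measured target, so that the role of a substance in the experiment is stated explicitly instead of being inferred from keywords. It assumes a simplified, flattened results table; the table, the function names `targets_affected_by` and `agents_affecting`, and all sample rows are hypothetical placeholders and do not represent experimental data.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory table of made-up results: agent, test cells, target, effect."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results (agent TEXT, test_cells TEXT, target TEXT, effect TEXT)"
    )
    conn.executemany(
        "INSERT INTO results VALUES (?, ?, ?, ?)",
        [
            ("cytokine_A", "cell_line_X", "gene_G", "increased"),
            ("drug_B", "cell_line_X", "cytokine_A", "increased"),
            ("drug_B", "cell_line_Y", "gene_G", "decreased"),
        ],
    )
    return conn


def targets_affected_by(conn, agent, effect):
    """Query by agent: which measured targets show this effect when the agent is applied?"""
    rows = conn.execute(
        "SELECT target FROM results WHERE agent = ? AND effect = ? ORDER BY target",
        (agent, effect),
    )
    return [row[0] for row in rows]


def agents_affecting(conn, target, effect):
    """Query by target: which agents produce this effect on the measured target?"""
    rows = conn.execute(
        "SELECT agent FROM results WHERE target = ? AND effect = ? ORDER BY agent",
        (target, effect),
    )
    return [row[0] for row in rows]
```

Here the placeholder `cytokine_A` appears once as an applied agent and once as a measured target. The two functions return different answers for it, which is the distinction a keyword search cannot express.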
The above-described software method is combined with suitable hardware for implementation of the entire system. The hardware may include a conventional computer workstation with standard internal components such as a microprocessor with peripheral chipset mounted on an appropriate motherboard, storage, a monitor, a modem, a standard input device such as a mouse, and an operating system such as Microsoft Windows. All forms and data libraries may be authored using conventional relational database software such as Microsoft Access®.