The present invention relates to the collection and storage of information pertaining to processing of biological samples.
Devices and computer systems for forming and using arrays of materials on a substrate are known. For example, PCT application WO92/10588, incorporated herein by reference for all purposes, describes techniques for sequencing or sequence checking nucleic acids and other materials. Arrays for performing these operations may be formed according to, for example, the pioneering techniques disclosed in U.S. Pat. Nos. 5,143,854 and 5,571,639, both incorporated herein by reference for all purposes.
According to one aspect of the techniques described therein, an array of nucleic acid probes is fabricated at known locations on a chip or substrate. A fluorescently labeled nucleic acid is then brought into contact with the chip and a scanner generates an image file indicating the locations where the labeled nucleic acids bound to the chip. Based upon the identities of the probes at these locations, it becomes possible to extract information such as the monomer sequence of DNA or RNA. Such systems have been used to form, for example, arrays of DNA that may be used to study and detect mutations relevant to cystic fibrosis, the P53 gene (relevant to certain cancers), HIV, and other genetic characteristics.
Computer-aided techniques for monitoring gene expression using such arrays of probes have also been developed as disclosed in EP Pub No. 0848067 and PCT publication No. WO 97/10365, the contents of which are herein incorporated by reference. Many disease states are characterized by differences in the expression levels of various genes either through changes in the copy number of the genomic DNA or through changes in levels of transcription (e.g., through control of initiation, provision of RNA precursors, RNA processing, etc.) of particular genes. For example, losses and gains of genetic material play an important role in malignant transformation and progression. Furthermore, changes in the expression (transcription) levels of particular genes (e.g., oncogenes or tumor suppressors) serve as signposts for the presence and progression of various cancers.
These computer-aided techniques for sequencing and expression monitoring are themselves multi-stage processes including, e.g., stages of selecting sequences, overall chip layout, mask design, probe synthesis, sample preparation, application of samples to chips, scanning of samples, and analysis of scanning results. For each stage, there is associated control information that determines in some way how the processing of the stage is performed. For many stages, there is also result information generated during the stage. Processing at one stage may depend on control information or result information from a previous stage. Thus, there is a need to organize all of the relevant information for convenient access and retrieval.
Many of the contemplated applications of probe array chips involve performing all of the various stages on a very large scale. For example, consider surveying a large population of human subjects to discover oncogenes and tumor suppressor genes relevant to a particular form of cancer. Large numbers of samples must be collected and processed. Information about the sample donors and sample preparation conditions should be maintained to facilitate later analysis. The probe array chips will have associated layout information. Each chip will be processed with samples and scanned individually. Each chip will thus have its own scanning results. Finally, the scanning results will be interpreted and analyzed for many subjects in an effort to identify the oncogenes and tumor suppressors. The quantity of information to store and correlate is vast. Compounding the information management problem, equipment and other laboratory resources may be shared with other projects. A single laboratory may service many clients, each client in turn requesting completion of multiple projects. What is needed is a system and method suitable for storing and organizing large quantities of information used in conjunction with probe array chips.
The present invention provides a system and method for organizing information relating to polymer probe array chips including oligonucleotide array chips. A database model is provided which organizes information relating to sample preparation, chip layout, application of samples to chips, scanning of chips, expression analysis of chip results, etc. The model is readily translatable into database languages such as SQL. The database model scales to permit mass processing of probe array chips.
According to a first aspect of the present invention, a computer-implemented method for managing information relating to processing of polymer probe arrays includes a step of creating an electronically-stored experiment table. The experiment table lists for each of a plurality of experiments a first identifier identifying a target sample applied to a polymer probe array chip in a particular experiment, and a second identifier identifying the polymer probe array chip to which the target sample was applied in the particular experiment. The method further includes a step of creating an electronically-stored chip table. The chip table lists for each of a plurality of polymer probe array chips: the second identifier identifying a particular polymer probe array chip; and a third identifier specifying a layout of polymer probes on the polymer probe array chip.
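By way of illustration only, the experiment table and chip table of the first aspect might be sketched as follows using Python's standard sqlite3 module. The table names, column names, and sample values (experiment, chip, sample_id, chip_id, layout_id, 'chip-001', and so on) are assumptions chosen for this example and do not appear in the specification; the actual database model may differ.

```python
import sqlite3

def create_experiment_and_chip_tables(conn):
    """Create hypothetical chip and experiment tables (names are assumptions)."""
    conn.executescript("""
        CREATE TABLE chip (
            chip_id   TEXT PRIMARY KEY,  -- second identifier: a particular chip
            layout_id TEXT NOT NULL      -- third identifier: layout of probes
        );
        CREATE TABLE experiment (
            experiment_id INTEGER PRIMARY KEY,
            sample_id TEXT NOT NULL,     -- first identifier: target sample
            chip_id   TEXT NOT NULL REFERENCES chip(chip_id)  -- second identifier
        );
    """)

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces references only when enabled
create_experiment_and_chip_tables(conn)

# Illustrative rows only; the identifiers are made up for this sketch.
conn.execute("INSERT INTO chip (chip_id, layout_id) VALUES ('chip-001', 'layout-A')")
conn.execute(
    "INSERT INTO experiment (sample_id, chip_id) VALUES ('sample-17', 'chip-001')"
)

# The shared second identifier links a sample to the layout of its chip.
row = conn.execute("""
    SELECT e.sample_id, c.layout_id
    FROM experiment AS e
    JOIN chip AS c ON c.chip_id = e.chip_id
""").fetchone()
```

Because the second identifier appears in both tables, the layout used in any experiment can be recovered with a single join rather than being repeated in every experiment record.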
According to a second aspect of the present invention, a computer-implemented method for managing information relating to processing of oligonucleotide arrays includes a step of creating an electronically stored analysis table. The analysis table lists for each of a plurality of expression analysis operations a first identifier specifying a particular analysis operation and a second identifier specifying oligonucleotide array processing result information on which the particular expression analysis operation has been performed. The method further includes a step of creating an electronically stored gene expression result table. The gene expression result table lists for each of selected ones of the plurality of analysis operations, a list of genes and results of the particular expression analysis operation as applied to each of the genes.
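As a non-limiting sketch, the analysis table and gene expression result table of the second aspect could be laid out as below, again using Python's sqlite3 module. All table names, column names, and values (analysis, gene_expression_result, scan_result_id, expression_level, the gene names, and the numeric levels) are hypothetical and serve only to show the relationship between the two tables.

```python
import sqlite3

analysis_conn = sqlite3.connect(":memory:")
analysis_conn.executescript("""
    CREATE TABLE analysis (
        analysis_id    INTEGER PRIMARY KEY,  -- first identifier: the analysis operation
        scan_result_id TEXT NOT NULL         -- second identifier: result info analyzed
    );
    CREATE TABLE gene_expression_result (
        analysis_id      INTEGER NOT NULL REFERENCES analysis(analysis_id),
        gene_name        TEXT NOT NULL,
        expression_level REAL,               -- result of the analysis for this gene
        PRIMARY KEY (analysis_id, gene_name)
    );
""")

# Made-up data: one analysis operation with results for two genes.
analysis_conn.execute(
    "INSERT INTO analysis (analysis_id, scan_result_id) VALUES (1, 'scan-0042')"
)
analysis_conn.executemany(
    "INSERT INTO gene_expression_result VALUES (?, ?, ?)",
    [(1, "TP53", 0.8), (1, "BRCA1", 2.5)],
)

def genes_for_analysis(conn, analysis_id):
    """Return (gene_name, expression_level) pairs for one analysis, sorted by gene."""
    return conn.execute(
        "SELECT gene_name, expression_level FROM gene_expression_result "
        "WHERE analysis_id = ? ORDER BY gene_name",
        (analysis_id,),
    ).fetchall()

gene_results = genes_for_analysis(analysis_conn, 1)
```

Keeping per-gene results in their own table, keyed by the analysis identifier, lets one analysis operation carry an arbitrary number of gene results without changing the analysis table itself.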
According to a third aspect of the present invention, a computer-implemented method for managing information relating to processing of polymer probe arrays includes steps of: storing in an electronically-stored experiment table for each of a plurality of experiments, a first identifier identifying a target sample applied to a polymer probe array chip in a particular experiment; storing in the electronically-stored experiment table for each of the plurality of experiments a second identifier identifying the polymer probe array chip to which the target sample was applied in the particular experiment; storing in an electronically-stored chip table for each of a plurality of polymer probe array chips, the second identifier identifying a particular polymer probe array chip; and storing in the electronically-stored chip table for each of the plurality of polymer probe array chips a third identifier specifying a layout of polymer probes on the polymer probe array chip.
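The storing steps of the third aspect might be sketched, purely as an example, by a single helper that records the chip row and the experiment row together. The function name record_experiment, the schema, and the identifiers shown are assumptions introduced for this illustration; grouping the steps into one transaction is a design choice of the sketch, not a requirement stated in the specification.

```python
import sqlite3

store_conn = sqlite3.connect(":memory:")
store_conn.executescript("""
    CREATE TABLE chip (
        chip_id   TEXT PRIMARY KEY,
        layout_id TEXT NOT NULL
    );
    CREATE TABLE experiment (
        experiment_id INTEGER PRIMARY KEY,
        sample_id TEXT NOT NULL,
        chip_id   TEXT NOT NULL REFERENCES chip(chip_id)
    );
""")

def record_experiment(conn, sample_id, chip_id, layout_id):
    """Store the chip and layout identifiers, then the sample and chip identifiers."""
    with conn:  # commits on success, rolls back if either insert fails
        # Chip table: second identifier (chip) and third identifier (layout).
        conn.execute(
            "INSERT OR IGNORE INTO chip (chip_id, layout_id) VALUES (?, ?)",
            (chip_id, layout_id),
        )
        # Experiment table: first identifier (sample) and second identifier (chip).
        conn.execute(
            "INSERT INTO experiment (sample_id, chip_id) VALUES (?, ?)",
            (sample_id, chip_id),
        )

# Two hypothetical experiments on different chips that share one layout.
record_experiment(store_conn, "sample-1", "chip-001", "layout-A")
record_experiment(store_conn, "sample-2", "chip-002", "layout-A")

samples_on_layout = [
    r[0]
    for r in store_conn.execute(
        "SELECT e.sample_id FROM experiment AS e "
        "JOIN chip AS c ON c.chip_id = e.chip_id "
        "WHERE c.layout_id = ? ORDER BY e.sample_id",
        ("layout-A",),
    )
]
```

With the identifiers stored this way, all samples processed on chips of a given layout can be retrieved by layout identifier alone, which is the kind of correlation needed when many chips are processed in bulk.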
A further understanding of the nature and advantages of the inventions herein may be realized by reference to the remaining portions of the specification and the attached drawings.