Molecular biology research depends on biopolymer analysis. Conventionally, for this analysis, a biopolymer sample is first fragmented into shorter length biopolymer fragments by enzymatic or chemical means. The fragments are distinctively labeled with detection labels and then separated, often electrophoretically. The fragment pattern is then detected to obtain information about the structure and nature of the original biopolymer sample. These steps are typically performed separately with human intervention required to transfer the sample from one step to another.
A well-known example of biopolymer analysis is DNA sequencing. See F. Sanger, et al., DNA Sequencing with Chain Terminating Inhibitors, 74 Proc. Nat. Acad. Sci. USA 5463 (1977); Lloyd M. Smith, et al., Fluorescence detection in automated DNA sequence analysis, 321 Nature 674 (1986); Lloyd M. Smith, The Future of DNA Sequencing, 262 Science 530 (1993), which are incorporated herein by reference. A prevalent sequencing method comprises the following steps. A DNA sample is first amplified, that is, the DNA chains are made to replicate identically, usually by the polymerase chain reaction (PCR). From the amplified sample, nested sets of DNA fragments are produced by chain terminating polymerase reactions (Sanger reactions). Each chain fragment is labeled with one of four fluorescent dyes according to the chain terminating base (either ddATP, ddCTP, ddGTP, or ddTTP). These fragments are then separated according to their molecular size by polyacrylamide gel electrophoresis, and the unique dyes are detected by their fluorescence. The DNA base sequence can then be reconstructed directly from the detected pattern of chain fragments.
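The reconstruction step described above can be illustrated with a minimal sketch. It assumes an idealized nested set in which exactly one fragment is detected at each length; the function name, the record layout, and the sample data are hypothetical and are not taken from the cited references.

```python
from typing import List, Tuple

# Map each chain-terminating dideoxynucleotide to the base it reports.
TERMINATOR_TO_BASE = {"ddATP": "A", "ddCTP": "C", "ddGTP": "G", "ddTTP": "T"}


def reconstruct_sequence(fragments: List[Tuple[int, str]]) -> str:
    """Read a base sequence from (fragment_length, terminator) records.

    Assumption: one detected fragment per length. Real data would need
    peak calling and handling of missing or overlapping fragments.
    """
    # Shorter fragments migrate faster, so sorting by length gives read order.
    ordered = sorted(fragments, key=lambda record: record[0])
    return "".join(TERMINATOR_TO_BASE[terminator] for _, terminator in ordered)


# Hypothetical detected fragments, listed in arbitrary order.
detected = [(3, "ddGTP"), (1, "ddATP"), (4, "ddTTP"), (2, "ddCTP")]
print(reconstruct_sequence(detected))
```

For the hypothetical input shown, the sketch prints ACGT. The sequence read this way is that of the newly synthesized strand, which is complementary to the template.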
Electrophoresis is the separation of molecules by differential molecular migration in an electric field. For biopolymers, this is ordinarily performed in a polymeric gel, such as agarose or polyacrylamide, whereby separation of biopolymers with similar electric charge densities, such as DNA and RNA, ultimately is a function of molecular weight. The prevalent configuration is to have the gel disposed as a sheet between two flat, parallel, rectangular glass plates. An electric field is established along the long axis of the rectangular configuration, and molecular migration is arranged to occur simultaneously on several paths, or lanes, parallel to the electric field.
DNA sequence information is key to much modern genetics research. The Human Genome Project seeks to sequence the entire human genome of roughly three billion bases by 2006. This sequencing goal is roughly two orders of magnitude (factor of 100) beyond the total, current yearly worldwide DNA sequencing capacity. Sequencing of other biopolymers, for example RNA or proteins, is also crucial in other fields of biology. Other DNA fragment analysis techniques, such as PCR-based diagnostics, genotyping (Ziegle, J. S. et al., Application of Automated DNA Sizing Technology for Genotyping Microsatellite Loci. Genomics, 14, 1026-1031 (1992)) and expression analysis are increasing in use and importance.
Methods to identify genes which are differentially expressed in specific diseases such as cancer are of paramount importance, both for the diagnosis of the disease and for therapeutic intervention. Identification of genes specifically expressed in different diseases will lead to better classification of these diseases with regard to their biological behavior. A molecular understanding of disease progression is fundamental to an understanding of a specific disease. The identification of molecular diagnostics that correlate with variations in disease state, growth potential, malignant transformation and prognosis will have tremendous implications in clinical practice, including the diagnosis and treatment of the disease.
No current method adequately or efficiently addresses the need to identify, isolate, and clone disease-specific genes. A new biopolymer fragment analysis method has been developed based on the use of arbitrarily primed PCR (Williams, J. G., Kubelik, A. R., Livak, K. J., Rafalski, J. A., and Tingey, S. V., DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res. 18, 6531-6535 (1990); Welsh, J. and McClelland, M., Genomic fingerprinting using arbitrarily primed PCR and a matrix of pairwise combinations of primers. Nucleic Acids Res., 19, 5275-9 (1991)). When applied to mRNA, samples are first reverse transcribed into cDNA and then amplified with a combination of arbitrary and specific labeled primers (Froussard, P., A random-PCR method (rPCR) to construct whole cDNA library from low amounts of RNA. Nucleic Acids Res. 20, 2900 (1992); Welsh, J. et al., Arbitrarily primed PCR fingerprinting of RNA. Nucleic Acids Res., 20, 4965-70 (1992)). The resulting labeled DNA fragments from each mRNA source are then electrophoresed through a gel in separate lanes, producing a "banding pattern" or "fingerprint" of each source (Liang, P. and Pardee, A. B., Differential Display of Eukaryotic Messenger RNA by Means of the Polymerase Chain Reaction. Science, 257, 967-971 (1992)). Differences in gene expression are then found by manually comparing the fingerprints obtained from two mRNA sources. Following this, fragments of interest are extracted from the gel. This method is severely limited by its reliance on autoradiographic methods to allow for the isolation of the genes of interest. Refinements of PCR-based techniques have, however, led to the ability to produce more reproducible banding patterns, and to the use of an automated DNA sequencing machine to record the banding patterns produced with fluorescently labeled primers (Liang, P., Averboukh, L. and Pardee, A. B., Distribution and cloning of eukaryotic mRNAs by means of differential display: refinements and optimization. Nucleic Acids Res. 21, 3269-3275 (1993)). However, commercial automated sequencing instruments (Applied Biosystems Inc., Foster City, Calif. DNA sequencer) do not allow for the resolution of many dye labels or for the isolation of the fluorescently labeled samples after they are run. In an automated machine the sample is simply lost. Arbitrarily primed PCR methods would be much more attractive if their limitations could be addressed.
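The fingerprint comparison described above, performed manually in the cited work, can be sketched in code as follows. This is an illustration only: it assumes each lane has already been reduced to a list of band sizes, and the function name, the matching tolerance, and the sample values are hypothetical.

```python
from typing import List


def differential_bands(lane_a: List[float], lane_b: List[float],
                       tolerance: float = 1.0) -> List[float]:
    """Return the bands in lane_a that have no counterpart in lane_b.

    Two bands are treated as the same fragment when their sizes differ
    by at most `tolerance` (an assumed sizing precision, in bases).
    """
    return [a for a in lane_a
            if not any(abs(a - b) <= tolerance for b in lane_b)]


# Hypothetical band sizes (in bases) from two mRNA sources.
source_1 = [120.0, 185.0, 240.0, 310.0]
source_2 = [120.4, 185.2, 275.0, 310.1]

print(differential_bands(source_2, source_1))  # bands present only in source 2
print(differential_bands(source_1, source_2))  # bands present only in source 1
```

With these values, the band near 275 bases appears only in the second source and the band near 240 bases only in the first; such bands would be the candidates for extraction. The sketch compares band positions only and ignores band intensity.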
To address these limitations, our invention allows these gene fragments to be detected fluorescently and to be directly isolated, without human intervention, as they are identified. This is accomplished by electrophoretically separating the individual bands, and hence the differentially expressed genes, from the rest of the sample as it is running. This approach incorporates the advantages of the PCR-based approaches to differential screening, while raising the level of speed, sensitivity and resolution well beyond that achievable with radiographic techniques. To ensure high separation resolution, it is advantageous for the gel throughout a migration lane to be kept as uniform as possible and for the lanes to be sufficiently separated to be clearly distinguishable.
To achieve these required improvements in the analysis capacity for DNA and for other biopolymers, machines are needed for the rapid, concurrent analysis of large numbers of minute biopolymer samples. Further, the analysis must be done with minimal human intervention and at low cost. Since electrophoresis will remain the dominant biological separation technology for the foreseeable near future, the technical demands of more rapid electrophoresis will shape the design of such machines.
More rapid electrophoresis requires, primarily, higher voltages and stronger electric fields to exert greater forces on migrating molecules and move them at greater velocities. However, higher fields and velocities lead to increased resistive heating and consequent thermal gradients in the gel. Gel non-uniformities result, impairing separation resolution. To preserve resolution, ever smaller gel geometries must be used so that this damaging heat may be more readily conducted away. Moreover, parallel, narrow migration lanes are advantageous to increase the number of samples analyzed simultaneously. While electrophoresis has been described in geometries where the parallel glass plates are spaced from 25 to 150 μm apart, instead of the usual 400 μm, it is not possible to ensure long, parallel, narrow, and closely spaced migration lanes in such a thin sheet. Alternatively, electrophoresis has been described in arrays of capillary tubes down to 25 μm in diameter which completely define migration lanes. However, although the conventional plate arrangement is relatively easy to load with gel and samples, arrays of capillary tubes are much more difficult to load. Easy loading is advantageous to minimize analysis setup time and human intervention.
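The trade-off between field strength and gel thickness can be illustrated with a rough sketch. It rests on two textbook relations that are not stated in this document: volumetric resistive heating q = σE² in a uniform conductor, and a peak temperature rise of q·d²/(8k) at the center of a uniformly heated slab of thickness d whose two faces are held at the same temperature. The conductivity and thermal conductivity values below are assumed placeholders, not measured gel properties.

```python
def joule_heating(conductivity: float, field: float) -> float:
    """Volumetric resistive heating q = sigma * E**2, in W/m^3."""
    return conductivity * field ** 2


def center_temperature_rise(q: float, thickness: float, k: float) -> float:
    """Peak rise q * d**2 / (8 * k), in kelvin, for a uniformly heated
    slab of thickness d with both faces held at the same temperature."""
    return q * thickness ** 2 / (8.0 * k)


SIGMA = 0.1  # S/m, assumed conductivity of the buffer-filled gel
K_GEL = 0.6  # W/(m*K), assumed; close to the value for water

for field_v_per_cm in (50, 100, 200):
    q = joule_heating(SIGMA, field_v_per_cm * 100.0)  # convert V/cm to V/m
    for thickness_um in (400, 100, 25):
        rise = center_temperature_rise(q, thickness_um * 1e-6, K_GEL)
        print(f"{field_v_per_cm} V/cm, {thickness_um} um: {rise:.4f} K")
```

Under these assumptions, doubling the field quadruples the heating, while reducing the plate spacing from 400 μm to 100 μm lowers the peak temperature rise sixteenfold, which is consistent with the preference for smaller gel geometries at higher fields.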
The small geometries required by high resolution, high voltage electrophoretic analysis create additional technical demands. Where fluorescent dye fragment labeling is used, sensitive spectral detection devices are needed. These detection devices must respond quickly, since rapid migration presents fragment samples for detection with only slight time separation. Most significantly, rapid parallel analysis of many biopolymer samples requires the detection device to simultaneously detect fragments migrating in separate lanes. Conventional detectors cannot meet these demands. One design uses rotatable filters to select spectral ranges to present to a single active detector element, this assembly being scanned mechanically across all the migration lanes. However, such mechanical single detector assemblies waste most of the available fluorescence energy from the fragment samples, limit detection speed, prohibit simultaneous detection, and slow sample analysis. Use of spectrally fixed filters also limits dynamic adaptation to different detection labels.
While a spatially compact disposition of the migration lanes might permit simultaneous observation, sample loading into the migration lanes prior to an analysis run requires physical access to the migration lanes. Access is easier and more rapid for widely spaced lanes. Conventional, flat-plate techniques have only straight, parallel lanes and cannot accommodate these divergent requirements.
A high throughput analysis machine would generate voluminous detection data representing the rapidly migrating biopolymer fragment samples. Manual analysis of such data is not feasible, so automated analysis methods are required. To minimize human post-analysis checking, these methods should achieve accuracies of 99% or greater. Further, the data would contain fragment detection events closely spaced, even overlapping, in time. Moreover, small electrophoretic geometries and small fragment sizes would generate only weak signals with increased noise. Prior electrophoretic devices, on the other hand, generated only clearly separated detection events with good signal intensities.
Once fragment events are discriminated, the entire data for a run must be assembled to determine the nature of the original biopolymer sample. For DNA sequencing, this is conventional: the bases and their order in the DNA sample are the terminating bases of the fragments in the order of increasing molecular weight. When sequencing on a genomic scale, the bases and their order must be assembled into an ordered listing of the bases of the genome of the organism being studied.
All the foregoing technical requirements have prevented creation of an integrated machine for rapid, concurrent generation and analysis of large numbers of biopolymer fragment samples. The need for such a machine is widely felt in such areas as biological research (for example, the Human Genome Project), the biotechnology industry, and clinical diagnosis.