The development of reliable methods for sequence analysis of DNA (deoxyribonucleic acid) and RNA (ribonucleic acid) has been one of the keys to the success of recombinant DNA and genetic engineering. When used with the other techniques of modern molecular biology, nucleic acid sequencing allows dissection and analysis of animal, plant and viral genomes into discrete genes with defined chemical structure. Since the function of a biological molecule is determined by its structure, defining the structure of a gene is crucial to the eventual manipulation of this basic unit of hereditary information in useful ways. Once genes can be isolated and characterized, they can be modified to produce desired changes in their structure that allow the production of gene products--proteins--with different properties than those possessed by the original proteins. Microorganisms into which the natural or synthetic genes are placed can be used as chemical "factories" to produce large amounts of scarce human proteins such as interferon, growth hormone, and insulin. Plants can be given the genetic information to allow them to survive harsh environmental conditions or produce their own fertilizer.
The development of modern nucleic acid sequencing methods involved parallel developments in a variety of techniques. One was the emergence of simple and reliable methods for cloning small to medium-sized strands of DNA into bacterial plasmids, bacteriophages, and small animal viruses. This allowed the production of pure DNA in quantities sufficient for its chemical analysis. Another was the near perfection of gel electrophoretic methods for high resolution separation of oligonucleotides on the basis of their size. The key conceptual development, however, was the introduction of methods of generating size-nested sets of fragments of cloned, purified DNA that contain, in their collection of lengths, the information necessary to define the sequence of the nucleotides comprising the parent DNA molecules.
Two DNA sequencing methods are in widespread use. These are the method of Sanger, F., Nicklen, S. and Coulson, A. R., Proc. Natl. Acad. Sci. U.S.A. 74, 5463 (1977), and the method of Maxam, A. M. and Gilbert, W., Methods in Enzymology 65, 499-599 (1980).
The method developed by Sanger is referred to as the dideoxy chain termination method. In the most commonly used variation of this method, a DNA segment is cloned into a single-stranded DNA phage such as M13. These phage DNAs can serve as templates for the primed synthesis of the complementary strand by the Klenow fragment of DNA polymerase I. The primer is either a synthetic oligonucleotide or a restriction fragment isolated from the parental recombinant DNA that hybridizes specifically to a region of the M13 vector near the 3' end of the cloned insert. In each of four sequencing reactions, the primed synthesis is carried out in the presence of enough of the dideoxy analog of one of the four possible deoxynucleotides so that the growing chains are randomly terminated by the incorporation of these "dead-end" nucleotides. The relative concentration of dideoxy to deoxy forms is adjusted to give a spread of termination events corresponding to all the possible chain lengths that can be resolved by gel electrophoresis. The products from each of the four primed synthesis reactions are then separated on individual tracks of polyacrylamide gels by electrophoresis. Radioactive tags incorporated in the growing chains are used to develop an autoradiogram image of the pattern of the DNA in each electrophoresis track. The sequence of the deoxynucleotides in the cloned DNA is determined from an examination of the pattern of bands in the four lanes.
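The way the four lanes jointly encode the sequence can be illustrated with a minimal Python sketch. It is an idealized model and not part of the method as published: it assumes every possible termination event occurs and is resolved, it ignores the primer length, and the function names `sanger_lanes` and `read_gel` are hypothetical.

```python
def sanger_lanes(new_strand: str) -> dict[str, list[int]]:
    """Idealized model: for each of the four dideoxy reactions, list the
    lengths of the terminated chains. A chain of length n ends in the
    reaction whose dideoxy analog matches base n of the synthesized strand."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes: dict[str, list[int]]) -> str:
    """Read the four lanes from the shortest band to the longest; the lane
    in which each successive band appears gives the next base."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


if __name__ == "__main__":
    lanes = sanger_lanes("GATTACA")  # hypothetical synthesized strand
    print(lanes)
    print(read_gel(lanes))
```

Because each chain length appears in exactly one lane, sorting the bands by length across all four lanes recovers the order of bases in the synthesized strand, which is the complement of the cloned template.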
The method developed by Maxam and Gilbert uses chemical treatment of purified DNA to generate size-nested sets of DNA fragments analogous to those produced by the Sanger method. Single- or double-stranded DNA, labeled with radioactive phosphate at either the 3' or 5' end, can be sequenced by this procedure. In four sets of reactions, cleavage is induced at one or two of the four nucleotide bases by chemical treatment. Cleavage involves a three-stage process: modification of the base, removal of the modified base from its sugar, and strand scission at that sugar. Reaction conditions are adjusted so that the majority of end-labeled fragments generated are in the size range (typically 1 to 400 nucleotides) that can be resolved by gel electrophoresis. The electrophoresis, autoradiography, and pattern analysis are carried out essentially as is done for the Sanger method. (Although the chemical fragmentation necessarily generates two pieces of DNA each time it occurs, only the piece containing the end label is detected on the autoradiogram.)
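The pattern analysis for reactions that cleave at one or two bases can be sketched in Python as follows. This is an illustration under stated assumptions, not a description taken from the text above: it assumes the commonly described reaction set (G, A+G, C+T, C), a 5' end label, and complete, perfectly resolved cleavage at every position. The names `REACTIONS`, `maxam_gilbert_lanes` and `read_lanes` are hypothetical.

```python
# Assumed reaction set: each reaction cleaves at the listed base or bases.
REACTIONS = {"G": "G", "A+G": "AG", "C+T": "CT", "C": "C"}


def maxam_gilbert_lanes(strand: str) -> dict[str, list[int]]:
    """Idealized model of a 5' end-labeled strand: cleavage at position p
    destroys that base and leaves a labeled fragment of p - 1 nucleotides.
    (The length-0 fragment for the first base is a modeling convenience;
    it would not be seen on a real gel.)"""
    lanes = {name: [] for name in REACTIONS}
    for position, base in enumerate(strand, start=1):
        for name, targets in REACTIONS.items():
            if base in targets:
                lanes[name].append(position - 1)
    return lanes


def read_lanes(lanes: dict[str, list[int]]) -> str:
    """Call one base per band length by comparing the four lanes."""
    calls = []
    for length in sorted(set().union(*lanes.values())):
        present = {name for name, lengths in lanes.items() if length in lengths}
        if "G" in present:        # band in both G and A+G lanes
            calls.append("G")
        elif "A+G" in present:    # band in A+G lane only
            calls.append("A")
        elif "C" in present:      # band in both C and C+T lanes
            calls.append("C")
        else:                     # band in C+T lane only
            calls.append("T")
    return "".join(calls)


if __name__ == "__main__":
    print(read_lanes(maxam_gilbert_lanes("GATTACA")))
```

The design point is that a base cleaved by two reactions is identified by elimination: a band present in the A+G lane but absent from the G lane is read as A, and a band present in the C+T lane but absent from the C lane is read as T.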
Both of these DNA sequencing methods are in widespread use, and each has several variations.
For each, the length of sequence that can be obtained from a single set of reactions is limited primarily by the resolution of the polyacrylamide gels used for electrophoresis. Typically, 200 to 400 bases can be read from a single set of gel tracks. Although successful, both methods have serious drawbacks, problems associated primarily with the electrophoresis procedure. One problem is the need for a radiolabel as a tag for the location of the DNA bands in the gels. One has to contend with the short half-life of phosphorus-32, and hence the instability of the radiolabeling reagents, and with the problems of handling and disposing of radioactive material. More importantly, the nature of autoradiography (the film image of a radioactive gel band is broader than the band itself) and the comparison of band positions between four different gel tracks (which may or may not behave uniformly in terms of band mobilities) can limit the observed resolution of bands and hence the length of sequence that can be read from the gels. In addition, the track-to-track irregularities make automated scanning of the autoradiograms difficult--the human eye can presently compensate for these irregularities much better than computers can. This need for manual "reading" of the autoradiograms makes the process time-consuming, tedious and error-prone. Moreover, one cannot read the gel patterns while the electrophoresis is actually being performed, and so cannot terminate the electrophoresis once resolution becomes insufficient to separate adjoining bands. Instead, one must terminate the electrophoresis at some standardized time and wait for the autoradiogram to be developed before the sequence reading can begin.
The invention of the present patent application addresses these and other problems associated with DNA sequencing procedures and is believed to represent a significant advance in the art. The preferred embodiment of the present invention represents a further and distinct improvement.