The development of reliable methods for sequence analysis of DNA (deoxyribonucleic acid) and RNA (ribonucleic acid) has been one of the keys to the success of recombinant DNA and genetic engineering. When used with the other techniques of modern molecular biology, nucleic acid sequencing allows dissection and analysis of animal, plant and viral genomes into discrete genes with defined chemical structure. Since the function of a biological molecule is determined by its structure, defining the structure of a gene is crucial to the eventual manipulation of this basic unit of hereditary information in useful ways. Once genes can be isolated and characterized, they can be modified to produce desired changes in their structure that allow the production of gene products—proteins—with different properties than those possessed by the original proteins. Microorganisms into which the natural or synthetic genes are placed can be used as chemical “factories” to produce large amounts of scarce human proteins such as interferon, growth hormone, and insulin. Plants can be given the genetic information to allow them to survive harsh environmental conditions or produce their own fertilizer.
The development of modern nucleic acid sequencing methods involved parallel developments in a variety of techniques. One was the emergence of simple and reliable methods for cloning small to medium-sized strands of DNA into bacterial plasmids, bacteriophages, and small animal viruses. This allowed the production of pure DNA in quantities sufficient for chemical analysis. Another was the near perfection of gel electrophoretic methods for high resolution separation of oligonucleotides on the basis of their size. The key conceptual development, however, was the introduction of methods of generating size-nested sets of fragments of cloned, purified DNA that contain, in their collection of lengths, the information necessary to define the sequence of the nucleotides comprising the parent DNA molecules.
Two DNA sequencing methods are in widespread use. These are the method of Sanger, F., Nicklen, S. and Coulson, A. R., Proc. Natl. Acad. Sci. U.S.A. 74, 5463 (1977), and the method of Maxam, A. M. and Gilbert, W., Methods in Enzymology 65, 499-599 (1980).
The method developed by Sanger is referred to as the dideoxy chain termination method. In the most commonly used variation of this method, a DNA segment is cloned into a single-stranded DNA phage such as M13. These phage DNAs can serve as templates for the primed synthesis of the complementary strand by the Klenow fragment of DNA polymerase I. The primer is either a synthetic oligonucleotide or a restriction fragment isolated from the parental recombinant DNA that hybridizes specifically to a region of the M13 vector near the 3′ end of the cloned insert. In each of four sequencing reactions, the primed synthesis is carried out in the presence of enough of the dideoxy analog of one of the four possible deoxynucleotides so that the growing chains are randomly terminated by the incorporation of these “dead-end” nucleotides. The relative concentration of dideoxy to deoxy forms is adjusted to give a spread of termination events corresponding to all the possible chain lengths that can be resolved by gel electrophoresis. The products from each of the four primed synthesis reactions are then separated by electrophoresis on individual tracks of polyacrylamide gels. Radioactive tags incorporated in the growing chains are used to develop an autoradiogram image of the pattern of the DNA in each electrophoresis track. The sequence of the deoxynucleotides in the cloned DNA is determined from an examination of the pattern of bands in the four lanes.
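The way a sequence is recovered from the four lanes can be illustrated with a small idealized sketch. It assumes that every possible termination product is present and perfectly resolved, which a real gel does not guarantee. The function names and the primer length of 17 are hypothetical and are not taken from the method as published.

```python
def dideoxy_lanes(synthesized, primer_len=17):
    """Idealized four-lane run: one band per termination event.

    `synthesized` is the newly made strand, read 5' to 3' starting just
    after the primer. `primer_len` is a hypothetical primer length that
    is added to every fragment length.
    """
    lanes = {base: [] for base in "ACGT"}
    for position, base in enumerate(synthesized, start=1):
        # A chain terminated by the dideoxy analog of `base` ends here.
        lanes[base].append(primer_len + position)
    return lanes


def read_gel(lanes):
    """Read the bands from shortest to longest across all four lanes."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = dideoxy_lanes("ACGTATGGC")
print(lanes["G"])
print(read_gel(lanes))
```

Each fragment length occurs in exactly one lane, so sorting the bands by length and noting the lane of each one spells out the synthesized strand. The cloned template strand is its complement.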
The method developed by Maxam and Gilbert uses chemical treatment of purified DNA to generate size-nested sets of DNA fragments analogous to those produced by the Sanger method. Single or double-stranded DNA, labeled with radioactive phosphate at either the 3′ or 5′ end, can be sequenced by this procedure. In four sets of reactions, cleavage is induced at one or two of the four nucleotide bases by chemical treatment. Cleavage involves a three-stage process: modification of the base, removal of the modified base from its sugar, and strand scission at that sugar. Reaction conditions are adjusted so that the majority of end-labeled fragments generated are in the size range (typically 1 to 400 nucleotides) that can be resolved by gel electrophoresis. The electrophoresis, autoradiography, and pattern analysis are carried out essentially as is done for the Sanger method. (Although the chemical fragmentation necessarily generates two pieces of DNA each time it occurs, only the piece containing the end label is detected on the autoradiogram.)
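The pattern analysis for the chemical method differs slightly, because some reactions cleave at two bases. The sketch below assumes the classic reaction set (G, A+G, C+T, C), which the text above does not name, and a 5′ end label. It is an idealized model with hypothetical function names: cleavage at position i is taken to leave a labeled fragment of i - 1 nucleotides, and a zero-length fragment is treated as undetectable.

```python
# Assumed reaction set: lane name -> bases at which cleavage occurs.
REACTIONS = {"G": "G", "A+G": "AG", "C+T": "CT", "C": "C"}


def chemical_lanes(seq):
    """Labeled fragment lengths per lane for a 5' end-labeled strand."""
    lanes = {name: set() for name in REACTIONS}
    for position, base in enumerate(seq, start=1):
        if position == 1:
            continue  # would leave a zero-length fragment; not detected
        for name, targets in REACTIONS.items():
            if base in targets:
                # The modified base is removed, so the labeled piece
                # keeps only the nucleotides before it.
                lanes[name].add(position - 1)
    return lanes


def read_lanes(lanes):
    """Call one base per fragment length from the lane pattern."""
    calls = []
    for length in sorted(set().union(*lanes.values())):
        if length in lanes["G"]:
            calls.append("G")  # band in both G and A+G
        elif length in lanes["A+G"]:
            calls.append("A")  # band in A+G only
        elif length in lanes["C"]:
            calls.append("C")  # band in both C and C+T
        else:
            calls.append("T")  # band in C+T only
    return "".join(calls)


print(read_lanes(chemical_lanes("ACGTATGGC")))
```

In this model the base nearest the label is not recovered, so the read begins at the second nucleotide.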
Both of these DNA sequencing methods are in widespread use, and each has several variations.
For each, the length of sequence that can be obtained from a single set of reactions is limited primarily by the resolution of the polyacrylamide gels used for electrophoresis. Typically, 200 to 400 bases can be read from a single set of gel tracks. Although successful, both methods have serious drawbacks, problems associated primarily with the electrophoresis procedure. One problem is the requirement of the use of radiolabel as a tag for the location of the DNA bands in the gels. One has to contend with the short half-life of phosphorus-32, and hence the instability of the radiolabeling reagents, and with the problems of radioactive disposal and handling. More importantly, the nature of autoradiography (the film image of a radioactive gel band is broader than the band itself) and the comparison of band positions between four different gel tracks (which may or may not behave uniformly in terms of band mobilities) can limit the observed resolution of bands and hence the length of sequence that can be read from the gels. In addition, the track-to-track irregularities make automated scanning of the autoradiograms difficult—the human eye can presently compensate for these irregularities much better than computers can. This need for manual “reading” of the autoradiograms is time-consuming, tedious and error-prone. Moreover, one cannot read the gel patterns while the electrophoresis is actually being performed, so as to be able to terminate the electrophoresis once resolution becomes insufficient to separate adjoining bands, but must terminate the electrophoresis at some standardized time and wait for the autoradiogram to be developed before the sequence reading can begin.
An oligonucleotide is a short polymer consisting of a linear sequence of nucleotides, drawn from the four nucleotide types, in a defined order. The nucleotide subunits are joined by phosphodiester linkages joining the 3′ hydroxyl moiety of one nucleotide to the 5′ hydroxyl moiety of the next nucleotide. An example of an oligonucleotide is 5′ ApCpGpTpApTpGpGpCp 3′. The letters A, C, G and T refer to the nature of the purine or pyrimidine base coupled at the 1-position of deoxyribose: A, adenine; C, cytosine; G, guanine; T, thymine. The letter p represents the phosphodiester bond. The structure of a section of an oligonucleotide is shown below.

The single stranded oligonucleotides of this invention are further characterized by being homogeneous with respect to the sequence of the nucleoside subunits and by being of uniform molecular weight.
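The shorthand used in the example above (5′ ApCpGpTpApTpGpGpCp 3′) can be read mechanically. The following is a minimal sketch with hypothetical function names. It assumes the notation uses only the letters A, C, G and T for bases and a lowercase p for each phosphate, with optional 5′ and 3′ end markers.

```python
def parse_oligo(notation):
    """Return the base sequence, 5' to 3', from the phosphate shorthand."""
    body = notation.replace("\u2032", "'").strip()  # accept the prime sign
    if body.startswith("5'"):
        body = body[2:]
    if body.endswith("3'"):
        body = body[:-2]
    bases = []
    for symbol in body.strip():
        if symbol in "ACGT":
            bases.append(symbol)
        elif symbol != "p":  # p marks a phosphate and carries no base
            raise ValueError(f"unexpected symbol: {symbol!r}")
    return "".join(bases)


def internal_linkages(sequence):
    """Number of 3'-to-5' phosphodiester linkages between adjacent subunits."""
    return max(len(sequence) - 1, 0)


sequence = parse_oligo("5\u2032 ApCpGpTpApTpGpGpCp 3\u2032")
print(sequence, internal_linkages(sequence))
```

A preparation that is homogeneous in sequence would give the same parsed string for every molecule, which is also why its molecular weight is uniform.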
Synthetic oligonucleotides are powerful tools in modern molecular biology and recombinant DNA work. There are numerous applications for these molecules, including a) as probes for the isolation of specific genes based on the protein sequence of the gene product, b) to direct the in vitro mutagenesis of a desired gene, c) as primers for DNA synthesis on a single-stranded template, d) as steps in the total synthesis of genes, and many more, reviewed in Wm. R. Bahl et al, Prog. Nucl. Acid Res. Mol. Biol., 21, 101 (1978).
A very considerable amount of effort has therefore been devoted to the development of efficient chemical methods for the synthesis of such oligonucleotides. A brief review of these methods as they have developed to the present is found in Crockett, G. C., Aldrichimica Acta 16(3), 47–55 (1983). The best methodology currently available utilizes the phosphoramidite derivatives of the nucleosides in combination with a solid phase synthetic procedure, Matteucci et al, J. Am. Chem. Soc., 103, 3185 (1981); and Beaucage et al, M. H. Tet. Lett., 22 (20), 1858-1862 (1981). Oligonucleotides of length up to 30 bases may be made on a routine basis in this manner, and molecules as long as 50 bases have been made. Machines that employ this technology are now commercially available.
There are other reports in the literature of the derivatization of DNA. A modified nucleoside triphosphate has been developed wherein a biotin group is conjugated to an aliphatic amino group at the 5 position of uracil, Langer et al, Proc. Nat. Acad. Sci., U.S.A., 78, 6633-6637 (1981). This nucleotide derivative is effectively incorporated into double stranded DNA. Once in DNA it may be bound by anti-biotin antibody, which can then be used for detection by fluorescence or enzymatic methods. The DNA which has had biotin conjugated nucleosides incorporated therein by the method of Langer et al is fragmented into smaller single and double stranded pieces which are heterogeneous with respect to the sequence of nucleoside subunits and variable in molecular weight. Draper and Gold, Biochemistry, 19, 1774-1781 (1980), reported the introduction of aliphatic amino groups by a bisulfite catalyzed transamination reaction, and their subsequent reaction with a fluorescent tag. In Draper and Gold the amino group is attached directly to the pyrimidine base. The amino group so positioned inhibits hydrogen bonding, and for this reason these materials are not useful in hybridization and the like. Chu et al, Nucleic Acid Res. 11(18), 6513-6529 (1983), have reported a method for attaching an amine to the terminal 5′ phosphate of oligonucleotides or nucleic acids.
There are many reasons to want a method for covalently attaching other chemical species to synthetic oligonucleotides. Fluorescent dyes attached to the oligonucleotides permit one to eliminate radioisotopes from the research, diagnostic and clinical procedures in which they are used, and improve shelf life and availability. As described in the assignee's co-pending application for a DNA sequencing machine (Serial No. ), the synthesis of fluorescent-labeled oligonucleotides permits the automation of the DNA sequencing process.
The invention of the present patent application addresses these and other problems associated with DNA sequencing procedures and is believed to represent a significant advance in the art. The preferred embodiment of the present invention represents a further and distinct improvement.