DNA Sequencing
Current knowledge of gene structure, the control of gene activity and the function of cells at the molecular level all arose from the determination of the base sequence of millions of DNA molecules. DNA sequencing remains critically important in research and for genetic therapies and diagnostics (e.g., to verify recombinant clones and mutations).
DNA, a polymer of deoxyribonucleotides, is found in all living cells and some viruses. DNA is the carrier of genetic information, which is passed from one generation to the next by homologous replication of the DNA molecule. Information for the synthesis of all proteins is encoded in the sequence of bases in the DNA.
To obtain the genetic information, and therefore to reveal the base sequence of a given DNA molecule, chemical and enzymatic sequencing methods have been developed. DNA sequencing as proposed by Maxam and Gilbert (Maxam, A. M., W. Gilbert, Proc. Natl. Acad. Sci. USA, 74:560-564, 1977) is a chemical method of determining the base sequence of a nucleic acid molecule. A single-stranded DNA molecule with a radioactive label at its 5' end is chemically modified in four base-specific reactions and then cleaved at the modified positions. The cleavage products are separated on a polyacrylamide gel and are typically detected by autoradiography.
Currently favoured is the enzymatic chain termination reaction of the Sanger sequencing method (Sanger, F. et al., Proc. Natl. Acad. Sci. USA, 74:5463-5467, 1977). In the Sanger method, four base-specific sets of DNA fragments are formed starting from a primer/template system: a DNA polymerase elongates the primer into the unknown sequence region, copying the template and synthesizing complementary strands in the presence of chain-terminating reagents. Chain termination is achieved by adding to each of four separate reaction mixtures, in addition to the four normal deoxynucleoside triphosphates (dATP, dGTP, dTTP and dCTP), only one of the chain-terminating dideoxynucleoside triphosphates (ddATP, ddGTP, ddTTP or ddCTP) at a low, limiting concentration. Because a ddNTP lacks the 3' hydroxyl group, its incorporation into the growing DNA strand prevents the DNA polymerase from forming the next 3'-5' phosphodiester bond and so terminates the chain. Because the ddNTPs are incorporated at random, each reaction yields a population of base-specifically terminated fragments of different lengths, which together represent the sequenced DNA molecule.
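The way the four base-specific fragment sets encode the sequence can be illustrated with a minimal Python sketch. It is an idealized model, not a description of any laboratory protocol: it assumes every possible termination position is represented in each reaction, and the function names (reverse_complement, sanger_ladders, read_ladders) and example sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sanger_ladders(template, primer):
    """Idealized fragment lengths for the four ddNTP reactions.

    template: template strand, 5'->3'.
    primer:   primer, 5'->3', annealing to the 3' end of the template.
    Returns a dict mapping each terminating base to the list of fragment
    lengths (primer included) that end with that base.
    """
    product = reverse_complement(template)  # full-length extension product
    if not product.startswith(primer):
        raise ValueError("primer does not anneal to the 3' end of the template")
    ladders = {base: [] for base in "ACGT"}
    for pos in range(len(primer), len(product)):
        # A ddNTP incorporated at `pos` yields a fragment of length pos + 1.
        ladders[product[pos]].append(pos + 1)
    return ladders


def read_ladders(ladders):
    """Read the newly synthesized sequence by ordering fragments by length."""
    base_by_length = {
        length: base for base, lengths in ladders.items() for length in lengths
    }
    return "".join(base_by_length[n] for n in sorted(base_by_length))


ladders = sanger_ladders("GACCATGCTAA", "TTAGC")
print(ladders)                # fragment lengths per ddNTP reaction
print(read_ladders(ladders))  # sequence synthesized beyond the primer
```

Sorting all fragments by length and noting which reaction each came from mirrors reading the four lanes of a gel from the shortest fragment upward.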
A recent modification of the Sanger sequencing strategy involves the degradation of phosphorothioate-containing DNA fragments obtained by using alpha-thio dNTPs instead of the normally used ddNTPs during the primer extension reaction mediated by DNA polymerase (Labeit et al., DNA 5, 173-177 (1986); Amersham, PCT-Application GB86/00349; Eckstein et al., Nucleic Acids Res. 16, 9947 (1988)). Here, the four sets of base-specific sequencing ladders are obtained by limited digestion with exonuclease III or snake venom phosphodiesterase, followed by separation by PAGE and visualization by radioisotopic labeling of either the primer or one of the dNTPs. In a further modification, the base-specific cleavage is achieved by alkylating the sulphur atom in the modified phosphodiester bond, followed by a heat treatment (Max-Planck-Gesellschaft, DE 3930312 A1).
DNA Amplification
DNA can be amplified by a variety of procedures including cloning (Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, 1989), polymerase chain reaction (PCR) (C. R. Newton and A. Graham, PCR, BIOS Publishers, 1994), ligase chain reaction (LCR) (F. Barany, Proc. Natl. Acad. Sci. USA 88, 189-93 (1991)), strand displacement amplification (SDA) (G. Terrance Walker et al., Nucleic Acids Res. 22, 2670-77 (1994)) and variations such as RT-PCR, allele-specific amplification (ASA), etc.
The polymerase chain reaction (Mullis, K. et al., Methods Enzymol., 155:335-350, 1987) permits the selective in vitro amplification of a particular DNA region by mimicking the phenomenon of in vivo DNA replication. The required reaction components are single-stranded DNA, primers (oligonucleotide sequences complementary to the 5' and 3' ends of a defined sequence of the DNA template), deoxynucleoside triphosphates and a DNA polymerase enzyme. Typically, the single-stranded DNA is generated by heat denaturation of the provided double-stranded DNA. The reaction buffers contain magnesium ions and co-solvents for optimum enzyme stability and activity.
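How two primers delimit the amplified region can be sketched in Python. This is a simplified in silico illustration that assumes exact primer matches and ignores annealing temperature and mismatches. The function names and sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_amplicon(top_strand, forward_primer, reverse_primer):
    """Return the region of `top_strand` bounded by the two primers.

    top_strand:     one strand of the template, 5'->3'.
    forward_primer: identical to a stretch of the top strand
                    (it anneals to the bottom strand).
    reverse_primer: complementary to a downstream stretch of the top strand.
    Returns None when either primer has no exact binding site.
    """
    start = top_strand.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = top_strand.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    # The product runs from the 5' end of one primer to the 5' end of the other.
    return top_strand[start:end + len(reverse_site)]


template = "AAGATCGTACCCTTTGGGACGTCAATT"
amplicon = find_amplicon(template, "GATCGT", "TGACGT")
print(amplicon, len(amplicon))
```

The returned string spans both primer sequences, so its length is fixed by where the 5' ends of the two primers fall on the template.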
The amplification results from repeated cycles of denaturation, primer annealing and primer extension, in the following manner: the two different primers, each of which binds selectively to one of the complementary strands, are extended in the first cycle of amplification. Each newly synthesized DNA strand then contains a binding site for the other primer. Therefore each new DNA strand becomes a template in every further cycle of amplification, enlarging the template pool from cycle to cycle. Repeated cycles theoretically lead to exponential synthesis of a DNA fragment with a length defined by the 5' termini of the primers.
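The theoretical exponential growth can be made concrete with a short calculation. The sketch below assumes the standard idealization that the number of copies doubles in each cycle. The efficiency parameter is an assumption added here for illustration (1.0 means perfect doubling) and does not appear in the text above; the function name pcr_copies is likewise hypothetical.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a given number of PCR cycles.

    efficiency is the assumed fraction of templates copied per cycle:
    1.0 gives perfect doubling, lower values model incomplete extension.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


for n in (10, 20, 30):
    print(n, pcr_copies(1, n))
```

With perfect doubling, a single template molecule gives 2^n copies after n cycles. In practice the yield levels off well below this figure, which is why the growth is described as exponential only in theory.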
Initial PCR experiments used thermolabile DNA polymerase, which had to be added to the reaction mixture again after each denaturation step. Major advances in PCR practice were the development of a polymerase that is stable at near-boiling temperatures (Saiki, R. et al., Science 239:487-491, 1988) and the development of automated thermal cyclers.
The discovery of thermostable polymerases also allowed the Sanger sequencing reaction to be modified, with significant advantages. The polymerization reaction could be carried out at high temperature with a thermostable DNA polymerase in a cyclic manner (cycle sequencing). The conditions of the cycles are similar to those of the PCR technique and comprise denaturation, annealing and extension steps. Depending on the length of the primers, a single annealing step at the beginning of the reaction may be sufficient. Carrying out a sequencing reaction at high temperature in a cyclic manner has several advantages. Each DNA strand can serve as a template in every new cycle of extension, which reduces the amount of DNA necessary for sequencing and so makes very small quantities of DNA accessible. The higher temperature also improves the specificity of primer hybridisation and reduces secondary structures in the template strand.
However, amplification of the terminated fragments is linear in conventional cycle sequencing approaches. A recently developed method, called semi-exponential cycle sequencing, shortens the time required and increases the extent of amplification obtained relative to conventional cycle sequencing by using a second, reverse primer in the sequencing reaction. However, the reverse primer only generates additional template strands if it avoids being terminated before reaching the sequencing primer binding site. Terminated fragments generated by the reverse primer cannot serve as a sufficient template. Therefore, in practice, amplification by the semi-exponential approach is not entirely exponential (Sarkar, G. and Bolander, Mark E., Semi Exponential Cycle Sequencing, Nucleic Acids Research, 1995, Vol. 23, No. 7, p. 1269-1270).
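Why the reverse primer seldom yields a usable template can be illustrated with a rough probability estimate. The sketch below is a toy model and is not taken from the cited paper: it assumes that, in a reaction containing one ddNTP, termination can occur only where that base is incorporated, and that each such site terminates independently with the same probability. The function name, parameter names and numerical values are hypothetical.

```python
def read_through_probability(extension_product, dd_base, p_terminate):
    """Chance that an extension incorporates no ddNTP along its whole length.

    extension_product: the strand the reverse primer would synthesize up to
                       the sequencing primer binding site, 5'->3'.
    dd_base:           the base supplied as ddNTP in this reaction.
    p_terminate:       assumed probability of termination at each
                       position where `dd_base` is incorporated.
    """
    if not 0.0 <= p_terminate <= 1.0:
        raise ValueError("p_terminate must lie between 0 and 1")
    sites = extension_product.count(dd_base)
    return (1.0 - p_terminate) ** sites


# The longer the stretch to be copied, the fewer extensions reach its end.
for repeats in (10, 50, 100):
    stretch = "ACGT" * repeats
    print(len(stretch), read_through_probability(stretch, "A", 0.02))
```

Under these assumptions the fraction of full-length reverse-primer products falls geometrically with the number of potential termination sites, which is consistent with the observation that the amplification is not entirely exponential.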
As pointed out above, current nucleic acid sequencing methods require relatively large amounts (typically about 1 µg) of highly purified DNA template. Often, however, only a small amount of template DNA is available. Although amplification may be performed, amplification procedures are typically time-consuming and can be limited in the amount of amplified template produced, and the amplified DNA must be purified prior to sequencing. A streamlined process for amplifying and sequencing DNA is needed, particularly to facilitate high-throughput nucleic acid sequencing.