Nucleic acid sequencing is important for biological research, clinical diagnostics, personalized medicine, pharmaceutical development, and many other fields. Cost-effective and fast sequencing is needed for many applications, such as, but not limited to, microbial or pathogen detection and identification, and genetic identification of subjects. For example, applications can include, but are not limited to, paternity testing and forensic science (Reynolds et al., Anal. Chem., 63:2-15 (1991)), organ-transplant donor-recipient matching (Buyse et al., Tissue Antigens, 41:1-14 (1993) and Gyllensten et al., PCR Meth. Appl, 1:91-98 (1991)), genetic disease diagnosis, prognosis, and pre-natal counseling (Chamberlain et al., Nucleic Acids Res., 16:11141-11156 (1988) and L. C. Tsui, Human Mutat., 1:197-203 (1992)), and the study of drug metabolism and oncogenic mutations (Hollstein et al., Science, 253:49-53 (1991)). In addition, the cost-effectiveness of nucleic acid analysis, such as for infectious disease diagnosis, varies directly with the multiplex scale in panel testing. Many of these applications depend on the discrimination of single-base differences at a multiplicity of sometimes closely spaced loci.
A variety of DNA hybridization techniques are available for detecting the presence of one or more selected polynucleotide sequences in a sample containing a large number of sequence regions. In a simple method, which relies on fragment capture and labeling, a fragment containing a selected sequence is captured by hybridization to an immobilized probe. The captured fragment can be labeled by hybridization to a second probe which contains a detectable reporter moiety.
Another widely used method is Southern blotting. In this method, a mixture of DNA fragments in a sample is fractionated by gel electrophoresis and then fixed on a nitrocellulose filter. By reacting the filter with one or more labeled probes under hybridization conditions, the presence of bands containing the probe sequences can be identified. The method is especially useful for identifying fragments in a restriction-enzyme DNA digest that contain a given probe sequence and for analyzing restriction-fragment length polymorphisms (“RFLPs”).
Another approach to detecting the presence of a given sequence or sequences in a polynucleotide sample involves selective amplification of the sequence(s) by polymerase chain reaction (U.S. Pat. No. 4,683,202 and R. K. Saiki, et al., Science 230:1350 (1985)). In this method, primers complementary to opposite end portions of the selected sequence(s) are used to promote, in conjunction with thermal cycling, successive rounds of primer-initiated replication. The amplified sequence(s) may be readily identified by a variety of techniques. This approach is particularly useful for detecting the presence of low-copy sequences in a polynucleotide-containing sample, e.g., for detecting pathogen sequences in a body-fluid sample.
More recently, methods of identifying known target sequences by probe ligation have been reported (U.S. Pat. No. 4,883,750; D. Y. Wu, et al., Genomics 4:560 (1989); U. Landegren, et al., Science 241:1077 (1988); and E. Winn-Deen, et al., Clin. Chem. 37:1522 (1991)). In one approach, known as the oligonucleotide ligation assay (“OLA”), two probes or probe elements which span a target region of interest are hybridized to the target region. Where the probe elements base-pair with adjacent target bases, the confronting ends of the probe elements can be joined by ligation, e.g., by treatment with ligase. The ligated probe elements are then assayed, and their detection evidences the presence of the target sequence.
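The ligation condition described above can be illustrated with a minimal sketch in Python. It is an idealized model, not the assay chemistry: it assumes ligation occurs only when both probe elements are perfectly complementary to immediately adjacent target bases, and it ignores ligase fidelity, melting temperature, and reaction conditions. The function names and the example sequences are hypothetical and chosen only for illustration.

```python
# Idealized model of the oligonucleotide ligation assay (OLA).
# Assumption: ligation happens only with perfect base-pairing of both
# probe elements at adjacent positions on the target strand.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def ola_ligates(target: str, upstream_probe: str, downstream_probe: str) -> bool:
    """Return True if the two probes would be joined on this target.

    All sequences are written 5'->3'. The 3' end of `upstream_probe`
    confronts the 5' end of `downstream_probe`, so the ligated product is
    their concatenation, and it must be the reverse complement of a
    contiguous stretch of the target.
    """
    ligated_product = upstream_probe + downstream_probe
    return reverse_complement(ligated_product) in target


# Hypothetical target and probes.
target = "GGATCCGTACGTTAGC"
matched = ola_ligates(target, "TAACG", "TACGG")     # both probes pair perfectly
mismatched = ola_ligates(target, "TAACA", "TACGG")  # 3' base of upstream probe mismatched
print(matched, mismatched)
```

In this sketch a single mismatched base at the junction is enough to prevent ligation, which is the property that lets ligation-based assays discriminate single-base differences.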
In a modification of this approach, the ligated probe elements act as a template for a pair of complementary probe elements. With continued cycles of denaturation, hybridization, and ligation in the presence of pairs of probe elements, the target sequence is amplified linearly, allowing very small amounts of target sequence to be detected and/or amplified. This approach is referred to as the ligase detection reaction. When two complementary pairs of probe elements are utilized, the process is referred to as the ligase chain reaction, which achieves exponential amplification of target sequences (F. Barany, Proc. Nat'l Acad. Sci. USA, 88:189-93 (1991) and F. Barany, PCR Methods and Applications, 1:5-16 (1991)).
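The difference between linear and exponential amplification can be made concrete with a short Python sketch. It is a simplified model under stated assumptions: every cycle has the same per-cycle ligation efficiency, reagents never run out, and only product counts are tracked. The function names and the `efficiency` parameter are hypothetical and do not come from the cited references.

```python
# Simplified product-count model for the ligase detection reaction (LDR)
# and the ligase chain reaction (LCR). Assumptions: constant per-cycle
# efficiency and unlimited probe supply.


def ldr_products(targets: float, cycles: int, efficiency: float = 1.0) -> float:
    """Linear amplification: only the original targets act as templates,
    so each cycle adds the same number of ligated products."""
    return targets * efficiency * cycles


def lcr_products(targets: float, cycles: int, efficiency: float = 1.0) -> float:
    """Exponential amplification: ligated products from earlier cycles
    also act as templates for the complementary probe pair."""
    templates = float(targets)
    for _ in range(cycles):
        templates += templates * efficiency  # every template yields a new product
    return templates - targets  # report newly made products only


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, ldr_products(10, n), lcr_products(10, n))
```

Under these assumptions, 10 target copies yield 200 products after 20 cycles of linear amplification, while the exponential scheme yields roughly ten million, which is why the chain reaction format is used when the starting target is scarce.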
Another scheme for multiplex detection of nucleic acid sequence differences is disclosed in U.S. Pat. No. 5,470,705, in which sequence-specific probes, having a detectable label and a distinctive ratio of charge/translational frictional drag, can be hybridized to a target and ligated together. This technique was used in Grossman, et al., Nucl. Acids Res. 22(21):4527-34 (1994) for the large-scale multiplex analysis of the cystic fibrosis transmembrane regulator gene. Jou, et al., Human Mutation 5:86-93 (1995) relates to the use of a so-called “gap ligase chain reaction” process to simultaneously amplify selected regions of multiple exons, with the amplified products being read on an immunochromatographic strip having antibodies specific to the different haptens on the probes for each exon.
Ligation of allele-specific probes generally has used solid-phase capture (U. Landegren et al., Science, 241:1077-1080 (1988); Nickerson et al., Proc. Natl. Acad. Sci. USA, 87:8923-8927 (1990)) or size-dependent separation (D. Y. Wu, et al., Genomics, 4:560-569 (1989) and F. Barany, Proc. Natl. Acad. Sci. USA, 88:189-193 (1991)) to resolve the allelic signals, the latter method being limited in multiplex scale by the narrow size range of ligation probes. Further, in a multiplex format, the ligase detection reaction alone cannot make enough products to detect and quantify small amounts of target sequences. The gap ligase chain reaction process requires an additional step: polymerase extension. The use of probes with distinctive ratios of charge/translational frictional drag for a more complex multiplex will require either longer electrophoresis times or the use of an alternate form of detection.
Methods for efficiently and accurately sequencing long nucleic acid fragments are needed. There is a great need for rapid, high-throughput, and low-cost sequencing technology, such as for point-of-care applications and field detection of pathogens. Further, most sequencing methods do not distinguish between the multiple copies of DNA that organisms may have. For example, the human genome contains DNA sequences of both maternal and paternal origin. Therefore, polymorphisms may exist at loci and produce multiple different readings at the same locus during standard sequencing, complicating the sequencing process. The present invention permits sequencing of large amounts of genome using simple chemistry and low-cost equipment, leading to significant cost reduction and increased speed, along with other related advantages. In addition, the present invention permits reading one copy of DNA at regions containing variations, such as single nucleotide polymorphisms (SNPs).