Large-scale sequencing projects typically involve the generation of libraries of progressively smaller clones of portions of the polynucleotide whose sequence is to be determined. Genomic DNA is fragmented and inserted into yeast artificial chromosomes (YACs) or cosmids whose inserts, in turn, are fragmented and inserted into phage or plasmid vectors for sequencing, e.g. Hunkapiller et al, Science, 254: 59-67 (1991). Although large-scale sequencing projects can be carried out by either so-called "directed" or "random" strategies, both approaches involve at least one or two labor intensive steps where templates are prepared for sequencing by one or another variant of the Sanger chain-termination method.
Many proposals have been made for reducing or eliminating these labor intensive steps. For example, one directed strategy involves an initial round of sequencing with a vector-specific "universal" primer followed by repetitive cycles of synthesis of a new sequencing primer generated from the just-acquired sequence information and subsequent new sequence determination with the new primer. In such a manner, one may "walk" along a relatively large sequencing template with a succession of newly determined primers without the need to fragment and subclone the template. A drawback of such an approach is the difficulty of acquiring the new primer at each cycle for making the next round of extensions. Either the process is rendered intolerably slow while one waits for the next primer to be synthesized, or the process is rendered impractical by the need to maintain a library of primers of every possible sequence which, for example, could be more than 1×10^9 for a primer 15 nucleotides in length. A proposal to mitigate this difficulty has been made that calls for primers that are assembled from a library of shorter oligonucleotides, such as pentamers or hexamers, e.g., Kotler et al, Proc. Natl. Acad. Sci., 90: 4241-4245 (1993); Kieleczawa et al, Science, 258: 1787-1791 (1992); and the like. But even with hexamers, a library of at least 4096 oligonucleotides is required.
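The library sizes quoted above follow from counting every possible sequence of a given length over the four nucleotides, i.e. 4 raised to the primer length. The short sketch below reproduces that arithmetic; the function name `primer_library_size` and its parameters are illustrative assumptions and do not come from the cited references.

```python
def primer_library_size(length: int, alphabet_size: int = 4) -> int:
    """Count the distinct oligonucleotides of a given length.

    Assumes every position may independently be any of `alphabet_size`
    bases (4 for A, C, G, T), so the count is alphabet_size ** length.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    return alphabet_size ** length


# Pentamers, hexamers, and the 15-nucleotide primers discussed in the text.
for n in (5, 6, 15):
    print(f"{n}-mers: {primer_library_size(n):,} possible sequences")
```

Under this counting, a complete 15-mer library needs 1,073,741,824 members, which is the "more than 1×10^9" figure, while a complete hexamer library needs 4096.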
Besides the problem of template preparation, as mentioned above, both directed and random approaches employ the Sanger chain-termination method of sequencing which requires the generation of sets of labeled DNA fragments, each fragment having a common origin and terminating with a known base. The sets of fragments are typically separated by high resolution gel electrophoresis, which must have the capacity of distinguishing very large fragments differing in size by no more than a single nucleotide. Unfortunately, several significant technical problems have seriously impeded efficient scale-up of Sanger-based approaches, either for accommodating longer sequences or for accommodating high-volume sequencing absent massive capital and labor investment. Such problems include i) the gel electrophoretic separation step which is labor intensive, is difficult to automate, and introduces an extra degree of variability in the analysis of data, e.g. band broadening due to temperature effects, compressions due to secondary structure in the DNA sequencing fragments, inhomogeneities in the separation gel, and the like; ii) nucleic acid polymerases whose properties, such as processivity, fidelity, rate of polymerization, rate of incorporation of chain terminators, and the like, are often sequence dependent; iii) detection and analysis of DNA sequencing fragments which are typically present in fmol quantities in spatially overlapping bands in a gel; iv) lower signals because the labeling moiety is distributed over the many hundred spatially separated bands rather than being concentrated in a single homogeneous phase, and v) in the case of single-lane fluorescence detection, the availability of dyes with suitable emission and absorption properties, quantum yield, and spectral resolvability, e.g. Trainor, Anal. Biochem., 62: 418-426 (1990); Connell et al, Biotechniques, 5: 342-348 (1987); Karger et al, Nucleic Acids Research, 19: 4955-4962 (1991); Fung et al, U.S. Pat. No. 4,855,225; and Nishikawa et al, Electrophoresis, 12: 623-631 (1991).
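The requirement that the separation resolve fragments differing by a single nucleotide can be illustrated with a highly idealized model of the chain-termination fragment sets. The sketch below assumes one error-free fragment for every position of the newly synthesized strand, each sharing a common origin, and ignores every gel, enzyme, and detection effect listed above; the function names `termination_fragments` and `read_sequence` are hypothetical and serve only this illustration.

```python
from typing import Dict, List


def termination_fragments(synthesized: str) -> Dict[str, List[int]]:
    """Group fragment lengths by their terminating base.

    Idealized assumption: for every position i of the synthesized strand
    there is exactly one fragment of length i that starts at the common
    origin and ends with the base found at position i.
    """
    sets: Dict[str, List[int]] = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized, start=1):
        if base not in sets:
            raise ValueError(f"unexpected base: {base!r}")
        sets[base].append(length)
    return sets


def read_sequence(sets: Dict[str, List[int]]) -> str:
    """Recover the sequence by ordering all fragments by length.

    This ordering step stands in for the electrophoretic separation: it
    only works if lengths differing by one nucleotide are distinguishable.
    """
    by_length = sorted(
        (length, base) for base, lengths in sets.items() for length in lengths
    )
    return "".join(base for _, base in by_length)


fragments = termination_fragments("GATTACA")
print(fragments)
print(read_sequence(fragments))
```

In this model the sequence is recovered only because consecutive lengths can be ordered unambiguously, which is the role the high resolution gel plays in practice.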
An important advance in sequencing technology could be made if an alternative approach were available for sequencing DNA (i) that did not require high resolution electrophoretic separations of DNA fragments, (ii) that reduced the number of templates required in large-scale sequencing projects, and (iii) that was amenable to simultaneous, or parallel, application to multiple target polynucleotides.