“Sequencing-by-synthesis” of the type known as “sequencing using cyclic reversible termination” (SuCRT) is a strategy that extends a primer by template-directed addition of one nucleotide at a time, using a nucleoside triphosphate or thiotriphosphate as the source of the added building block. Polymerization is stopped for a time after each nucleotide is incorporated. In that time, the extended primer is examined to determine which nucleotide was incorporated, and to infer the nucleotide in the template that directed the incorporation.
One mechanism to cause polymerization to stop is to have the 3′-hydroxyl group of the incoming nucleotide blocked by a removable protecting (or blocking) group. This blocking group prevents the polymerase from adding additional nucleotides until the blocking group is removed. In practice, this provides an arbitrarily long time to determine the nature of the added nucleotide.
One strategy to determine the identity of the nucleotide added is to have each nucleotide carry a fluorescent tag, where the color of the fluorescence emission is distinctive for the type of nucleotide. After extension, but before removing the blocking group, the nature of the nucleotide incorporated is determined by reading the fluorescence from the tag. After this is done, the tag and the 3′-protecting group are removed, and the next cycle of sequencing is initiated. In this architecture, template-directed polymerization is done using a DNA polymerase or a reverse transcriptase.
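The extend, read, and deblock cycle described above can be illustrated with a minimal sketch. It is a toy model only: the dye color assignments, the function name `run_sucrt_cycles`, and the assumption of perfect incorporation and cleavage in every cycle are illustrative and are not taken from the cited art.

```python
# Toy model of the SuCRT cycle: extend by one blocked, tagged nucleotide,
# read the tag color, then remove tag and 3'-blocking group (a no-op here).

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hypothetical color assignment; any four distinguishable colors would do.
DYE_COLOR = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
COLOR_TO_BASE = {color: base for base, color in DYE_COLOR.items()}


def run_sucrt_cycles(template, primer_length):
    """Return (nucleotides added to the primer, inferred template bases).

    `template` is written in the order in which the polymerase reads it,
    and the first `primer_length` positions are already paired to the primer.
    Every step is assumed to proceed with perfect fidelity and yield.
    """
    added = []
    inferred = []
    for template_base in template[primer_length:]:
        # Extension: one 3'-blocked, dye-tagged nucleotide is incorporated.
        incorporated = COMPLEMENT[template_base]
        # Detection: the color identifies the incorporated nucleotide.
        color = DYE_COLOR[incorporated]
        called = COLOR_TO_BASE[color]
        added.append(called)
        # Inference: the template base is the complement of the called base.
        inferred.append(COMPLEMENT[called])
        # Cleavage of tag and blocking group would occur here before the
        # next cycle; in this idealized model it always succeeds.
    return "".join(added), "".join(inferred)
```

For example, `run_sucrt_cycles("TACGGA", 2)` models four cycles on the unpaired portion `CGGA` of the template.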
When the output is fluorescence, this implementation of the strategy requires:
(a) Four analogues of dATP, dTTP, dGTP, and dCTP, each carrying a fluorescent dye with a different color, with the 3′-end blocked so that immediate elongation is not possible.
(b) The four analogues must be incorporated efficiently enough to allow the elongation reaction to be completed before undesired reactions occur, and to avoid ragged ends from incomplete incorporation.
(c) The incorporation must be substantially faithful. Mismatched incorporation, if not corrected by proofreading, will lead to the loss of strands if the polymerase does not efficiently extend a terminal mismatch. This will gradually erode the intensity of the signal, and may generate “out of phase” signals that confuse the reading of the output downstream.
(d) The dye and the group blocking the 3′-OH group must be cleaved with high yield to allow the incorporation of the next nucleotide to proceed. Incomplete cleavage will erode the intensity of the signal or generate “out of phase” signals that confuse downstream reading. For single molecule sequencing, failure to cleave the 3′-OH blocking group may lose a cycle of sequence data collection.
(e) The growing strand of DNA should survive the washing, detecting and cleaving processes. While reannealing is possible, conditions that allow the DNA primer and template to remain annealed are preferable.
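The signal erosion described in requirements (c) and (d) can be made concrete with a simple worked calculation. The sketch below assumes, as a simplification not stated in the source, that incorporation and cleavage each succeed independently with a fixed per-cycle yield, and that any strand failing either step drops out of phase and never rejoins. The function name `in_phase_fraction` is illustrative.

```python
def in_phase_fraction(incorporation_yield, cleavage_yield, cycles):
    """Fraction of strands still in phase after `cycles` cycles.

    Simplifying assumptions (not from the cited art): both yields are
    constant, independent of sequence context, and a strand that misses
    either step in any cycle stays out of phase for the rest of the run.
    """
    if not (0.0 <= incorporation_yield <= 1.0 and 0.0 <= cleavage_yield <= 1.0):
        raise ValueError("yields must lie between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    per_cycle = incorporation_yield * cleavage_yield
    return per_cycle ** cycles
```

Under these assumptions, a 99% per-cycle yield in one step alone (with the other step perfect) leaves roughly 37% of the strands in phase after 100 cycles, which illustrates why both steps must proceed in high yield for long reads.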
In their most ambitious forms, sequencing-by-synthesis architectures would use the same nucleoside modification to block the 3′-end of the DNA and to introduce the fluorescent tag [We199]. For example, if a fluorescent tag is attached to the 3′-position via an ester linkage, replacing the hydrogen atom of the 3′-OH group of the nucleoside triphosphate, extension following incorporation would not be possible (there is no free 3′-OH group). This would give time to read the color of the fluorescent label, determining the nature of the nucleotide added. Then, the 3′-O acyl group could be removed by treatment with a mild nucleophile (such as hydroxylamine) under mild conditions (pH<10) to regenerate a free 3′-hydroxyl group, preparing the DNA for the next cycle.
The difficulty in implementing this elegant approach lies in the polymerases themselves. Any tag that fluoresces in a useful region of the electromagnetic spectrum must be large, on the order of 1 nm. Crystal structures of polymerases show that the 3′-position in the deoxyribose unit is close to amino acid residues in the active site of the polymerase, which does not offer the incoming triphosphate the space to accommodate a tag of that size. The polymerase, therefore, is not likely to be able to handle substituents having a tag of this size at the 3′-position. Indeed, polymerases do not work well with any modification of the 3′-OH group of the incoming triphosphate. For example, to accept even 2′,3′-dideoxynucleoside analogues (where the 3′-moiety is smaller than in the natural nucleoside), mutated polymerases are often beneficial.
Ju et al., in U.S. Pat. No. 6,664,079, noted these problems as they outlined a proposal for SuCRT based on various 3′-OH blocking groups. They suggested that a fluorescent or mass tag could be attached via a cleavable linker to a point on the nucleoside triphosphate other than the 3′-OH unit (FIG. 1). This linker could be attached (without limitation) to the 5-position of the pyrimidines (T and C) and the 7-position of the purines (G and A). According to U.S. Pat. No. 6,664,079, tags at these positions should, in principle, allow the 3′-OH group to be blocked by a cleavable moiety that is small enough to be accepted by DNA polymerases. In this architecture, multiple cleavage steps might be required to remove both the tag (to make the system clean for the addition of the next tag) and the 3′-blocking group, to permit the next cycle of extension to occur [Mit03][Seo04].
U.S. Pat. No. 6,664,079 struggled to find a small chemical group that might be accepted by polymerases, and could be removed under conditions that were not so harsh as to destroy the DNA being sequenced. U.S. Pat. No. 6,664,079 cited a literature report that 3′-O-methoxy-deoxynucleotides are good substrates for several polymerases [Axe78]. It noted, correctly, that the conditions for removing a 3′-O methyl group were too stringent to permit this blocking group to be removed under any conditions that were likely to leave the DNA being sequenced, or the primer that was being used, largely intact.
An ester group was also discussed as a way to cap the 3′-OH group of the nucleotide. U.S. Pat. No. 6,664,079 discarded this blocking group based on a report that esters are cleaved in the active site of DNA polymerase [Can95]. It should be noted that this report is questionable, and considers only a single polymerase. Nevertheless, ester linkages are susceptible to spontaneous hydrolysis in water, especially if they are small (such as the formyl group).
Chemical groups with electrophiles such as ketone groups were also considered and discarded by U.S. Pat. No. 6,664,079 as not being suitable for protecting the 3′-OH of the nucleotide in enzymatic reactions. Polymerases have nucleophilic centers (such as amino groups), and the ketone groups were proposed to react with these amino groups of the protein. In fact, this is unlikely (cyclopentanone, for example, does not form appreciable amounts of imine with protein side chains). However, a 3′-keto 2′-deoxyribose unit in a nucleoside is not stable to decomposition via beta elimination reactions, as is well known in the literature studying the mechanism of ribonucleotide reductases.
U.S. Pat. No. 6,664,079 then cited a literature report that 3′-O-allyl-dATP is incorporated by Vent (exo-) DNA polymerase into the growing strand of DNA [Met94]. U.S. Pat. No. 6,664,079 noted that this group, and the methoxymethyl (MOM) group, which has a similar size, might be used to cap the 3′-OH group in a sequencing-by-synthesis format. This patent noted that these groups can be cleaved chemically using transition metal reagents [Ire86][Kam99], or through acidic reagents (for the MOM group).
These suggestions therefore define the invention proposed in U.S. Pat. No. 6,664,079. Briefly, the essence of this invention is an architecture where the triphosphates of four nucleotide analogues, each labeled with a distinctive cleavable tag attached to the nucleobase, and each having the hydrogen of the 3′-OH group replaced by an allyl group or a MOM group, are used as the triphosphates in the sequencing-by-synthesis architecture, and the products are oligonucleotides prepared by polymerase incorporation that have this replacement.
Unfortunately, various other aspects of a practical tool for sequencing using cyclic reversible termination were not anticipated by U.S. Pat. No. 6,664,079, and are not enabled in the prior art. In particular, the cleavage reaction that removes the fluorescent tag may not restore the nucleobase to its natural structure, leaving behind what is known in the literature as a “scar”. It is a question open to experimentation as to whether a primer whose 3′-nucleotide carries a scar will be extended in a template-directed polymerization reaction by an incoming triphosphate that carries both a 3′-O blocking group and a fluorescently tagged nucleobase. While architectures are easily conceived that use mixtures of fluorescently tagged and untagged triphosphates to implement a sequencing using cyclic reversible termination strategy, it would be preferable to identify polymerases that will add a 3′-blocked fluorescently tagged nucleotide to a scarred primer.