High-throughput sequencing has become a central tool in biotechnology and is revolutionizing personalized medicine. Many diseases and disorders are genetic in origin. Acquiring the genomic sequence of individual patients in a comprehensive, rapid, and cost-effective manner enhances the ability of medical professionals to diagnose diseases and to identify predispositions to diseases or other genetically based disorders. Genomic sequence information also enhances the treatment of diseases by providing doctors with information regarding the efficacy of a given therapy for a particular individual.
One approach aimed at efficiently obtaining the complete genomic sequence of an organism is sequencing by incorporation, in which the sequence of nucleotides in a template nucleic acid polymer is determined by identifying each complementary base as it is added to a nascent strand being synthesized against the template sequence. While added bases may be detected through a byproduct of the synthesis or extension reaction, e.g., released pyrophosphate, in many systems and processes the added bases are labeled with fluorescent dyes that permit their detection. By labeling each type of base with a distinguishable fluorescent dye, one attaches a distinctive detectable characteristic to each base that is incorporated, and as a result provides a basis for identification of the incorporated base and, by extension, its complementary base in the template sequence.
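The identification logic described above can be illustrated with a minimal sketch. The dye names and the dye-to-base assignment below are hypothetical placeholders chosen for illustration, not the labeling scheme of any particular chemistry; only the Watson-Crick pairing of DNA bases is taken as given.

```python
# Hypothetical dye-to-base assignment (illustrative only; a real labeling
# chemistry defines its own dyes and channels).
DYE_TO_BASE = {"dye_1": "A", "dye_2": "C", "dye_3": "G", "dye_4": "T"}

# Watson-Crick complements for DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def call_bases(detected_dyes):
    """Translate an ordered list of detected dye labels into two strings.

    Returns (nascent, template):
      nascent  - bases incorporated into the growing strand, in order of
                 incorporation (5' to 3').
      template - the complementary template bases in the same order of
                 incorporation, i.e. read 3' to 5' along the template.
    """
    nascent = "".join(DYE_TO_BASE[dye] for dye in detected_dyes)
    template = "".join(COMPLEMENT[base] for base in nascent)
    return nascent, template


if __name__ == "__main__":
    nascent, template = call_bases(["dye_1", "dye_3", "dye_3", "dye_4"])
    print(nascent, template)
```

Because each dye maps to exactly one base, and each base to exactly one complement, a detected dye identifies both the incorporated base and the template position it pairs with.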
During sequencing by incorporation, nucleotide (or nucleotide analog) incorporation events are detected in real time as the bases are incorporated into the extension product. This can be accomplished by immobilizing the polymerase/template/primer complex within an optically confined space or by otherwise resolving it as an individual molecular complex. Some sequencing by incorporation methods employ nucleotide analogs that include fluorescent labels coupled to the polyphosphate chain of the analog, which are then exposed to the complex. Upon incorporation, the nucleotide, along with its fluorescent label, is retained by the complex for a time and in a manner that permits the detection of a signal “pulse” from the fluorescent label at the incorporation site. Upon completion of incorporation, all of the polyphosphate chain except the alpha phosphate group is cleaved away, which liberates the label from retention by the complex and allows it to diffuse away, ending the signal from that label.
Thus, during an incorporation event, a complementary nucleotide analog, including its fluorescent label, is effectively “immobilized” for a time at the incorporation site, and the fluorescent label is subsequently released and diffuses away when incorporation is completed. Detecting the localized “pulses” of fluorescent tags immobilized at the incorporation site, and distinguishing those pulses from a variety of other signals and background noise, allows bases to be called in real time as they are incorporated. Further details regarding base calling during sequencing by incorporation methods are found in Tomaney et al., PCT Application Serial No. PCT/US2008/065996, METHODS AND PROCESSES FOR CALLING BASES IN SEQUENCING BY INCORPORATION METHODS, incorporated herein by reference in its entirety for all purposes.
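As a rough illustration of distinguishing pulses from background, the following sketch applies a simple intensity threshold with a minimum duration to a single-channel trace. It is not the base-calling method of the referenced application; the function name, the threshold, and the minimum frame count are assumptions made for this example.

```python
def find_pulses(trace, threshold, min_frames=2):
    """Find candidate pulses in a single-channel intensity trace.

    A pulse is a run of at least `min_frames` consecutive frames whose
    intensity exceeds `threshold`. Returns a list of (start, end) frame
    indices, with `end` exclusive. Shorter runs are treated as noise.
    """
    pulses = []
    start = None  # index where the current above-threshold run began
    for i, value in enumerate(trace):
        if value > threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run just ended; keep it only if it lasted long enough.
            if i - start >= min_frames:
                pulses.append((start, i))
            start = None
    # Handle a run that is still open at the end of the trace.
    if start is not None and len(trace) - start >= min_frames:
        pulses.append((start, len(trace)))
    return pulses


if __name__ == "__main__":
    # Illustrative values only: two sustained pulses and one one-frame spike.
    trace = [1, 1, 9, 9, 9, 1, 8, 1, 1, 9, 9]
    print(find_pulses(trace, threshold=5))
```

In this sketch the single-frame spike at index 6 is discarded, which shows the trade-off inherent in any duration filter: rejecting brief noise also rejects genuine events in which the label is retained only briefly.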
Current real-time sequencing by incorporation methods may exhibit sub-optimal reliability and accuracy due to missed signal pulses, which contribute errors to sequencing reads. Missed pulses derive from, e.g., insufficient residence time of the analogs at the active site of the polymerase, or from nucleotide analogs that are unlabeled or carry a broken fluorophore. Compositions and methods for improving the reliability and accuracy of sequencing by incorporation are therefore desirable.
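The effect of missed pulses on a read can be sketched with a toy simulation in which each incorporation event independently goes undetected with some probability, so that the corresponding base is absent from the read. The per-event miss probability and the independence assumption are hypothetical simplifications for illustration, not measured properties of any sequencing system.

```python
import random


def simulate_read(true_bases, miss_prob, seed=None):
    """Simulate a read in which some incorporation events are missed.

    Each base in `true_bases` is dropped independently with probability
    `miss_prob` (a hypothetical parameter), modeling a pulse that was too
    brief to detect or a nucleotide analog lacking a working label.
    The result is the original sequence with deletions.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    return "".join(b for b in true_bases if rng.random() >= miss_prob)


if __name__ == "__main__":
    truth = "ACGTACGTACGT"
    read = simulate_read(truth, miss_prob=0.2, seed=0)
    print(truth)
    print(read, f"({len(truth) - len(read)} bases missing)")
```

Under this model every missed pulse appears as a deletion relative to the true sequence, so reducing the miss probability directly reduces this class of read error.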