The ability to read the genetic code has opened countless opportunities to benefit humankind. Whether it involves the improvement of food crops and livestock used for food, the identification of the causes of disease, the generation of targeted therapeutic methods and compositions, or simply the better understanding of what makes us who we are, a fundamental understanding of the blueprints of life is an integral and necessary component.
A variety of techniques and processes have been developed to obtain genetic information, including broad genetic profiling or identifying patterns of discrete markers in genetic codes and nucleotide level sequencing of entire genomes. With respect to determination of genetic sequences, while techniques have been developed to read, at the nucleotide level, a genetic sequence, such methods can be time-consuming and extremely costly.
Approaches have been developed to sequence genetic material with improved speed and reduced costs. Many of these methods rely upon the identification of nucleotides being incorporated by a polymerization enzyme during a template sequence-dependent nucleic acid synthesis reaction. In particular, by identifying nucleotides incorporated against a complementary template nucleic acid strand, one can identify the sequence of nucleotides in the template strand. A variety of such methods have been previously described. These methods include iterative processes where individual nucleotides are added one at a time, washed to remove free, unincorporated nucleotides, identified, and washed again to remove any terminator groups and labeling components before an additional nucleotide is added. Still other methods employ the “real-time” detection of incorporation events, where the act of incorporation gives rise to a signaling event that can be detected. In particularly elegant methods, labeling components are coupled to portions of the nucleotides that are removed during the incorporation event, eliminating any need to remove such labeling components before the next nucleotide is added (See, e.g., Eid, J. et al., Science, 323(5910), 133-138 (2009)).
In any of the enzyme mediated template-dependent processes, the overall fidelity, processivity and/or accuracy of the incorporation process can have direct impacts on the sequence identification process, e.g., lower accuracy may require multiple fold coverage to identify the sequence with a high level of confidence.
The present invention provides methods, systems and compositions that provide for increased performance of such polymerization based sequencing methods, among other benefits.