Nucleic acid sequences encode the necessary information for living things to function and reproduce, and are essentially a blueprint for life. Determining such sequences is therefore a tool useful in pure research into how and where organisms live, as well as in applied sciences such as drug development. In medicine, sequencing tools can be used for diagnosis and to develop treatments for a variety of pathologies, including cancer, heart disease, autoimmune disorders, multiple sclerosis, and obesity. In industry, sequencing can be used to design improved enzymatic processes or synthetic organisms. In biology, such tools can be used to study the health of ecosystems, for example. Sequencing tools thus have a broad range of utility.
An individual's unique DNA sequence provides valuable information concerning their susceptibility to certain diseases. The sequence will provide patients with the opportunity to screen for early detection and to receive preventative treatment. Furthermore, given a patient's individual blueprint, clinicians will be capable of administering personalized therapy to maximize drug efficacy and to minimize the risk of an adverse drug response. Similarly, determining the blueprint of pathogenic organisms can lead to new treatments for infectious diseases and more robust pathogen surveillance. Whole-genome DNA sequencing will provide the foundation for modern medicine. Sequencing of a diploid human genome requires determining the sequential order of approximately 6 billion nucleotides. Sequencing of RNA can also provide valuable information relating to which portions of the genome are being expressed by single cells or groups of cells. Greater knowledge of expression can provide keys to understanding and treating many diseases and conditions, including providing a molecular-level understanding of the progression of cancer.
A variety of methods have been developed with the goal of providing efficient, cost-effective, accurate, and high-throughput sequencing. Single-molecule nucleic acid sequencing-by-synthesis is a sequencing method that has the potential to revolutionize the understanding of biological structure and function. While such sequencing methods have been shown to provide reliable sequencing information, further improvements in the quality of sequencing information are desired. For example, in current sequencing-by-synthesis methods, errors in sequencing can occur that lead to incorrect base calling. The present invention provides systems, compositions, and methods for improving the quality of nucleic acid sequence information.