1. Field
The present disclosure pertains to the fields of genomics and molecular biology, and, more specifically, to the field of nucleic acid sequencing.
2. Description of Related Art
Genetic information in living organisms is contained in the form of very long nucleic acid molecules such as deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). Naturally occurring DNA and RNA molecules are typically composed of repeating chemical building blocks called nucleotides, which are in turn made up of a sugar (deoxyribose or ribose, respectively), phosphoric acid, and one of four bases: adenine (A), cytosine (C), guanine (G), and either thymine (T) in DNA or uracil (U) in RNA.
The human genome, for example, contains approximately three billion nucleotides of DNA sequence. DNA sequence information can be used to determine genetic characteristics of an individual, including the presence of and/or susceptibility to many common diseases, such as cancer, cystic fibrosis, and sickle cell anemia. Determination of the entire three billion nucleotide sequence of the human genome has provided a foundation for identifying the genetic basis of such diseases. A determination of the sequence of the human genome required years to accomplish. Sequencing the genomes of individuals provides an opportunity to personalize medical treatments. The need for nucleic acid sequence information also exists in clinical applications, such as pathogen detection (the detection of the presence or absence of pathogens or their genetic variants), and in research in environmental protection, food safety, bio-defense, and other areas.
A typical method for nucleic acid sequencing involves producing many copies of a gene, cutting it into overlapping fragments, determining the sequences of individual fragments, collecting the data, and analyzing the data to assemble the sequences of the individual fragments into the sequence of the gene. Due to the large amount of data required to sequence genes and other functional units of DNA, the data analysis and assembly might be performed as a separate process, only after many separate sequencing processes have been completed.
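The assembly step described above, in which the sequences of overlapping fragments are combined into the sequence of the gene, can be illustrated with a minimal sketch. The sketch assumes error-free fragments read from the same strand and uses a simple greedy merge of the pair with the longest suffix-to-prefix overlap. The function names overlap_length and greedy_assemble, the min_overlap parameter, and the example fragments are hypothetical and chosen only for illustration; this is not a description of any particular assembly software, and a greedy merge of this kind can fail on sequences that contain repeats.

```python
def overlap_length(left, right, min_overlap=3):
    """Return the length of the longest suffix of `left` that equals a
    prefix of `right`, or 0 if it is shorter than `min_overlap`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments, min_overlap=3):
    """Merge fragments pairwise, always joining the pair with the longest
    overlap, until no pair overlaps by at least `min_overlap` bases.

    Assumption (illustrative only): fragments are error-free and all come
    from the same strand.
    """
    # Remove exact duplicates, then fragments fully contained in another.
    reads = list(dict.fromkeys(fragments))
    reads = [r for r in reads
             if not any(r != other and r in other for other in reads)]

    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(reads):
            for j, right in enumerate(reads):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # remaining fragments do not overlap; leave them separate
        # Join the two fragments, keeping the overlapping bases only once.
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


if __name__ == "__main__":
    # Hypothetical overlapping fragments of a short made-up sequence.
    example_fragments = ["GTACGTTA", "ATGCGTAC", "GTTAGC"]
    print(greedy_assemble(example_fragments))
```

If the fragments cannot all be joined, the function returns more than one sequence, which corresponds to the situation in which additional fragment data must be collected before the full sequence can be assembled.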