Nucleic acids are chains of linked units called nucleotides. Nucleotides are the structural units of the nucleic acids ribonucleic acid (RNA) and deoxyribonucleic acid (DNA). A nucleotide includes a phosphate group, a sugar (ribose in RNA and deoxyribose in DNA) and a nucleobase. The nucleobases pair between strands of nucleotides to form higher-level structures such as the well-known double helix. The four bases found in DNA are adenine (A), cytosine (C), guanine (G) and thymine (T). In a DNA double helix, each type of nucleobase on one strand normally interacts with just one type of nucleobase on the other strand, which is known as complementary base pairing. Specifically, A pairs only with T and C pairs only with G. RNA contains uracil (U) instead of thymine. Because of the importance of DNA and RNA, knowledge of a DNA or RNA sequence is useful for many purposes, for example to identify, diagnose and develop treatments for pathological, contagious or genetic diseases.
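The complementary pairing rule described above (A with T, C with G, and U in place of T for RNA) can be illustrated with a minimal sketch. The function names `complement`, `reverse_complement` and `dna_to_rna_alphabet` are hypothetical and chosen only for this illustration; the sketch assumes an uppercase or lowercase string containing only the four DNA bases.

```python
# Watson-Crick pairing for DNA as stated in the text: A<->T and C<->G.
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    # A KeyError here signals a character that is not one of A, C, G, T.
    return "".join(DNA_COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the complement read in the opposite direction.

    The two strands of a double helix run antiparallel, so the partner
    strand is conventionally written as the reversed complement.
    """
    return complement(strand)[::-1]


def dna_to_rna_alphabet(strand: str) -> str:
    """Rewrite a DNA sequence in the RNA alphabet (uracil replaces thymine)."""
    return strand.upper().replace("T", "U")


if __name__ == "__main__":
    print(complement("ATCG"))
    print(reverse_complement("ATCG"))
    print(dna_to_rna_alphabet("ATCG"))
```

A dictionary lookup is used here so that any character outside the four DNA bases raises an error instead of passing through silently.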
Nucleic acid sequencing chemistries include sequencing-by-synthesis (SBS) and sequencing-by-ligation (SBL) strategies. These strategies typically use random or ordered two-dimensional (2D) arrays for tracking sequence identity data. The array densities are extremely high, ranging from 10^5 to 10^7 features (or higher for single-molecule detection). As the nucleotide chain grows through the action of the polymerase (SBS) or ligase (SBL), labels are incorporated and detected by readers. When a base or nucleic acid associated with a label is identified, the base is assigned to a feature on the array by capturing a 2D optical image. However, the optical resolution needed to separate spectral data from these high-density arrays requires very long exposure times, resulting in average run times of hours to several days. In addition, the optical images acquired from successive sequencing cycles can easily reach terabytes in size, which places a heavy demand on algorithm computation time and data storage.
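As a rough illustration of how sequence identity can be tracked per array feature across successive cycles, the sketch below accumulates one base call per feature per cycle into a growing read. It is a toy model under stated assumptions, not a description of any particular instrument: the function name `assemble_reads`, the use of `(row, column)` tuples as feature coordinates, and the example calls are all hypothetical, and real systems derive each call from image analysis that is not modeled here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical representation: a feature is identified by its (row, column)
# position on the 2D array.
Feature = Tuple[int, int]


def assemble_reads(cycles: List[Dict[Feature, str]]) -> Dict[Feature, str]:
    """Concatenate per-cycle base calls into one read per array feature.

    Each element of `cycles` maps a feature to the single base identified
    at that feature during that cycle. Features missing from a cycle are
    simply not extended in that cycle.
    """
    reads: Dict[Feature, str] = defaultdict(str)
    for calls in cycles:
        for feature, base in calls.items():
            reads[feature] += base
    return dict(reads)


if __name__ == "__main__":
    # Three illustrative cycles over two features (made-up calls).
    example_cycles = [
        {(0, 0): "A", (0, 1): "G"},
        {(0, 0): "C", (0, 1): "G"},
        {(0, 0): "T", (0, 1): "A"},
    ]
    print(assemble_reads(example_cycles))
```

Because every cycle contributes one entry per feature, the amount of data to store and process grows with both the number of features and the number of cycles, which is consistent with the storage and computation burden noted above.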