Sequencing by synthesis methods are commonly used in next generation sequencing (NGS) technologies. Nucleotide strands complementary to a target polynucleotide fragment are extended by incorporation of nucleotides (e.g. dNTPs) by a polymerase enzyme, and the incorporation is detected, for example by fluorescence or by detection of hydrogen ions released during polymerisation. The latter technique is used in ion semiconductor sequencing methods. Incorporation of a given dNTP into a strand indicates that the complementary nucleotide is present at that position in the template strand.
In some techniques, different nucleotides are given different detectable labels, so that the specific nucleotide incorporated can be determined. However, an alternative approach is simply to add a single type of nucleotide at a time to the polymerase reaction; if incorporation of the nucleotide is detected, then the complementary nucleotide in the template strand is known. Typically, a sequencing reaction will cycle through all four nucleotides in a fixed order, and repeat this cycle for the duration of the sequencing. This approach imposes time limitations on the process: the cycle must be repeated multiple times in order to obtain the sequence and, depending on the order of nucleotides in the template strand, as many as four nucleotide flows may be necessary to obtain information on a single base.
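By way of illustration only, the fixed cyclic flow described above can be sketched in Python. The function name `simulate_flows`, the default flow order "ACGT", and the convention of passing the sequence of the growing strand (that is, the complement of the template) are assumptions made for this sketch; it also assumes that a flow extends through an entire run of identical bases, as is the case for unlabelled, unterminated nucleotides.

```python
from itertools import cycle


def simulate_flows(growing_strand, flow_order="ACGT"):
    """Simulate single-nucleotide flows repeated in a fixed cyclic order.

    growing_strand: sequence of the strand being synthesised
                    (the complement of the template), e.g. "TTGA".
    Returns a list of (nucleotide, bases_incorporated), one entry per flow,
    stopping once the strand is fully extended.
    """
    # Guard against bases that no flow could ever incorporate,
    # which would otherwise make the loop run forever.
    if not set(growing_strand) <= set(flow_order):
        raise ValueError("strand contains a base absent from the flow order")

    pos = 0
    flows = []
    for nuc in cycle(flow_order):
        if pos >= len(growing_strand):
            break
        incorporated = 0
        # Assumption: one flow extends through a whole homopolymer run.
        while pos < len(growing_strand) and growing_strand[pos] == nuc:
            pos += 1
            incorporated += 1
        flows.append((nuc, incorporated))
    return flows


flows = simulate_flows("TTGA")
wasted = sum(1 for _, n in flows if n == 0)
print(len(flows), "flows in total, of which", wasted, "gave no incorporation")
```

In this toy example, reading the four-base strand "TTGA" takes nine flows, six of which produce no signal, which illustrates the time cost of a fixed cycle.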
However, for many applications of sequencing by synthesis technology, the expected sequence of the template is known, or at least partially known. For example, a patient sample may be analysed for the presence of a suspected pathogen, whereby a sequence diagnostic for a given pathogen is detected. In this example, the sequence to be detected is already known. As a further example, variants in certain gene sequences may be detected in order to determine the presence or absence of a given polymorphism or mutation. Here again, at least a portion of the sequence is known. In certain applications of sequencing by synthesis, polynucleotide fragments are prepared for sequencing by ligating or otherwise incorporating adapters of known sequence, to which sequencing primers can bind. At least this part of the sequencing reaction may benefit from knowledge of the region to be sequenced.
US2014/0031238 describes the use of an alternate nucleotide flow ordering which is not simply a continuous repeat of all four nucleotides. This alternate ordering is said to address potential problems with loss of phasic synchrony resulting from incomplete extension. There is no suggestion in that document that the order of nucleotide flow may be modified by taking advantage of the existence of known sequences.
It would be advantageous to provide a sequencing by synthesis method whereby the order of nucleotide flow may be improved or optimised. In certain embodiments this is achieved using a priori knowledge of the sequence to be detected. In other embodiments, likely candidate sequences may be selected, and the nucleotide flow determined based on the likelihood of certain sequences being present. In yet further embodiments, a feedback mechanism may be used to modify the nucleotide flow during sequencing by synthesis.
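Purely as a hedged illustration of how a priori knowledge of an expected sequence might shorten a flow schedule, the following Python sketch derives one flow per run of identical bases in the expected growing strand and then applies that schedule to an actual strand. It is not presented as the method of any particular embodiment; the function names `flows_from_expected` and `run_flows` and the example sequences are hypothetical, and a flow is again assumed to extend through a whole run of identical bases.

```python
from itertools import groupby


def flows_from_expected(expected_strand):
    """Return one flow per homopolymer run of the expected growing strand."""
    return [nuc for nuc, _run in groupby(expected_strand)]


def run_flows(actual_strand, flows):
    """Apply a finite list of flows; return bases incorporated per flow."""
    pos = 0
    signals = []
    for nuc in flows:
        incorporated = 0
        # Assumption: one flow extends through a whole homopolymer run.
        while pos < len(actual_strand) and actual_strand[pos] == nuc:
            pos += 1
            incorporated += 1
        signals.append(incorporated)
    return signals


expected = "TTGAC"                        # hypothetical expected sequence
schedule = flows_from_expected(expected)  # four flows: T, G, A, C

# Every flow gives a signal when the sample matches the expectation.
print(run_flows("TTGAC", schedule))
# A flow with no signal shows that the sample departs from the expectation.
print(run_flows("TTCAC", schedule))
```

With the matching strand every flow is productive, so five bases are read in four flows. With the hypothetical variant "TTCAC" the second flow gives no incorporation, which is the kind of observation a feedback mechanism could use to revise the remaining flows.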