The advent of large-scale parallel nucleic acid sequencing has made the identification of sequence variation within complex populations feasible. Rolling circle amplification (RCA), an amplification process that uses a polymerase with strand-displacement activity, has emerged as a useful alternative and supplement to polymerase chain reaction (PCR) procedures for preparing nucleic acids for sequencing analysis. RCA involves growing a polynucleotide with a repetitive sequence by continuously adding nucleotides to a primer annealed to a circular polynucleotide template, such as a circular DNA template. This extension process can cover the entire length of the circular polynucleotide template multiple times, resulting in repeated copies of the template sequence, referred to as concatemers. Concatemers can also serve as templates to generate further amplification products. This extension, however, proceeds only until the terminus of the linear concatemer is reached. As the front of the growing polynucleotide strand encounters a double-stranded portion of DNA, the growing strand displaces the existing strand from the template. The result is often the formation of double-stranded DNA of various lengths, each consisting of a variable number of repeats of the template sequence. In conventional RCA methods, short concatemers are often amplified disproportionately compared to longer concatemers that contain many repeats of a target sequence. Subsequent analysis of the longer concatemers may therefore be more difficult.
Large-scale parallel sequencing has a significant limitation: the inherent error frequency of commonly used techniques is larger than the frequency of many of the actual sequence variations in the population. For example, error rates of 0.1-1% have been reported in standard high-throughput sequencing. Detection of rare sequence variants therefore suffers from high false positive rates when the variant frequency is low, such as at or below the error rate.
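The arithmetic behind this limitation can be illustrated with a minimal sketch. It is not part of any described method; the function names, the read depth, and the simplifying assumption that every erroneous base call at a position mimics the variant allele (a worst case) are all hypothetical choices made for illustration.

```python
def expected_read_counts(depth, error_rate, variant_freq):
    """Return (expected true variant reads, expected erroneous reads) at one position.

    Assumptions (illustrative only): errors are independent across reads, and
    every erroneous call on a non-variant read looks like the variant allele.
    """
    true_reads = depth * variant_freq
    error_reads = depth * (1.0 - variant_freq) * error_rate
    return true_reads, error_reads


def fraction_of_variant_reads_real(depth, error_rate, variant_freq):
    """Fraction of reads showing the variant allele that come from a real variant."""
    true_reads, error_reads = expected_read_counts(depth, error_rate, variant_freq)
    return true_reads / (true_reads + error_reads)


if __name__ == "__main__":
    # Hypothetical depth of 10,000 reads; error rates taken from the 0.1-1% range.
    for error_rate, variant_freq in [(0.01, 0.001), (0.001, 0.001), (0.001, 0.01)]:
        frac = fraction_of_variant_reads_real(10_000, error_rate, variant_freq)
        print(f"error={error_rate:.3f} variant={variant_freq:.3f} real fraction={frac:.2f}")
```

Under these assumptions, when the variant frequency sits below the error rate, most reads that appear to carry the variant are sequencing errors, which is why calls at such frequencies are dominated by false positives.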
The ability to detect rare sequence variants is pivotal for a variety of reasons. For example, detection of rare characteristic sequences can be used to identify and distinguish harmful environmental contaminants, such as bacterial taxa. A common way of characterizing bacterial taxa is to identify differences in a highly conserved sequence, such as an rRNA sequence. However, typical sequencing-based approaches to date are challenged by the sheer number of different genomes in a given sample and the degree of homology between members, which presents a complex problem for already laborious procedures.
Existing techniques for detecting sequence variations are particularly ineffective in detecting fusion gene variations and chromosome rearrangements. Often the ‘partner’ gene fused with the rearranged gene is not known, which makes detection challenging. Fusion genes may also be difficult to detect if the junction site is not observed.