The human haploid genome contains 3 billion base pairs packaged in 23 chromosomes, and the diploid genome has 6 billion base pairs in 23 pairs of chromosomes. The rapidity and convenience of modern sequencing technology enable many diagnostic questions to be approached using high-throughput sequencing of an individual's entire genome or of the full quantity of DNA in a sample. However, for many DNA diagnostics applications, it is only necessary to investigate a subset of the genome, focussing on the region or regions known to be associated with the particular disorders under investigation.
A number of techniques have been described for reducing the complexity of the genome before analysis. Where only a single, short region of the genome needs to be analysed, this may be done using straightforward PCR to amplify the sequence using primers to known regions on either side. However, when it is desired to amplify many regions of a genomic sample for analysis, amplification artefacts can arise as a result of performing multiple different amplifications together in the same reaction mixture.
WO2003/044216 (Parallele Bioscience, Inc.) and US20090004701A1 (Malek Faham) described a method of multiplex amplification of target nucleic acids, in which common oligonucleotide primers were ligated to sites internal to single-stranded nucleic acid fragments. The common priming sites were appended to each of a plurality of different target sequences to allow their stoichiometric amplification.
WO2005/111236 (Olink AB) also described a method of identifying sequences in the human genome by amplifying specific target sequences. The method involved fragmenting the genomic sample into fragments having at least one defined end sequence. Selector constructs, all comprising a primer pair motif, were brought in contact with the fragments. After ligation, the selected target sequences were amplified in parallel using a primer-pair specific for the primer-pair motif common to the selectors. The selector constructs described in WO2005/111236 had a long oligonucleotide hybridised to a short oligonucleotide, each selector construct having one or two protruding ends complementary to a defined end sequence of a fragment containing the target sequence. Contacting the selectors with the target fragments resulted in hybridisation of the target fragment between protruding ends of the selector or selectors. In the case of a single selector with two protruding ends this hybridisation produced a circularised construct. In the case of a pair of selectors each with one protruding end this formed a linear construct. Ligation and sequencing of the selector constructs containing the target fragments allowed the target sequence to be determined. Since the selector constructs hybridise only to the end portions of the fragment containing the target sequence (or to one end portion and one internal portion), the method allowed selection of target sequences that differed in the non-hybridising portions, so that each selector molecule could hybridise to a variety of different target sequences. The identity of the exact target was then determined by amplifying and sequencing the constructs. WO2005/111236 proposed using the selectors in methods of analysing genetic variability or for DNA copy number measurements.
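The end-based selection described above can be illustrated with a minimal sketch. It assumes fragments are represented as plain strings read 5' to 3', and that a single selector is designed against two defined end sequences; the function name `selected_fragments` and the example sequences are hypothetical and are not taken from WO2005/111236. Hybridisation is reduced to exact matching of the fragment ends, which is a simplification.

```python
def selected_fragments(fragments, left_end, right_end):
    """Return the fragments whose two ends match the selector's end sequences.

    Only the end portions are inspected, so fragments that differ in the
    interior (non-hybridising) portion are all selected by the same selector.
    """
    min_length = len(left_end) + len(right_end)
    return [
        fragment
        for fragment in fragments
        if len(fragment) >= min_length
        and fragment.startswith(left_end)
        and fragment.endswith(right_end)
    ]


# Hypothetical fragments: the first two share both ends but differ internally.
fragments = ["GATCAAAATTCG", "GATCCCCCTTCG", "GATCAAAATTTT"]
print(selected_fragments(fragments, "GATC", "TTCG"))
```

In this sketch the first two fragments are selected even though their interiors differ, which is why the exact identity of each selected target would still have to be established by sequencing.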
GB2492042 described a variation of the selector method, in which the fragments were contacted with a partially double-stranded probe comprising a selector oligonucleotide and at least one vector oligonucleotide. The selector oligonucleotide contained two non-adjacent regions specific for the target fragment and a non-target specific region which comprised at least two binding sites for the vector oligonucleotide. The vector oligonucleotide was not complementary to the target sequence, and included a nucleotide sequence complementary to the vector-binding site on the selector oligonucleotide. The vector oligonucleotide also contained elements for detection/enrichment. In the method, complementary portions of the probe oligonucleotides were hybridised to the target fragment, followed by ligating the vector oligonucleotide(s) and target to produce a probe-target fragment hybrid, which was then detected.
A development of the selector technology was described in WO2011/009941 (Olink Genomics AB), describing ligation of one end of a fragment of digested genomic DNA to a probe. Compared with the earlier selector probes, which involved binding to two regions of the target fragment and where the sequence to be isolated was typically bounded by two regions of known sequence, the probes in WO2011/009941 were described for use where there was only one known region of sequence. Some embodiments of the probes in WO2011/009941 contained elements for immobilisation to a solid phase. Ligation of the target nucleic acid fragment to the probe resulted in a stable capture of the target fragment and allowed the use of highly stringent washing steps to remove non-ligated fragments, resulting in a high specificity.
Also known are padlock probes. Padlock probes are linear oligonucleotides with target-complementary sequences at the ends and a non-target-complementary sequence in between. When hybridised to the correct target DNA sequence, the two ends of the probe are brought together head to tail and can be joined by DNA ligase. Ligation is inhibited by mismatches at the ligation junction, so successful ligation of the padlock probe depends on highly specific hybridisation to the target sequence, allowing the probe to distinguish between highly similar target sequences and selectively padlock its exact target. As a consequence of the helical nature of double-stranded DNA, the circularised probe molecule is catenated to the target DNA strand.
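The dependence of padlock ligation on a perfect match at the junction can be sketched as follows. This is a simplified model under stated assumptions, not a description of any particular probe design: sequences are plain strings read 5' to 3', hybridisation is reduced to counting mismatches, and the names `ligatable_sites` and `max_arm_mismatches` are hypothetical. The two probe arms, once hybridised head to tail, cover a contiguous footprint on the target equal to the reverse complement of the 3' arm followed by the 5' arm.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def ligatable_sites(arm5, arm3, target, max_arm_mismatches=0):
    """Return target positions at which the probe ends could be ligated.

    arm5 and arm3 are the 5' and 3' target-complementary arms of the probe.
    A site qualifies only if the two target bases flanking the nick are
    perfectly matched (assumption: any junction mismatch blocks ligation)
    and the total number of mismatches does not exceed max_arm_mismatches.
    """
    footprint = reverse_complement(arm3 + arm5)
    nick = len(arm5)  # junction lies between footprint[nick - 1] and footprint[nick]
    sites = []
    for start in range(len(target) - len(footprint) + 1):
        window = target[start:start + len(footprint)]
        mismatches = [i for i, (a, b) in enumerate(zip(window, footprint)) if a != b]
        junction_matched = (nick - 1) not in mismatches and nick not in mismatches
        if junction_matched and len(mismatches) <= max_arm_mismatches:
            sites.append(start)
    return sites


# Hypothetical probe arms designed against the target region ACCGT|AAGCT.
arm5, arm3 = "ACGGT", "AGCTT"
print(ligatable_sites(arm5, arm3, "TTGACCGTAAGCTTGG"))     # perfect match: [3]
print(ligatable_sites(arm5, arm3, "TTGACCGAAAGCTTGG", 1))  # junction mismatch: []
print(ligatable_sites(arm5, arm3, "TTGGCCGTAAGCTTGG", 1))  # distal mismatch tolerated: [3]
```

The second call shows the point made in the text: a single mismatch at the junction prevents circularisation even when one mismatch is otherwise tolerated, so a closely related sequence is not padlocked.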
It was known to amplify the circularised padlock probes using rolling circle replication, also known as rolling circle amplification. Rolling circle replication was described in U.S. Pat. No. 5,854,033 (Lizardi). Rolling circle replication is an amplification of a circular nucleic acid molecule using a strand-displacing DNA polymerase, resulting in large DNA molecules containing tandem repeats of the amplified sequence. The DNA polymerase catalyses primer extension and strand displacement in a processive rolling circle polymerisation reaction that proceeds as long as desired. It results in an amplification of the circularised probe sequence that is orders of magnitude higher than that achieved by a single cycle of PCR or of other amplification techniques in which each cycle is limited to a doubling of the number of copies of a target sequence. Additional amplification can be obtained using a cascade of strand displacement reactions.
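The comparison between rolling circle replication and a single PCR cycle can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a constant polymerase speed and an uninterrupted reaction; the function names and the numerical values (circle length, polymerisation rate, reaction time) are illustrative assumptions only, not figures taken from the cited documents.

```python
def rca_repeats(circle_length_nt, polymerase_rate_nt_per_s, seconds):
    """Tandem repeats produced from one circle by one polymerase.

    Assumes the polymerase travels around the circle at a constant rate
    without dissociating for the whole reaction time.
    """
    return polymerase_rate_nt_per_s * seconds / circle_length_nt


def pcr_copies(cycles, efficiency=1.0):
    """Copies per starting template after the given number of PCR cycles.

    With perfect efficiency each cycle at most doubles the copy number.
    """
    return (1 + efficiency) ** cycles


# Illustrative values only: a 100 nt circle, 25 nt/s, one hour of reaction.
print(rca_repeats(100, 25, 3600))  # 900.0 repeats from a single priming event
print(pcr_copies(1))               # 2.0 copies after a single PCR cycle
```

Under these assumed values a single rolling circle reaction yields hundreds of tandem copies, whereas one PCR cycle can at most double the number of copies.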
Fredriksson et al. (Nucleic Acids Res. 35(7):e47 2007) described “Gene-Collector”, a method for multiplex amplification of nucleic acids using collector probes which contain adjacent sequences complementary to the cognate primer end sequences of desired PCR products, so that binding of the collector probes to the PCR products brings the ends of the PCR products together to form a DNA circle. Universal amplification is then performed using rolling circle amplification to generate a final product of concatamers of target sequences. This method allows the correct amplicons in a multiplex PCR reaction to be selectively detected, because the end sequences of the correct amplicons are a cognate primer pair and are circularised by the collector probe, whereas PCR artefacts combining a primer from one pair with a primer from another pair are not circularised.
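The selectivity of this approach for correct amplicons can be sketched in a few lines. The model below is an assumption-laden simplification: each PCR product is reduced to the pair of primers found at its two ends, and a product is treated as circularisable only if a collector probe exists for exactly that cognate pair. The function name `circularisable` and the primer labels are hypothetical.

```python
def circularisable(amplicons, cognate_pairs):
    """Return the amplicons whose two end primers form a cognate pair.

    amplicons and cognate_pairs are lists of (forward_primer, reverse_primer)
    tuples. Products that combine primers from different pairs have no
    matching collector probe and are therefore not circularised.
    """
    cognate = set(cognate_pairs)
    return [amplicon for amplicon in amplicons if amplicon in cognate]


# Hypothetical two-plex reaction with one cross-pair artefact.
cognate_pairs = [("A_fwd", "A_rev"), ("B_fwd", "B_rev")]
amplicons = [("A_fwd", "A_rev"), ("A_fwd", "B_rev"), ("B_fwd", "B_rev")]
print(circularisable(amplicons, cognate_pairs))
```

Only the two correct products are retained in this sketch; the artefact carrying a forward primer from one pair and a reverse primer from another is excluded from the subsequent rolling circle amplification.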