1. Field of the Invention
This disclosure is related to the field of devices, methods, systems and processes for capturing and amplifying targeted regions on circulating cell free DNA fragments. Specifically, it relates to capturing and amplifying targeted regions on genomic DNA where the end points of the desired target are unknown, or where a portion of the end points of the desired target is known but it is unknown how much of the end point is present.
2. Description of Related Art
The completion of the decoding of the canonical genome sequences of all major model organisms, as well as the human species, has thrown open the door to elucidating the candidate genes associated with various human diseases. Knowledge of the genetic origins of human disease can be very powerful for understanding these diseases and developing treatments for them. Examples of successful application of the genetic basis of disease in the clinic and clinical research settings include the sequencing of candidate disease loci in targeted populations, such as the Ashkenazi Jews (Weinstein 2007), the sequencing of variants in drug metabolism genes to adjust dosage (Marsh and McLeod 2006), and the identification of genetic defects in cancer that make tumors more responsive to certain types of treatments (Marsh and McLeod 2006). Accordingly, medical re-sequencing of candidate genes in individual samples is becoming increasingly important in clinical settings and in clinical research. Medical re-sequencing requires the amplification and sequencing of many candidate genes in many patient samples. However, fully embracing the promise of the clinical application of genetics-based research necessitates the development of new technology to lower the cost and increase the throughput of medical re-sequencing to make clinical applications more feasible.
As noted in United States Patent Application Publication No.: 2010/0129874, the entirety of which is specifically incorporated herein by reference to the extent not inconsistent with the disclosures of this patent, many of the current methods for analyzing sequence variation in a subset of the human genome generally rely on polymerase chain reaction (“PCR”) to amplify targeted sequences (Greenman, et al. 2007; Sjoblom, et al., 2006; Wood, et al., 2007). However, efforts to multiplex PCR (i.e., target many regions across multiple samples in a single process) have been hampered by a dramatic increase in mispriming events as more primer pairs are used (Fan, et al. 2006). Further, the larger number of primer pairs utilized in multiplex PCR often results in inter-primer interactions that prevent amplification (Han, et al.). Therefore, separate PCRs for each region of interest generally must be performed (Greenman, et al., 2007; Sjoblom, et al., 2006; Wood, et al., 2007). This creates a costly approach when hundreds of individual PCRs must be performed for each sample. Further, these methods often have inherent problems with multiplicity (i.e., the number of independent capture reactions which can be performed simultaneously in a single reaction), specificity (i.e., measured as the fraction of captured nucleic acids that derive from targeted regions), and uniformity (i.e., relative abundance of targeted sequences after selective capture). Ideally, a multiplex PCR model would perform well on each of these performance parameters (multiplicity, specificity and uniformity). As noted in United States Patent Application Publication No.: 2010/0129874, this was not accomplished by the currently utilized systems. Another problem was that currently utilized multiplex PCR methods and systems required a large amount of starting DNA to supply enough template for all of the required individual PCR reactions. This was a problem since DNA can serve as a limiting factor when working with clinical samples.
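By way of illustration only, the three performance parameters named above can be computed from read counts in a short Python sketch. The function names, the example loci, and in particular the choice to express uniformity as the fraction of targets within a two-fold window of the median read count are assumptions made for this sketch and are not taken from the cited publication.

```python
from statistics import median
from typing import Mapping


def multiplicity(reads_per_target: Mapping[str, int]) -> int:
    """Number of target regions captured together in one reaction (panel size)."""
    return len(reads_per_target)


def specificity(on_target_reads: int, total_reads: int) -> float:
    """Fraction of captured reads that derive from targeted regions."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return on_target_reads / total_reads


def uniformity(reads_per_target: Mapping[str, int], fold: float = 2.0) -> float:
    """Fraction of targets whose read count is within `fold` of the median.

    This particular summary statistic is an assumption of the sketch; the
    text only defines uniformity as relative abundance after capture.
    """
    counts = list(reads_per_target.values())
    if not counts:
        raise ValueError("at least one target is required")
    med = median(counts)
    if med == 0:
        return 0.0
    within = [c for c in counts if med / fold <= c <= med * fold]
    return len(within) / len(counts)


# Hypothetical read counts for four illustrative loci.
example_reads = {"locus_A": 100, "locus_B": 120, "locus_C": 30, "locus_D": 90}
example_on_target = sum(example_reads.values())  # 340 on-target reads
example_total = 400  # hypothetical total, including 60 off-target reads

print(multiplicity(example_reads))
print(specificity(example_on_target, example_total))
print(uniformity(example_reads))
```

In this hypothetical data set, one locus falls well below the median, so the uniformity score drops even though specificity remains high, which is why the parameters are reported separately.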
Because of these problems, there was a need in the art for a multiplexed PCR method that simultaneously amplified many targeted regions from a small amount of nucleic acid. United States Patent Application Publication No.: 2010/0129874 disclosed a method for amplifying at least two different nucleic acid sequences utilizing a multiplexed nucleic acid patch PCR which, in part, responded to this need in the art.
In general, the method disclosed in United States Patent Application Publication No.: 2010/0129874 relies on two rounds of target-specific enrichment, with discrete clean-up steps between each round, to confer more specific targeting and amplification than the previously known PCR systems and methodologies. Specifically, the disclosed methods require four oligonucleotide hybridizations per locus, resulting in more specific amplification than standard multiplex PCR, which requires only two hybridizations per locus.
In the first round, targeting primer pairs are designed for each target region (i.e., specific regions of interest within genomic DNA), and a low number of PCR cycles is performed. This low cycle amplification serves two functions: 1) it defines the target regions; and 2) it differentiates the target regions from non-targeted background DNA. The primers utilized in this round are designed to include uracil instead of thymine, and are cleaved and removed by enzymes following the initial amplification. At the end of the first round, the ends of the target region are now internal to the PCR primer sequences.
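The first round can be pictured with a simplified string model, given here as a non-limiting Python sketch. The primer and target sequences are hypothetical, and the removal of the primers is idealized as trimming both primer sites from the top strand of the amplicon; the enzymatic details of the cited method are not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def uracil_substituted(primer: str) -> str:
    """Write a primer with uracil (U) in place of every thymine (T)."""
    primer = primer.upper()
    invalid = set(primer) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected bases in primer: {sorted(invalid)}")
    return primer.replace("T", "U")


def trim_primer_sites(amplicon: str, fwd_primer: str, rev_primer: str) -> str:
    """Idealized primer removal: keep only the region internal to both primers."""
    rev_site = reverse_complement(rev_primer)
    if not (amplicon.startswith(fwd_primer) and amplicon.endswith(rev_site)):
        raise ValueError("amplicon is not flanked by the given primer pair")
    return amplicon[len(fwd_primer):len(amplicon) - len(rev_site)]


# Hypothetical sequences, for illustration only.
fwd_primer = "ACGTTG"
rev_primer = "TTCAGC"
target = "GGGCCCAAA"

# Top strand of the first-round product: forward primer, target,
# then the reverse complement of the reverse primer.
amplicon = fwd_primer + target + reverse_complement(rev_primer)

print(uracil_substituted(fwd_primer))
print(trim_primer_sites(amplicon, fwd_primer, rev_primer))
```

After trimming, the sequence that remains is the target alone, which corresponds to the statement that the ends of the target region lie internal to the primer sequences.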
In the second round, which is a target-specific enrichment, “patch oligonucleotides” are employed. The patch oligonucleotides are comprised of a string of nucleotides of variable length that contains, at a minimum, a sequence that is the reverse complement to at least a portion of a sequence that defines the targeted region. Stated differently, each patch oligonucleotide is designed specifically for the ends of each target region, allowing ligation of universal adapters and a protecting group. The patch oligonucleotides are annealed to the targeted regions and serve as a patch between targeted amplicons and universal primers. This targeting step delivers a higher level of specificity as only targeted regions can anneal with patch oligonucleotides. The universal primers, which anneal to the universal region of the patch oligonucleotides, then ligate to each target amplicon. This reaction is highly specific because thermostable ligases are sensitive to mismatched bases near the ligation junction (Barany 1991). An added level of selectivity is gained by degrading mispriming products as well as the genomic DNA with exonuclease. The selected amplicons are protected from degradation by a 3′ modification on the universal primer. This hybridization and ligation of patch oligonucleotides to primer-depleted amplicons is followed by multi-template PCR amplification with primers corresponding to the universal sequences.
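The bridging role of a patch oligonucleotide can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the sequences are hypothetical, the patch is modeled as the reverse complement of the junction formed by a universal primer followed by the adjacent target end, and annealing is reduced to an exact-match test rather than a thermodynamic model.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_patch(universal: str, target_end: str) -> str:
    """Patch spanning the junction between a universal primer and a target end.

    The patch is the reverse complement of universal + target_end, so it can
    hold the two pieces side by side for ligation.
    """
    return reverse_complement(universal + target_end)


def anneals_perfectly(patch: str, universal: str, amplicon_end: str) -> bool:
    """True only if the patch is an exact match across the whole junction."""
    return reverse_complement(patch) == (universal + amplicon_end).upper()


# Hypothetical sequences, for illustration only.
universal = "AGGTCC"
target_end = "GGGCCC"
off_target_end = "GGGCCA"  # single mismatch next to the ligation junction

patch = design_patch(universal, target_end)

print(patch)
print(anneals_perfectly(patch, universal, target_end))
print(anneals_perfectly(patch, universal, off_target_end))
```

The exact-match test is a stand-in for the mismatch sensitivity of the ligase described above: in this model a misprimed product whose end differs from the designed target end is not joined to the universal primer and so remains unprotected.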
In the second round, the patch oligonucleotides confer additional and very high specificity in targeting regions of interest because the ligation is dependent on sequences immediately internal to the original primers used in the initial low cycle PCR. A further level of specificity is achieved by degrading any misprimed product and genomic DNA. Stated differently, enzymatic digestion is utilized to remove all non-protected DNA including any misprimed segment from the initial limited cycle step. The cleanup ensures that only the targeted regions are loaded onto the next generation sequencer for universal amplification; all off-target amplicons are degraded. Thus, in this process, all of the targeted regions are amplified and enriched simultaneously, in one tube, start to finish.
In sum, the methods disclosed in United States Patent Application Publication No.: 2010/0129874 addressed the need in the art for a multiplexed PCR method with the ability to amplify many targeted regions from a small amount of nucleic acid, allowing for the targeting of many regions across multiple samples, thereby providing an effective solution to maximize throughput capacity of sequencers.
While an advance in the art, the methods disclosed in United States Patent Application Publication No.: 2010/0129874 are still limited. Because these methods are based on defining the ends of nucleic acid sequences, they are generally not applicable in situations where the targeted region is located in fragmented DNA, residual DNA or circulating cell free DNA (circulating cell free DNA is produced through the process of cellular apoptosis and released into circulation). Stated differently, the methods disclosed in United States Patent Application Publication No.: 2010/0129874 are only applicable if the ends of at least two nucleic acid sequences to which patches can be annealed are known. Thus, these methods are not applicable to capturing DNA fragments and circulating cell free DNA, i.e., situations in which the defining ends of the nucleic acid target sequences are unknown. Amongst other applications, capturing DNA fragments and circulating cell free DNA is important to the identification of genetic defects in fetal DNA circulating in maternal blood for the diagnosis of prenatal health issues. See Lo Y M, et al., “Presence of fetal DNA in maternal plasma and serum,” Lancet, 350(9076):485-7 (Aug. 16, 1997); Palomaki G E, et al., “DNA Sequencing of Maternal Plasma to Detect Down syndrome: An International Clinical Validation,” Genet Med, Vol 13: No 11 (November 2011).