Information about the genotype of a subject is becoming more important and relevant for a range of healthcare decisions as the genetic basis for many diseases, disorders, and physiological characteristics is further elucidated. Medical advice is increasingly personalized, with individual decisions and recommendations being based on specific genetic information.
For cost-effective and reliable medical and reproductive counseling on a large scale, it is important to be able to correctly and unambiguously identify the allelic status for many different genetic loci in many subjects. Numerous technologies have been developed for detecting and analyzing nucleic acid sequences from biological samples. A commonly used analysis technology is sequencing. Massively parallel DNA sequencing technologies have greatly increased the ability to generate large amounts of sequencing data at a rapid pace.
As sequencing has increased the ability to probe many genomic loci at once, molecular protocols have been developed to selectively enrich for loci of interest. One such protocol uses molecular inversion probes. A molecular inversion probe is composed of a common linker sequence and two unique targeting arms that hybridize to genomic regions flanking a target. In a capture protocol, probes are tiled across a region of a nucleic acid template to ensure overlapping coverage. After hybridization, polymerase fills in the gap between the two targeting arms and ligase closes each probe into a circle. Following circularization of the probes, the remaining linear (un-captured) genomic DNA is digested away with exonuclease, leaving only the captured targets within the circularized probes. The circularized probes are then sequenced, the sequence data are assembled, and the assembled sequence is analyzed for mutations.
A problem with tiling is that multiple probes contain a portion of the same sequence on the same nucleic acid strand, and therefore compete with each other to bind the same region on the same strand. That competition results in fewer capture events per targeted genomic region and thus decreases capture efficiency.
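The overlap that causes this competition can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which each probe is reduced to the interval of template it spans, from the outer edge of one targeting arm to the outer edge of the other. The names `Probe`, `tile_probes`, and `competing_pairs`, and the coordinates used, are hypothetical and chosen only for illustration; they do not describe any particular probe design.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Probe:
    start: int   # first template base spanned by the probe (0-based, inclusive)
    end: int     # one past the last base spanned (exclusive)
    strand: str  # "+" or "-": the template strand the probe hybridizes to


def tile_probes(region_start, region_end, footprint, step, strand="+"):
    """Tile probes of a fixed footprint across [region_start, region_end).

    A step smaller than the footprint gives overlapping coverage.
    The last probe may extend past region_end.
    """
    probes = []
    pos = region_start
    while pos < region_end:
        probes.append(Probe(pos, pos + footprint, strand))
        pos += step
    return probes


def competing_pairs(probes):
    """Return pairs of probes on the same strand whose footprints overlap.

    Assumption: any overlap on the same strand is treated as potential
    competition; real binding behavior depends on arm placement and chemistry.
    """
    return [
        (a, b)
        for a, b in combinations(probes, 2)
        if a.strand == b.strand and a.start < b.end and b.start < a.end
    ]


# Three 150-base probes stepped every 100 bases on one strand:
# (0-150), (100-250), (200-350). Each neighboring pair overlaps.
probes = tile_probes(0, 300, footprint=150, step=100)
print(len(competing_pairs(probes)))  # prints 2
```

In this toy model every pair of neighboring probes in the tiling shares template sequence on the same strand, so the number of potentially competing pairs grows with the density of the tiling.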