The ability to sequence genomes accurately and rapidly is revolutionizing biology and medicine. The study of complex genomes, and in particular the search for the genetic basis of disease in humans, involves genetic analysis on a massive scale. Such genetic analysis at the whole-genome level is costly not only monetarily but also in time and labor. These costs increase with protocols involving analyses of separate individual DNA samples. Sequencing (and re-sequencing) of polymorphic areas in the genome that are linked to disease development will contribute greatly to the understanding of diseases, such as cancer, and to therapeutic development, and will help meet the pharmacogenomics challenge of identifying the genes and functional polymorphisms associated with variability in drug response. Before associations can be made between a given genotype and a particular disease, screens for numerous genetic markers must be performed on populations large enough to yield statistically significant data.
One way to reduce the costs associated with genome sequencing while retaining the benefits of genomic analysis on a large scale is to perform high-throughput, high-accuracy sequencing on targeted regions of the genome. A widely used approach captures much of the protein-coding region of a genome (the exome), which makes up about 1% of the human genome, and has become a routine technique in clinical and basic research. Exome sequencing offers advantages over whole-genome sequencing: it is significantly less expensive, is easier to interpret functionally, is significantly faster to analyze, makes very deep sequencing affordable, and produces a dataset that is easier to manage. A need exists for methods, systems, and compositions for the enrichment of target regions of interest for high-accuracy, high-throughput sequencing and genetic analysis.