For a particular patient, a doctor may want to analyze one or more particular (target) regions of the patient's genome (e.g., 100-500 bases per region). For example, a specific part of a gene of the patient may be tested for mutations. Because only certain regions are to be analyzed, techniques have been developed for increasing the percentage of genomic segments (e.g., DNA fragments) in a sample that are from the target region(s). Such techniques include amplification and enrichment of a target region.
In amplification, primers that hybridize to a target region are used to amplify genomic segments that have sequences corresponding to the target region. The desired result is that the sample contains many genomic segments of the target region, so that when the genomic segments are sequenced, a high percentage of the reads correspond to the target region. In this way, significant sequencing effort is not wasted on genomic segments from non-target regions of the genome. In enrichment, probes that hybridize to a target region can be used to capture genomic segments that correspond to the target region, thereby also increasing the percentage of reads that correspond to the target region.
However, in both amplification and enrichment, genomic segments from other parts of the genome are still sequenced. As a consequence, current techniques align (map) the reads to the entire genome to ensure accuracy, particularly when a target region is being analyzed for mutations relative to a reference genome. That is, once a sequence read is obtained, the read is compared to the reference genome to find the genomic location that best matches the read. After the reads have been aligned, only the reads that aligned to a target region are analyzed. This alignment to the entire genome is computationally expensive, because every read, including each off-target read, is compared against the whole reference.
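The conventional workflow described above can be illustrated with a minimal sketch. It is a toy model under stated assumptions and not an actual aligner: the reference is a short string, alignment is an ungapped, brute-force mismatch count at every position, and the names `best_match`, `reads_in_target`, and `max_mismatches` are hypothetical. Practical aligners use indexed data structures, but each read is still searched against the entire reference before the target-region filter is applied.

```python
from typing import List, Tuple


def best_match(read: str, reference: str) -> Tuple[int, int]:
    """Return (position, mismatches) of the best ungapped placement of
    `read` on `reference`; the earliest position wins ties.
    Returns (-1, len(read) + 1) if the read is longer than the reference."""
    best_pos, best_mm = -1, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mm = sum(1 for a, b in zip(read, window) if a != b)
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos, best_mm


def reads_in_target(
    reads: List[str],
    reference: str,
    target_start: int,
    target_end: int,
    max_mismatches: int = 2,  # hypothetical quality cutoff
) -> List[Tuple[str, int, int]]:
    """Align every read to the whole reference, then keep only reads whose
    best match overlaps the half-open target region [target_start, target_end)."""
    kept = []
    for read in reads:
        pos, mm = best_match(read, reference)  # whole-reference search
        if pos < 0 or mm > max_mismatches:
            continue  # no acceptable placement anywhere
        if pos < target_end and pos + len(read) > target_start:
            kept.append((read, pos, mm))
    return kept


# Toy data (assumed for illustration only).
REFERENCE = "TTGACCGATAGGCTTACGGATCCATGCAAGTCTGAACGTTAGC"
READS = [
    "GGCTTGCGGATC",  # from the target region, carrying one substitution
    "TCTGAACGTTAG",  # exact copy of an off-target stretch
    "AAAAAAAAAAAA",  # matches nowhere within the mismatch cutoff
]

on_target = reads_in_target(READS, REFERENCE, target_start=8, target_end=24)
fraction = len(on_target) / len(READS)
print(on_target, f"on-target fraction: {fraction:.2f}")
```

In this sketch the cost of `best_match` grows with the length of the reference for every read, whether or not the read ends up in the target region, which is the inefficiency noted above.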
It is therefore desirable to provide improved methods, systems, and apparatuses for analyzing target regions from sequence reads that are more computationally efficient than aligning the reads to the entire genome.