Genetic variation underlies many aspects of disease, and its measurement is important to several fields of research. For example, counting de novo variants in humans, that is, variants not present in their parents, has led to new insights into the rate at which our species can evolve. Counting genetic or epigenetic changes in tumors can inform fundamental issues in cancer biology. Variations lie at the core of current problems in managing patients with viral diseases such as AIDS and hepatitis by virtue of the drug resistance they can cause. Detection of donor DNA in the blood of organ transplant patients is an important indicator of graft rejection, and detection of fetal DNA in maternal plasma can be used for noninvasive prenatal diagnosis. In neoplastic diseases, which are all driven by somatic variation, the applications of rare variant detection are manifold: it can be used to help identify residual disease at surgical margins or in lymph nodes, to follow the course of therapy when assessed in plasma, and to identify patients with early, surgically curable disease when evaluated in stool, sputum, plasma, and other bodily fluids.
There is a distinct advantage in being able to detect variation that is associated with a disease or condition and that occurs at a very low frequency. In cancer, for example, the early stages, which are the most treatable, present only a very low frequency of variation that could be detected in a sample (e.g., a tissue biopsy or a liquid biopsy such as a blood draw). The problem is compounded when dealing with degraded nucleic acid, such as that found in formalin-fixed, paraffin-embedded (FFPE) tissue. In those samples, variation that exists at a low frequency in the original sample may have its numbers further reduced by degradation, leaving even fewer copies of the nucleic acid available for detection.
Methods of sequencing and identifying genetic variations in samples are becoming commonplace. However, standard sequencing approaches are not ideally suited to detecting rare variants because of the limits of detection associated with available sequencing platforms. Rare variants can occur at a rate of 1% or less in a sample, which is below the limit of detection of a sequencing platform: platforms typically have an accuracy of no greater than about 99%, and many require significant bioinformatic correction to achieve even that. Thus it is generally appreciated that, for rare variants that occur at less than 1%, there is a strong likelihood that the variation either is not identified or is identified but cannot be distinguished from experimental error and background noise of the system.
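The arithmetic behind this limit can be illustrated with a short Python sketch. It is an illustration only, not a description of any particular platform: the read depth of 1,000, the 0.5% variant fraction, and the simplifying assumption that every sequencing error at a position produces the same alternate base (a worst case) are all chosen here for the example, and the function names are hypothetical.

```python
from math import comb


def expected_counts(depth: int, variant_fraction: float, error_rate: float):
    """Expected reads supporting a variant at one position.

    Returns (reads from true variant molecules, reads from sequencing error).
    Assumes, as a worst case, that every error yields the same alternate base.
    """
    true_reads = depth * variant_fraction
    error_reads = depth * (1.0 - variant_fraction) * error_rate
    return true_reads, error_reads


def binom_tail(n: int, p: float, k: int) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed from the exact PMF."""
    if k <= 0:
        return 1.0
    below_k = sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k))
    return 1.0 - below_k


if __name__ == "__main__":
    depth = 1000             # assumed read depth at the position
    variant_fraction = 0.005 # assumed rare variant at 0.5%
    error_rate = 0.01        # about 99% accuracy, as stated in the text

    true_reads, error_reads = expected_counts(depth, variant_fraction, error_rate)
    print(f"expected true variant reads: {true_reads:.2f}")
    print(f"expected error reads:        {error_reads:.2f}")

    # Chance that error alone produces at least as many variant-supporting
    # reads as the true variant is expected to contribute.
    k = round(true_reads)
    print(f"P(error reads >= {k}): {binom_tail(depth, error_rate, k):.3f}")
```

Under these assumptions the expected number of error-derived reads exceeds the expected number of true variant reads, so a simple count threshold cannot separate the two, which is the situation described above for variants occurring at less than 1%.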