The field of the invention relates to methods for sequencing polynucleotide samples. In particular, the field of the invention relates to methods for identifying and adjusting for bias in sequencing of polynucleotide samples. The methods may be adapted for identifying and adjusting for bias in sequencing methods utilized for diagnosing, prognosing, and treating patients having a disease or disorder.
Over the last thirty years, detection of genetic mutations and epigenetic modifications has emerged as an important clinical tool in medicine. Mutations and epigenetic changes, such as methylation, have been detected using methods that utilize restriction enzymes (e.g., methylation-specific digital karyotyping (MSDK) and combined bisulfite restriction analysis (COBRA)), the polymerase chain reaction (PCR) (e.g., methylation-specific PCR (MSP), HeavyMethyl PCR, and MethyLight PCR), hybridization (e.g., epi-microarrays and bead-arrays), and DNA sequencing (e.g., clonal (pyro/Sanger) sequencing or synthesis-type sequencing). In particular, advances in DNA sequencing technology that have reduced the cost of DNA sequencing have allowed comprehensive investigation of the genetics and epigenetics of diseases.
However, because of various challenges, genome-wide sequencing using enrichment sequencing has been performed mainly in research settings and has not been adapted for clinical diagnostics. For example, the challenges for genome-wide bisulfite sequencing (BSS) include the fact that clinical samples are highly heterogeneous and contain alleles having varying degrees of methylation. Detecting methylation at a given position (or the lack of methylation at a given position) with a sufficient degree of sensitivity and specificity requires a relatively large sample size. Therefore, new methods of DNA sequencing that improve the sensitivity and specificity of detection of epigenetic modifications and genetic mutations using relatively small sample sizes are desirable. In particular, these new methods of DNA sequencing should address the challenges of utilizing genome-wide sequencing as a tool in clinical diagnostics.