Whole genome sequencing, genotyping, targeted resequencing, and gene expression analyses of tissue samples can be of significant importance for the identification of disease biomarkers, for the accurate diagnosis and prognosis of disease, and for the selection of a patient's treatment. For example, nucleic acid sequence analysis of tumor tissue excised from a patient can be used to determine the presence or absence of particular genetic biomarkers, e.g., somatic variants, structural rearrangements, point mutations, deletions, and insertions, and/or the presence or absence of particular genes. In addition, analysis of ancient nucleic acids can yield a wealth of information including the evolution of genes and organisms, and the migration and ancestry of populations and individuals.
Fixed tissue samples, such as formalin-fixed and paraffin-embedded (FFPE) pathological samples, are often prepared from patients for histological analysis and archival storage. Nucleic acids in such FFPE samples are often of low quality, with significant fragmentation, an increased proportion of single-stranded DNA, and a variety of chemically induced DNA lesions, including strand breaks, abasic sites, and chemically modified bases. Often, only a small amount of DNA can be extracted from FFPE samples and then analyzed. Similarly, ancient nucleic acid samples are often of low quality because factors such as time, temperature, and the presence of water degrade the nucleic acids. Ancient nucleic acids may also contain a large number of mutations that accumulate over time, such as substitutions resulting from the deamination of residues.
The poor quality and small amounts of DNA obtainable from low-quality nucleic acid samples make such samples difficult to use in preparing sequencing libraries of sufficient yield, complexity, and genomic coverage. Thus, methods and compositions are desirable for enriching, from low-quality nucleic acid samples, nucleic acids that are suitable for nucleic acid sequence analysis.