Human tumors often display substantial intratumoral heterogeneity in both phenotypic and molecular features. This cellular heterogeneity represents a formidable challenge to the discovery of effective and lasting cancer treatments. The frequency and degree of tumor heterogeneity cannot be explained solely by genetic determinants. Additionally, the reversible nature of cancer cell proliferative potential and drug tolerance suggests mechanisms that invoke plasticity, characteristic of epigenetic regulation.
Dynamic control of gene expression is exerted by the interplay between various epigenetic mechanisms, including DNA methylation, histone tail post-translational modifications, and nucleosome positioning and occupancy (Schreiber and Bernstein 2002; Fuks 2005; Esteller 2007). Dysregulation of any of these regulatory layers can alter gene expression and, moreover, such epigenetic perturbations have been established as major determinants of cancer initiation and progression. Epigenetic variability has been strongly implicated in mediating tumor heterogeneity across diverse diseases. However, the extent to which epigenetic differences between individual cells underlie intratumoral heterogeneity remains relatively unexplored.
Aberrant DNA methylation of CpG (or CG) dinucleotides is a well-documented phenomenon in virtually all tumor types studied to date. It is widely accepted that DNA methylation near transcriptional start sites (TSSs) is associated with gene silencing. Hypermethylation of promoters of tumor-suppressive genes and hypomethylation of tumor-promoting genes are commonly observed, even in early stages of carcinogenesis (Herman and Baylin 2003). Though it is often evaluated in isolation, DNA methylation exerts control over gene expression within the context of chromatin. Expressed and poised genes are usually unmethylated and depleted of nucleosomes near their TSSs, thereby exhibiting increased accessibility to trans-activating factors (reviewed in Jiang and Pugh 2009). Conversely, the TSSs of inactive genes tend to be associated with high nucleosome occupancy, conferring chromatin inaccessibility, but can be either unmethylated or methylated. Thus, integrated evaluation of DNA methylation within the context of chromatin accessibility is likely to be more informative than evaluating each epigenetic feature separately. Notably, the extent of cell-to-cell heterogeneity in chromatin accessibility at gene promoters in either disease-free or tumor cells remains ill-defined.
Assessing intratumoral epigenetic heterogeneity necessitates the use of methods able to query chromatin structure at the level of single molecules, thereby avoiding population averaging. A high-resolution DNA footprinting technique termed MAPit (DNA methyltransferase accessibility protocol for individual templates) was previously developed. It exploits exogenous addition of DNA methyltransferases (DNMTs), such as the GC DNA methyltransferase M.CviPI, to probe accessibility of GC sites in chromatin (Xu et al. 1998; Pardo et al. 2009). Following bisulfite conversion of isolated genomic DNA and sequencing of clonally amplified molecules, that is, bisulfite genomic sequencing (BGS), the positions of nucleosomes and DNA-bound non-histone proteins are inferred from footprints, or spans of protection against methylation by M.CviPI. Furthermore, because M.CviPI modifies GC rather than CG, endogenous CG methylation is concurrently mapped, allowing for direct correlation of two distinct epigenetic features along a single strand of DNA (molecule). This technique has been used to simultaneously map DNA methylation and nucleosome positions in many gene-specific studies (Kilgore et al. 2007; Wolff et al. 2010; Delmas et al. 2011; You et al. 2011; Yang et al. 2012) and, more recently, genome-wide (Kelly et al. 2012).
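The single-molecule readout described above can be illustrated with a minimal sketch. It is not the published MAPit analysis pipeline; it assumes a bisulfite-converted read that is already aligned base-for-base to the same strand of an unconverted reference, with no indels. The function names, the `min_sites` threshold, and the toy sequences are hypothetical. Cytosines in a GCG context are skipped because they cannot be assigned unambiguously to M.CviPI (GC) or endogenous (CG) methylation.

```python
def classify_sites(reference, read):
    """Call methylation at GCH (accessibility) and HCG (endogenous) sites.

    Assumes `read` is bisulfite-converted and aligned position-by-position
    to `reference` (same strand, same length). A retained C means the
    cytosine was methylated; a T means it was unmethylated and converted.
    Returns two lists of (position, methylated) tuples: (gch, hcg).
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    gch, hcg = [], []
    # Skip the first and last base: their full sequence context is unknown.
    for i in range(1, len(reference) - 1):
        if reference[i] != "C" or read[i] not in "CT":
            continue
        prev_g = reference[i - 1] == "G"
        next_g = reference[i + 1] == "G"
        if prev_g and next_g:
            continue  # GCG: ambiguous between GC and CG methylation
        methylated = read[i] == "C"
        if prev_g:
            gch.append((i, methylated))   # probed by M.CviPI
        elif next_g:
            hcg.append((i, methylated))   # endogenous CG methylation
    return gch, hcg


def protected_spans(gch_calls, min_sites=2):
    """Return (start, end) spans of consecutive unmethylated GCH sites.

    `min_sites` is an illustrative threshold, not a published parameter.
    """
    spans, run = [], []
    for pos, methylated in gch_calls:
        if not methylated:
            run.append(pos)
            continue
        if len(run) >= min_sites:
            spans.append((run[0], run[-1]))
        run = []
    if len(run) >= min_sites:
        spans.append((run[0], run[-1]))
    return spans


# Toy example (hypothetical sequences, one molecule).
reference = "AGCTAGCATTCGAAGCTTGCAACGTAGCA"
read      = "AGCTAGTATTCGAAGTTTGTAATGTAGCA"
gch_calls, hcg_calls = classify_sites(reference, read)
print("GCH calls:", gch_calls)
print("HCG calls:", hcg_calls)
print("Protected spans:", protected_spans(gch_calls))
```

Because both call lists come from the same read, the accessibility footprint and the endogenous CG methylation state remain linked to one molecule, which is the property the text emphasizes.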
Cells that are drug-tolerant or have tumor-initiating capabilities are of high biological interest and are estimated to represent 1-5% of bulk tumor cells. Study of this or other minority subpopulations by genome-wide BGS is currently precluded due to requirements for large amounts of input DNA and prohibitive costs associated with obtaining the needed depth in sequencing coverage. The latter problem is compounded as the number of samples to be analyzed increases. A further limitation of present genome-wide BGS approaches is the short sequencing reads typically employed. Short-read sequences destroy the structural integrity or phasing of epigenetic information present on a continuous DNA strand, which is essential for determining if epigenetic features map to the same or different molecules. Maintaining the continuity of epigenetic information is of increased importance in complex samples with abundant inherent diversity.
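To give a rough sense of how subpopulation frequency drives the required depth, the following sketch computes the number of reads at a locus needed to observe at least one molecule from a minority subpopulation with a chosen probability. It assumes each read is an independent draw from the cell population in proportion to abundance, with no amplification or sampling bias; this simplification and the function name `reads_needed` are illustrative and are not taken from the source.

```python
import math


def reads_needed(fraction, confidence):
    """Minimum reads n such that P(at least one read from the
    subpopulation) >= confidence, assuming independent sampling.

    Solves 1 - (1 - fraction)**n >= confidence for the smallest integer n.
    """
    if not 0 < fraction < 1 or not 0 < confidence < 1:
        raise ValueError("fraction and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - fraction))


# Subpopulation frequencies spanning the 1-5% range cited in the text.
for f in (0.01, 0.05):
    print(f, reads_needed(f, 0.95))
```

Under this model a single observation is only a lower bound: characterizing heterogeneity within the subpopulation requires many molecules from it, so the per-locus depth grows accordingly.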
To circumvent these limitations, the current invention provides a method of simultaneously determining chromatin structure and DNA methylation state of one or more (or a plurality of) genetic loci using deep sequencing techniques that provide for high sequencing coverage and long reads of genetic material.