Assays for analysis of biological processes are exploited for a variety of desired applications. For example, monitoring the activity of key biological pathways can lead to a better understanding of the functioning of those systems as well as the factors that might disrupt their proper functioning. In fact, disease states caused by operation or disruption of specific biological pathways are the focus of much medical research. By understanding these pathways, one can model approaches for affecting them to prevent the onset of the disease or mitigate its effects once manifested.
A typical example of the exploitation of biological process monitoring is in the area of pharmaceutical research and development. In particular, therapeutically relevant biological pathways, or individual steps or subsets of individual steps in those pathways, are often reproduced or modeled in in vitro systems to facilitate analysis. By observing the progress of these steps or whole pathways in the presence and absence of potential therapeutic compositions, e.g., pharmaceutical compounds or other materials, one can identify the ability of those compositions to affect the in vitro system, and potentially beneficially affect an organism in which the pathway is functioning in a detrimental way. By way of specific example, reversible methylation of the 5 position of cytosine by methyltransferases is one of the most widely studied epigenetic modifications. In mammals, 5-methylcytosine (5-MeC) frequently occurs at CpG dinucleotides, which often cluster in regions called CpG islands that are at or near transcription start sites. Methylation of cytosine in CpG islands can interfere with transcription factor binding and is associated with transcription repression and gene regulation. In addition, DNA methylation is known to be essential for mammalian development and has been associated with cancer and other disease processes. Recently, a new 5-hydroxymethylcytosine epigenetic marker has been identified in certain cell types in the brain, suggesting that it plays a role in epigenetic control of neuronal function (S. Kriaucionis, et al., Science 2009, 324(5929): 929-30, incorporated herein by reference in its entirety for all purposes).
In contrast to determining a human genome, mapping of the human methylome is a more complex task because the methylation status differs between tissue types, changes with age, and is altered by environmental factors (P. A. Jones, et al., Cancer Res 2005, 65, 11241, incorporated herein by reference in its entirety for all purposes). Comprehensive, high-resolution determination of genome-wide methylation patterns from a given sample has been challenging due to the sample preparation demands and short read lengths characteristic of current DNA sequencing technologies (K. R. Pomraning, et al., Methods 2009, 47, 142, incorporated herein by reference in its entirety for all purposes).
Bisulfite sequencing is a currently used method for single-nucleotide resolution methylation profiling (S. Beck, et al., Trends Genet 2008, 24, 231; and S. J. Cokus, et al., Nature 2008, 452, 215, the disclosures of which are incorporated herein by reference in their entireties for all purposes). In another widely used technique, methylated DNA immunoprecipitation (MeDIP), an antibody against 5-MeC is used to enrich for methylated DNA sequences (M. Weber, et al., Nat Genet 2005, 37, 853, incorporated herein by reference in its entirety for all purposes). MeDIP has many advantageous attributes for genome-wide assessment of methylation status, but it does not offer base resolution as high as that of bisulfite treatment-based methods. In addition, it is hampered by the same limitations as current microarray and second-generation sequencing technologies.
Research efforts aimed at increasing our understanding of the human methylome would benefit greatly from the development of a new methylation profiling technology that does not suffer from the limitations described above. Accordingly, there exists a need for improved techniques for detection of modifications in nucleic acid sequences, and particularly nucleic acid methylation.
Typically, modeled biological systems rely on bulk reactions that ascertain general trends of biological reactions and provide indications of how such bulk systems react to different effectors. While such systems are useful as models of bulk reactions in vivo, a substantial amount of information is lost in the averaging of these bulk reaction results. In particular, the activity of and effects on individual molecular complexes cannot generally be teased out of such bulk data collection strategies.
Nanopore sequencing has been demonstrated to be capable of identifying bases in a single nucleic acid strand passed through the nanopore at single-base resolution. The bases can be distinguished by the different ways in which they block the nanopore as they pass through it. While modified bases may in some cases be identified by their current blocking characteristics, it can be difficult to differentiate these bases from the four canonical bases and from other modified bases. There exists a need for improved nanopore sequencing that provides more reliable information about the modified bases that occur in natural nucleic acids.