Chromatin, the assemblage of protein and DNA that is the physiologic form of the genome, is a crucial regulator of underlying DNA function, playing key roles in all aspects of DNA metabolism and in cell and whole-organism function. The fundamental repeating unit of chromatin structure is the nucleosome: a DNA-binding spool of eight core histone proteins (two copies each of H2A, H2B, H3, and H4), around which nearly two full turns of genomic DNA are wrapped. Individual nucleosomes may be generated, for example, by micrococcal nuclease digestion. The histones include the H1, H2A, H2B, H3, and H4 histones and may be modified to include a plurality of epitopes and post-translational modifications.
In the cell, post-translational modifications (or variation of the amino acid sequence) of the histones regulate changes in local chromatin states that govern the accessibility of underlying DNA, thereby controlling processes that range from transcriptional activation to gene silencing. These chemical modifications are referred to as “epigenetic marks”. They add another layer of information without altering the standard base-pairing capacity of DNA, and they seem to act in concert with one another and with other distinguishing chromatin features to control the genome. Cellular processes and disease states as varied as transcription, replication, stem cell pluripotency, gene silencing, X-chromosome inactivation, DNA repair, apoptosis, epigenetic inheritance, cellular identity retention, hematopoiesis, cancers, numerous disorders of the central nervous system, cardiovascular disease, diabetes, obesity, bacterial infections, and gene expression programs during development all appear to involve epigenetic modifications in their course or causation.
Chromatin immunoprecipitation (ChIP) is the central methodology for querying where these epigenetic modifications exist in the genome, as well as for tracking their changes as a function of cellular identity in development and pathological transitions (e.g., hematopoietic stem cell to leukemia). ChIP is well known in the art. In brief, ChIP is a pull-down assay that relies on fragmenting genomic material of living organisms by mechanical, physical, chemical or enzymatic shearing to generate a pool of protein-DNA fragments (largely nucleosomes). This pool can then be probed with an affinity reagent, such as an antibody that binds a particular protein or post-translational modification thereof, to pull down specific fragments of chromatin. ChIP uses affinity capture from a pool of fragmented chromatin “input” to enrich fragments that bear the epitope of interest. The identity, relative abundance and position in the genome of the indirectly captured DNA fragments can be determined by numerous techniques, including RT-PCR, Next Generation Sequencing, ddPCR, qPCR, microarray probe hybridization and other methods capable of reading out and quantifying DNA sequence, all of which are known in the art.
This information about the position of DNA associated with protein in situ can then be used to infer the position of the protein bound to the DNA in the intact genome, and to assess how much bound material was present at that DNA locus as compared to the frequency of that sequence in the initial pool of fragments subjected to affinity capture, i.e., “the input”, or relative to some other genomic locus. In other words, the captured material is analyzed by qPCR, next generation sequencing, or the like and compared to negative controls to assess the relative enrichment afforded by the immunoprecipitation, also known as pull-down. Notably, present technology answers the “where in the genome” question in a relative sense, without providing meaningful information about the actual abundance of the targeted epitope at that site. Nevertheless, ChIP has provided insight into how a combination of positioning, histone marks and histone variants can regulate gene expression (Henikoff, 2008; Jiang and Pugh, 2009; Li and Carey, 2007) and how these changes can regulate cell differentiation (Bernstein et al., 2007). Moreover, it is a crucial tool in understanding the role of epigenetics in cancer and other diseases, including discovery of disease markers (Dawson and Kouzarides, 2012; Feinberg, 2007).
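By way of illustration only, the relative comparison of captured material to “the input” can be sketched for a qPCR readout using the commonly used percent-of-input calculation. This is a minimal sketch, not a method taken from the text above: the function names, the default 1% input fraction, and the assumption that the amplified product doubles with every cycle are all illustrative assumptions.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction=0.01):
    """Return the IP signal at one locus as a percentage of the input.

    ct_ip          -- qPCR threshold cycle of the immunoprecipitated sample
    ct_input       -- qPCR threshold cycle of the saved input aliquot
    input_fraction -- fraction of the chromatin saved as input
                      (assumed value; 0.01 means 1% was saved)

    Assumes ideal amplification, i.e. product doubles each cycle.
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in the interval (0, 1]")
    # Shift the input Ct to the value expected had 100% of the
    # chromatin been measured rather than a small aliquot.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    # Each cycle of difference corresponds to a two-fold difference in DNA.
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def fold_enrichment(pct_target, pct_control):
    """Ratio of percent input at a target locus to that at a control locus."""
    if pct_control <= 0:
        raise ValueError("pct_control must be positive")
    return pct_target / pct_control


if __name__ == "__main__":
    # Hypothetical Ct values chosen only to exercise the functions.
    target = percent_input(ct_ip=24.0, ct_input=23.0)
    control = percent_input(ct_ip=28.0, ct_input=23.0)
    print(f"target locus:  {target:.3f}% of input")
    print(f"control locus: {control:.3f}% of input")
    print(f"fold enrichment: {fold_enrichment(target, control):.1f}")
```

Both quantities are ratios within a single experiment. Neither reports what fraction of nucleosomes at the locus actually carries the epitope, which is the sense in which the measurement remains relative.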
Despite serving as the central experimental technique in epigenetics research, chromatin immunoprecipitation coupled to deep sequencing (ChIP-seq) or other analysis suffers from several serious drawbacks. First, each ChIP measurement is relative; it is not standardized to any reference, which hinders direct comparison of data coming from different repetitions of the same sample, different cells, and different patients. Second, ChIP is heavily dependent on the quality of antibody reagents, which vary in specificity and affinity even between different batches of the same antibody and which can have significant affinity for off-target epitopes, often leading to false-positive detection and misinterpretation of the data (Bock et al., 2011; Nady et al., 2008; Park, 2009; Fuchs et al., 2011; Landt et al., 2012; Egelhofer et al., 2011). The greatest source of experimental error in ChIP is the quality of the antibody affinity reagents employed to capture desired epitopes (either histone modifications, variants or transcription factors). The troubling promiscuity of “ChIP grade” antibody binding, revealed using immobilized arrays of related peptide epitopes (Bock et al., 2011; Egelhofer et al., 2011; Fuchs et al., 2011), is compounded by the results of increasingly sophisticated measures of affinity, specificity and reproducibility: up to 80% of several hundred commercial antibodies failed stringent quality controls (Egelhofer et al., 2011; Landt et al., 2012). Even different lots of the same commercial antibody can vary in apparent affinity for target by up to 20-fold (Hattori et al., 2013) and display marked specificity differences (Nishikori et al., 2012). Yet at present there are no measures of antibody specificity available within ChIP experiments, leading to substantial uncertainty in evaluating the data.
Third, even with equivalent antibody affinity and specificity for two different epitopes, the wide variability of epitope abundance would preclude meaningful comparison of ChIP results (Leroy et al., 2013; Young et al., 2009). Finally, very small differences in ChIP preparation can yield significant differences in the output data, leading to inconsistency from experiment to experiment. Differences in experimenter handling (Marinov et al., 2014), as well as the practice of loading equivalent quantities of sample in each sequencer lane despite differential amplification (Zhang and Pugh, 2011), render unbiased ChIP-based comparisons problematic.
Because ChIP data are expressed on a relative scale that depends heavily on the precise experimental conditions, normalization ultimately requires assumptions that may not be warranted (Bin Liu et al., 2013; Liang and Keles, 2012), or the bulk of experimental data must be sacrificed in peak calling to permit comparisons (Zhang et al., 2008). Beyond peak calling, there are few widely applied ChIP-seq quality controls, yet in the worst cases ChIP is not reproducible (Egelhofer et al., 2011; Landt et al., 2012; Marinov et al., 2014). None of these factors is taken into account in current methodologies or technologies. With present ChIP technology, it is impossible to measure the absolute densities of histone modifications in a locus-specific manner. Consequently, the peaks of different histone modifications that seem to overlap on certain genomic loci cannot be meaningfully compared. Moreover, experimental variation and pitfalls that are opaque to the experimenter preclude ChIP assays from serving as reliable patient diagnostics (despite clear connections between the epigenetic marks they measure and numerous disease states) and hinder the utility of ChIP in basic science research.