Interactions between proteins and DNA are critically involved in a wide range of biological processes and disease conditions, including cancer. Over the years, the chromatin immunoprecipitation (ChIP) assay has become the technique of choice for examining DNA-protein interactions in vivo. In a typical ChIP experiment, the DNA-binding protein (e.g., a transcription factor or a histone) is crosslinked to DNA in vivo by treating cells with formaldehyde. The cells are then lysed to release the chromosomes, and the chromatin is sheared by sonication into small fragments of 200-600 bp. DNA fragments associated with the protein are then enriched by immunoprecipitation (IP), e.g., using beads coated with an antibody specific to the transcription factor or histone. Finally, the crosslinks are reversed and the released DNA is assayed to determine the sequences bound by the protein. These sequences can be identified by qPCR if candidate promoters are known. Alternatively, the binding sites can be mapped at the genome scale by hybridization to a microarray (ChIP-chip) or by sequencing (ChIP-seq) using high-throughput sequencing technology (e.g., the Illumina Genome Analyzer). In general, ChIP-seq offers higher resolution, fewer artifacts, greater coverage, and a larger dynamic range than ChIP-chip, and therefore provides data of higher quality.
Although current ChIP-related assays have generated useful data, the technique has serious limitations. First, it requires a large number of cells (>10^6 cells per IP for ChIP-qPCR and 10^7-10^8 cells for ChIP-seq). This is usually feasible with cell lines but poses a serious challenge when primary cells are used, because the amount of sample that can be obtained from lab animals and patients is very limited. For example, the number of naturally occurring T regulatory cells is ~10,000 per spleen among murine splenocytes and ~5,000 per ml among peripheral blood leukocytes. Circulating tumor cells are present at a frequency of 1-10 per ml of whole blood in patients with metastatic cancer. In addition, primary samples typically contain a mixture of different cell types. Enriching and isolating a homogeneous single cell type not only adds time and labor to the protocol but also further reduces the amount of sample. Second, most ChIP assays involve extensive manual handling and take 3-4 days or longer to finish. These cumbersome procedures can cause loss of material and introduce technical errors that lead to inconsistencies between replicates. ChIP protocols have been modified and improved to make the assays shorter and easier and, more importantly, to allow the use of small cell populations (e.g., ~100-1,000 cells for ChIP-qPCR and ~10,000 cells, with whole genome amplification, for ChIP-seq). However, most of these improved protocols still involve a significant amount of manual processing.
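The scale of the sample shortfall can be made concrete with a short calculation. The sketch below is illustrative only: the function name `sources_needed` is hypothetical, the cell counts are the approximate figures quoted above, and the default of 100% cell recovery is an optimistic assumption that real isolation steps will not reach.

```python
import math


def sources_needed(cells_required: int, cells_per_source: float,
                   recovery: float = 1.0) -> int:
    """Return how many source units (spleens, ml of blood) are needed.

    cells_required   -- cells needed for one assay
    cells_per_source -- target cells available per spleen or per ml
    recovery         -- assumed fraction of cells surviving isolation (0-1]
    """
    if cells_per_source <= 0:
        raise ValueError("cells_per_source must be positive")
    if not 0 < recovery <= 1:
        raise ValueError("recovery must be in the interval (0, 1]")
    return math.ceil(cells_required / (cells_per_source * recovery))


if __name__ == "__main__":
    # T regulatory cells: ~10,000 per murine spleen (figure from the text).
    print("Spleens for ChIP-qPCR (10^6 cells):", sources_needed(10**6, 10_000))
    print("Spleens for ChIP-seq  (10^7 cells):", sources_needed(10**7, 10_000))
    # Circulating tumor cells: 1-10 per ml of whole blood (figure from the text).
    print("ml of blood for 10^6 CTCs, best case:", sources_needed(10**6, 10))
    print("ml of blood for 10^6 CTCs, worst case:", sources_needed(10**6, 1))
```

Under these assumptions, a single conventional ChIP-qPCR on T regulatory cells would consume on the order of 100 spleens, and a ChIP-seq experiment at least 1,000. Any recovery below 100% raises these numbers further.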
Standard next-generation sequencing protocols require a sufficient amount of input DNA (~5 ng). Thus, high-efficiency extraction of ChIP DNA from cells is often a critical bottleneck for sensitive ChIP assays. Bernstein and co-workers recently developed the Nano-ChIP-seq procedure for performing ChIP-seq assays with limited samples. Although they achieved successful sequencing from as few as 10,000 cells, only about 10-50 pg of DNA was pulled down during the immunoprecipitation step. With such a small amount of DNA, extensive amplification was required to generate enough material for library preparation and sequencing. However, pre-amplification tends to introduce artifacts and biases, leading to low-quality sequencing results.
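To show why the amplification burden is substantial, the gap between the recovered DNA and the sequencing input can be expressed as a fold change. This is a rough sketch, not a description of the Nano-ChIP-seq amplification scheme: the function names are hypothetical, and the cycle estimate assumes ideal amplification in which the DNA mass exactly doubles each cycle, which real reactions do not achieve.

```python
import math


def fold_amplification(target_pg: float, recovered_pg: float) -> float:
    """Fold increase needed to go from recovered DNA to the target input."""
    if recovered_pg <= 0:
        raise ValueError("recovered_pg must be positive")
    return target_pg / recovered_pg


def min_doublings(fold: float) -> int:
    """Minimum number of perfect doublings that reaches the given fold change."""
    if fold <= 1:
        return 0
    return math.ceil(math.log2(fold))


if __name__ == "__main__":
    target_pg = 5_000  # ~5 ng sequencing input, expressed in pg
    for recovered_pg in (10, 50):  # range recovered from ~10,000 cells
        fold = fold_amplification(target_pg, recovered_pg)
        print(f"{recovered_pg} pg recovered: {fold:.0f}-fold amplification, "
              f"at least {min_doublings(fold)} ideal doublings")
```

With the figures quoted above, the recovered DNA must be amplified roughly 100- to 500-fold, corresponding to at least 7 to 9 ideal doublings. Because real amplification is less than perfectly efficient and not uniform across sequences, the practical number of cycles is higher, which is where bias can accumulate.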