The study of nucleic acid interactions, in particular those in nucleic acid-protein complexes and in chromatin, is important for understanding gene regulation and expression.
Studies of transcription factor binding sites (TFBS) by genome-wide methods such as ChIP-chip (Cawley et al., 2004) and ChIP-PET (Wei et al., 2006) have shown that most TFBS are not located 5′ proximal to genes, suggesting extensive remote regulation. While whole genome ChIP methods can identify TFBS, they may be limited in their capability to detect actual interactions between the TFBS; hence, other methods are required.
Analysis of these interactions by electron microscopy, light microscopy and fluorescence in situ hybridisation is constrained by technical limitations, such as lack of resolution or difficulty in preparing samples, and may not provide enough information on these nucleic acid interactions.
In addition, several recently reported methods developed to study these interactions, purported to be high-throughput and/or unbiased, also fall short of expectations.
Dekker et al. (2002) described Chromosome Conformation Capture (3C) for studying nucleic acid interactions. As this method requires the use of primers for analysing interactions, some prior knowledge of the interacting sites is required, and the method is limited to single point detection.
Carroll et al. (2005) described a method coupling ChIP with 3C. This method indicates which nucleic acid molecules interact with a particular protein of interest, but otherwise faces the same limitations as 3C.
Simonis et al. (2006) described the chromosome conformation capture-on-chip (4C) technique, wherein inverse PCR was first performed using primer sequences targeted to specific sites, and microarray analysis was subsequently used to identify other sites (as available on the microarray) which interact with the targeted sites.
Zhao et al. (2006) described a strategy termed circular chromosome conformation capture (4C), wherein the captured sequences which interact with specific target sequences are identified by sequencing. This method is limited in its application in that only sequences which interact with the specific target sequences can be identified.
Dostie et al. (2007) described a strategy termed 5C, which includes performing multiplex ligation-mediated amplification on a 3C library to generate a 5C library for detecting global interactions. The 5C forward and reverse primer sequences are designed for all restriction fragments in the genomic region of interest. As a result, the 5C library is a quantitative “carbon copy” of a part of the 3C library, as determined or limited by the 5C primers used. This method is limited in its application in that only interactions between sites corresponding to 5C primers can be identified.
Ruan et al. (US 2007-0238101 A1) provide a ditag-based method capable of high-throughput, global, unbiased interrogation of nucleic acid interactions and binding sites. However, in this prior art method, each tag in the ditag is short and occasionally may not be adequate for identification and/or mapping of the interacting fragments. In addition, this prior art method is not able to distinguish interactions between different nucleic acid molecules from self-ligated molecules. Further, ditags may also be formed from long linear composite molecules, and these may not reflect actual nucleic acid interactions.
Solexa sequencing, or sequencing by synthesis, is an example of a recently developed sequencing technology (WO 2007/091077, WO 2007/010252 and WO 2007/123744). Solexa sequencing has been applied to the analysis of ChIP sequences, but only in the context of comparing immuno-enriched chromatin with control chromatin to identify binding sites of a protein known as NRSF (Johnson et al., 2007).
There is thus a need for more efficient methods and robust technologies that may effectively address three-dimensional chromosomal interactions.