Epigenetics concerns the transmission of information from a cell or multicellular organism to its descendants without that information being encoded in the nucleotide sequence of genes. Epigenetic mechanisms can operate through chemical modification of the DNA or through post-translational modifications to proteins and polypeptides associated with the DNA. RNAs, including long non-coding RNAs, have also been implicated in epigenetic regulation.
The location and identity of nucleic acid sequences are critical to information storage and regulation of cell state; this is particularly evident in the regulation of chromatin structure and function. For example, in the genomes of eukaryotic cells, DNA is associated with protein and ribonucleic acid (RNA) complexes that assist in regulating gene expression, packaging the DNA and controlling replication. The myriad factors that are associated with the genome contribute to what is termed chromatin: the nuclear material present in the nucleus of most eukaryotic cells. At various times in the cell cycle the level of packaging (or condensation) of the genomic DNA can vary between a less packaged state, such as during replication of the DNA (S phase), and a more condensed state, such as during cell division (M phase), where the genome is packaged into chromosomes. Highly expressed genes also tend to exist in a state of low packaging (the so-called euchromatic state), whereas silenced genes exist in a state of high packaging (the so-called heterochromatic state). The relative state of condensation, the maintenance of this state and the transition between heterochromatin and euchromatin are believed to be mediated largely by a plurality of specialist proteins, RNAs and polypeptide complexes. For example, the roX non-coding RNAs found in flies act with a protein complex to open chromatin and increase transcription on the male X chromosome. Conversely, the mammalian Xist non-coding RNA coats one of the female X chromosomes and causes it to condense into heterochromatin.
At a fundamental level, the most ‘open’ or euchromatic form of chromatin comprises short sections of the genomic DNA wound around an octamer of histone proteins, which together form a nucleosome. The nucleosomes are arrayed in series to form a beads-on-a-string formation. Interactions between adjacent nucleosomes allow the formation of more highly ordered chromatin structures. It is these interactions that can be mediated by enzymes that catalyse post-translational modifications of histones, or by structural proteins that physically interact with the histones and assist in anchoring them together.
Epigenetic controls over chromatin organisation and stability are essential for the normal and healthy functioning of a cell. Aberrant epigenetic modifications and a decrease in chromatin stability are often seen in senescent, apoptotic or diseased cells, particularly in cancer cells. It is of considerable importance to identify and characterise the multiple proteins and polypeptides that are capable of exhibiting epigenetic activities, as well as those factors that are capable of interacting with chromatin and chromatin-associated proteins. It would also be of great value to identify and characterise novel chromatin-associated factors, not least to facilitate a better understanding of chromatin biology as a whole.
Conventionally, isolation of proteins associated with chromatin has been achieved by performing chromatin immunoprecipitation (ChIP). In a typical ChIP assay the chromatin binding proteins are cross-linked to DNA with formaldehyde in vivo. The chromatin is then sheared into small fragments and purified. The purified chromatin fragments are probed with antibodies specific to a known target chromatin binding protein so as to isolate the complex by immunoprecipitation. The precipitated chromatin is treated to reverse the cross-linking, thereby releasing the DNA for sequence analysis. Although it is possible to investigate the ancillary associated proteins that are pulled down with the cross-linked complex, the method is not restricted to one genomic region and is not optimised for this purpose. Protocols for performing ChIP are disclosed in Nelson et al. (Nature Protocols (2006) 1:179-185) and Crane-Robinson et al. (Meth. Enzym. (1999) 304:533-547). Furthermore, while ChIP is useful for probing protein regulatory factors across the genome, there are no analogous techniques for determining the binding sites of RNA factors.
A significant drawback with ChIP-based techniques is that, for a given sequence, at least one specific protein associated with that sequence must be known already. Hence, there is a need for a method of isolating protein factors that associate directly or indirectly with a specified target nucleic acid sequence. In effect, there is a need for a method of isolating chromatin-associated proteins or polypeptides that is nucleic acid sequence driven rather than antigen driven. Also, in ChIP a lack of immunoprecipitation does not necessarily reflect an absence of the tested factor, so there is always a risk of false negative results with this technique.
The present invention overcomes the deficiencies in the art by providing a novel method for isolating factors that associate directly or indirectly with a given target nucleic acid sequence. In particular, the method of the invention overcomes the aforementioned problems (1) with regard to isolating novel chromatin binding RNAs and polypeptides and (2) with regard to analysing the factors associated with a regulatory RNA, including its DNA binding sites.