The DNA in a single human cell, if fully elongated, would stretch for roughly two meters. However, each cell compacts these incredibly long polymeric molecules within the cell's nucleus, a space just a few microns across. Increasingly, researchers are finding that this compacted conformation is not just a random “ball of string”, but rather is highly organized and dynamically regulated (Cremer et al., 2001, Nat Rev Genet 2: 292-301). Developing a better understanding of this compacted form will provide key insights into how the spatial organization of genetic information in the nucleus affects its readout via gene expression.
Gene expression involves the transcription of mRNA molecules from the genome followed by the export of those mRNAs to the cytoplasm. Evidence indicates that the structure of chromosomes, generally organized into distinct territories in the nucleus (Cremer et al., 2001, Nat Rev Genet 2: 292-301), plays a role in both processes (Fraser et al., 2007, Nature 447: 413-417; Misteli, 2007, Cell 128: 787-800; Papantonis et al., 2010, Curr Opin Cell Biol. 22(3):271-6). Indeed, a numerical accounting of transcription reveals that the large enzyme complexes required for RNA polymerization and splicing number in the hundreds to thousands per cell, yet they must somehow transcribe the many tens of thousands of genes in the nucleus (Jackson et al., 1998, Mol Biol Cell 9: 1523-1536; Osborne et al., 2004, Nat Genet 36: 1065-1071; Wansink et al., 1993, J Cell Biol 122: 283-293). Some data suggest that sets of actively transcribed genes localize to “transcription factories” (Iborra et al., 1996, J Cell Sci 109: 1427-1436; Schoenfelder et al., 2010, Nat Genet 42: 53-61; Sexton et al., 2007, Semin Cell Dev Biol 18: 691-697). Limited accessibility to and transient formation of these factories may underlie the intermittent nature of gene expression that both we and others have observed (Raj et al., 2006, PLoS Biol 4: e309; Raj et al., 2008, Cell 135: 216-226).
Meanwhile, for many years, researchers reasoned that the high density of polymeric materials in the nucleus would make RNA diffusion prohibitively slow, leading to the “gene gating” hypothesis (Blobel, 1985, Proc Natl Acad Sci USA 82: 8527-8529) that genes themselves must move to the nuclear periphery for the export of the RNA to be reasonably efficient. Indeed, there is some evidence for this, with some genes moving to the periphery upon activation and physically associating with nuclear pore complexes (Brown et al., 2007, Curr Opin Genet Dev 17: 100-106; Capelson et al., 2010, Cell 140: 372-383; Casolari et al., 2004, Cell 117: 427-439; Kalverda et al., 2010, Cell 140: 360-371). However, biophysical analyses have shown that RNA diffusion in the nucleus is rather rapid and thus not limiting (Vargas et al., 2005, Proc Natl Acad Sci USA 102: 17008-17013), and the majority of genes appear to remain in the nuclear interior (often exhibiting silencing at the periphery, in fact (Chuang et al., 2006, Curr Biol 16: 825-831; Kosak et al., 2002, Science 296: 158-162)).
Unfortunately, a clear picture of chromosome structure and gene expression is still lacking, largely because there are no effective tools for measuring genetic structure and function simultaneously at the single-cell level. Most conventional imaging assays for chromosome structure focus only on the position of one or two loci at a time, whereas global biochemical approaches provide indirect measurements averaged over populations of many thousands to millions of cells. In either case, gene expression data are difficult to obtain. For these reasons, questions remain regarding how spatially ordered or disordered gene expression is on interphase chromosomes, how variable chromosome configurations are, and how that variability contributes to variability in gene expression.
Currently, two broad classes of methods are used to study chromosome structure. One method is chromosome conformation capture (3C) (Dekker et al., 2002, Science 295: 1306-1311) and its variants (most notably Duan et al., 2010, Nature 465(7296):363-7; Lieberman-Aiden et al., 2009, Science 326: 289-293). 3C is a biochemical technique that measures the frequency of direct physical interactions between genomic loci in populations of cells. While the Hi-C incarnation of the method yields genome-wide information on these interactions, it does have a number of drawbacks. For example, it does not yield single-cell data, it does not give any information about expression, and it only yields interaction probabilities rather than chromosome structure per se.
The other approach is DNA FISH, involving the detection of fluorescently labeled probes targeting small or large regions of DNA (or even whole chromosomes). This approach is more direct, but studies so far have been limited to only a few loci, which restricts its scope. Most importantly, though, the harsh conditions required to denature DNA in preparation for DNA FISH cause significant RNA degradation, generally precluding its combination with RNA FISH methods except for highly abundant targets (Chaumeil et al., 2006, Genes Dev 20: 2223-2237) or when using signal amplification methods that are difficult to multiplex (Takizawa et al., 2008, Genes Dev 22: 489-498).
There is a need for a method that allows one to measure the expression of one or many genes via RNA FISH while simultaneously obtaining structural information about chromosomes. The present invention satisfies this need.