The current model of chromatin structure in higher eukaryotes postulates that genes are organized in “domains” (Dillon & Grosveld, 1994, Curr. Opin. Genet. Dev., 4:260-264; Higgs, 1998, Cell, 95:299-302, each of which is incorporated herein by reference). Chromatin domains are envisaged to exist in either a condensed, “closed”, transcriptionally silent state, or in a de-condensed, “open” and transcriptionally competent configuration. The establishment of an open chromatin structure, characterized by increased DNase I sensitivity, DNA hypomethylation and histone hyperacetylation, is considered a prerequisite to the commencement of gene expression.
The open and closed nature of chromatin regions is reflected in the behaviour of transgenes that are randomly integrated into the host cell genome. Identical constructs give different patterns of tissue-specific and developmental stage-specific expression when integrated at different locations in the mouse genome (Palmiter & Brinster, 1986, Ann. Rev. Genet., 20:465-499; Allen, et al., 1988, Nature, 333:852-855; Bonnerot, et al., 1990, Proc. Natl. Acad. Sci. USA, 87:6331-6335, each of which is incorporated herein by reference). A variegated expression pattern within a given transgenic mouse tissue, known as position effect variegation (PEV), is also frequently observed (Kioussis & Festenstein, 1997, Curr. Opin. Genet. Dev., 7:614-619, which is incorporated herein by reference). When exogenous genes are integrated into the chromosomes of mammalian cells cultured in vitro, many of the integration events result in rapid silencing of the transgene and the remainder give large variability in expression levels (Pikaart et al., 1998, Genes Dev., 12:2852-2862; Fussenegger, et al., 1999, Trends Biotech., 17:35-42, each of which is incorporated herein by reference). These position effects render transgene expression inefficient, with implications for both basic research and biotechnology applications.
The chromatin domain model of gene organization suggests that genetic control elements that are able to establish and maintain a transcriptionally competent open chromatin structure should be associated with active regions of the genome.
Locus Control Regions (LCRs) are a class of transcriptional regulatory elements with long-range chromatin remodelling capability. LCRs are functionally defined in transgenic mice by their ability to confer site-of-integration independent, transgene copy number-dependent, physiological levels of expression on a gene linked in cis, especially single copy transgenes (Fraser & Grosveld, 1998, Curr. Opin. Cell Biol., 10:361-365; Li et al., 1999, Trends Genet., 15:403-408, each of which is incorporated herein by reference). Crucially, such expression is tissue-specific. LCRs are able to obstruct the spread of heterochromatin and prevent PEV (Kioussis & Festenstein, 1997, supra), and they consist of a series of DNase I hypersensitive (HS) sites which can be located either 5′ or 3′ of the genes that they regulate (Li et al., 1999, supra).
LCRs appear to comprise two separate, although not necessarily independent, components: first, the ability to establish an open chromatin domain, and second, a dominant transcriptional activation capacity that confers transgene copy number-dependent expression (Fraser & Grosveld, 1998, supra). The molecular mechanisms by which LCRs exert their function remain a point of contention (Higgs, 1998, supra; Bulger & Groudine, 1999, Genes Dev., 13:2465-2477; Grosveld, 1999, Curr. Opin. Genet. Dev., 9:152-157; Bender et al., 2000, Mol. Cell, 5:387-393, each of which is incorporated herein by reference).
The generation of cultured mammalian cell lines producing high levels of a therapeutic protein product is a major developing industry. Chromatin position effects make it a difficult, time-consuming and expensive process. The most commonly used approach to the production of such mammalian “cell factories” relies on gene amplification induced by a combination of a drug resistance gene (e.g., DHFR, glutamine synthetase (Kaufman, 1990, Methods Enzymol., 185:537-566, which is incorporated herein by reference)) and the maintenance of stringent selective pressure. The use of vectors containing LCRs from highly expressed gene domains, using cells derived from the appropriate tissue, greatly simplifies the procedure (Needham et al., 1992, Nucleic Acids Res., 20:997-1003; Needham et al., 1995, Protein Expr. Purif., 6:124-131, each of which is incorporated herein by reference).
However, the tissue-specificity of LCRs, although useful in some circumstances, is also a major limitation for many applications, for instance where no LCR is known for the tissue in which expression is required, or where expression in many, or all, tissues is required.
Our co-pending patent application PCT/GB99/02357 (WO 00/05393), incorporated by reference herein, describes elements that are responsible for establishing an open chromatin structure across a locus that consists exclusively of ubiquitously expressed, housekeeping genes. These elements are not derived from an LCR. The invention provides a polynucleotide comprising a ubiquitous chromatin opening element (UCOE) which opens chromatin or maintains chromatin in an open state and facilitates reproducible expression of an operably-linked gene in cells of at least two different tissue types, wherein the polynucleotide is not derived from a locus control region.
Methylation-free CpG islands are well-known in the art (Bird et al., 1985, Cell, 40:91-99; Tazi & Bird, 1990, Cell, 60:909-920, each of which is incorporated herein by reference) and may be defined as CpG-rich regions of DNA with above average (>60%) content of CpG di-nucleotides where the cytosine residues are not methylated and which extend over the 5′ ends of two closely spaced (0.1-3 kb) divergently transcribed genes. These regions of DNA remain unmethylated in all tissues throughout development (Wise & Pravtcheva, 1999, Genomics, 60:258-271, which is incorporated herein by reference). They are associated with the 5′ ends of all ubiquitously expressed genes, as well as an estimated 40% of genes showing a tissue-restricted expression profile (Antequera & Bird, 1993, Proc. Natl. Acad. Sci. USA, 90:11995-11999; Cross & Bird, 1995, Curr. Opin. Genet. Dev., 5:309-314, each of which is incorporated herein by reference) and are known to be localized regions of active chromatin (Tazi & Bird, 1990, supra).
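As a purely illustrative aid, the sequence composition of a candidate CpG-rich region can be summarized with a few simple metrics. The following minimal Python sketch computes the G+C fraction, the number of CpG di-nucleotides and the observed/expected CpG ratio of a sequence. The function name `cpg_metrics` is hypothetical, no thresholds are applied, and sequence composition alone says nothing about methylation status, which must be determined experimentally.

```python
def cpg_metrics(seq: str) -> dict:
    """Return simple composition metrics for a DNA sequence (illustrative only)."""
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    # "CG" cannot overlap with itself, so str.count gives the exact number of CpGs.
    cpg = s.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count from base composition alone: (C count * G count) / length.
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return {
        "length": n,
        "gc_fraction": gc_fraction,
        "cpg_count": cpg,
        "cpg_obs_exp": obs_exp,
    }
```

Such metrics can only flag candidate regions for further analysis; they do not establish that a region is a methylation-free CpG island as defined above.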
An “extended” methylation-free CpG island is a methylation-free CpG island that extends across a region encompassing more than one transcriptional start site and/or extends for more than 300 bp and preferably more than 500 bp. The borders of the extended methylation-free CpG island are functionally defined through the use of PCR over the region in combination with restriction endonuclease enzymes whose ability to digest (cut) DNA at their recognition sequence is sensitive to the methylation status of any CpG residues that are present. One such enzyme is HpaII, which recognizes and digests at the site CCGG, which is commonly found within CpG islands, but only if the central CG residues are not methylated. Therefore, PCR conducted on HpaII-digested DNA over a region harboring HpaII sites does not give an amplification product if the DNA is unmethylated, because HpaII cleaves the template. The PCR gives an amplified product only if the DNA is methylated. Beyond the methylation-free region, HpaII will not digest the DNA and a PCR-amplified product will be observed, thereby defining the boundaries of the “extended methylation-free CpG island.”
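The logic of this HpaII/PCR assay can be sketched in a few lines of Python. The sketch below is a simplified model under stated assumptions, not a laboratory protocol: digestion is treated as complete, an amplicon without any CCGG site is treated as uninformative (a product is always expected), and the function names, the set of methylated site positions and the amplicon coordinates are all hypothetical.

```python
from typing import List, Optional, Set, Tuple

HPAII_SITE = "CCGG"  # HpaII cuts here only if the central CG is unmethylated


def find_hpaii_sites(seq: str) -> List[int]:
    """Return 0-based start positions of every CCGG site in the sequence."""
    s = seq.upper()
    sites = []
    pos = s.find(HPAII_SITE)
    while pos != -1:
        sites.append(pos)
        pos = s.find(HPAII_SITE, pos + 1)
    return sites


def product_expected(amplicon: str, methylated_sites: Set[int]) -> bool:
    """Predict whether PCR gives a product after HpaII digestion.

    A product is expected only if every CCGG site in the amplicon is
    methylated (and therefore protected from cutting). An amplicon with
    no CCGG site is uninformative and always yields a product.
    """
    return all(pos in methylated_sites for pos in find_hpaii_sites(amplicon))


def unmethylated_span(
    results: List[Tuple[int, int, bool]]
) -> Optional[Tuple[int, int]]:
    """Infer the longest contiguous region giving no PCR product.

    `results` holds (start, end, product_observed) for amplicons tiled
    along the locus, sorted by start. Returns (start, end) of the longest
    run of adjacent no-product amplicons, or None if every amplicon
    gave a product.
    """
    best = None
    run_start = None
    for start, end, product in results:
        if product:
            run_start = None  # a product ends the current no-product run
            continue
        if run_start is None:
            run_start = start
        if best is None or (end - run_start) > (best[1] - best[0]):
            best = (run_start, end)
    return best
```

For example, with amplicons tiled at 0-200, 200-400, 400-600 and 600-800 bp, where only the two central amplicons fail to give a product, `unmethylated_span` returns `(200, 600)` as the approximate extent of the methylation-free region at the resolution of the amplicon tiling.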
We have demonstrated that regions spanning methylation-free CpG islands encompassing dual, divergently transcribed promoters from the human TATA binding protein (TBP)/proteasome component-B1 (PSMB1) and heterogeneous nuclear ribonucleoprotein A2/B1 (hnRNPA2)/heterochromatin protein 1Hsγ (HP1Hsγ) gene loci give reproducible, physiological levels of gene expression and that they are able to prevent the variegated expression pattern and silencing that normally occur with transgene integration within centromeric heterochromatin.
We have shown that methylation-free CpG islands associated with actively transcribing promoters possess the ability to remodel chromatin and are thus thought to be a prime determinant in establishing and maintaining an open domain at housekeeping gene loci.
UCOEs confer an increased proportion of productive gene delivery events with improvements in the level and stability of transgene expression. This has important research and biotechnological applications including the generation of transgenic animals and recombinant protein products in cultured cells. We have shown beneficial effects of UCOEs on expression of a cytomegalovirus-enhanced green fluorescent protein (CMV-EGFP) reporter construct and with the secreted, pharmaceutically valuable protein erythropoietin. The properties of UCOEs also suggest utility in gene therapy, the effectiveness of which is often limited by a low frequency of productive gene delivery events and an inadequate level and duration of expression (Verma & Somia, 1997, Nature, 389:239-242, which is incorporated herein by reference).
Given these significant implications and wide-ranging applications, there is a desire to further optimize transgene expression levels and achieve improved stability of gene expression over a prolonged period of culture.
One particular need is to overcome the directional bias observed in some naturally-occurring UCOEs. Although UCOEs confer position-independent transcriptional enhancement on operably-linked promoters, this is, to some extent, orientation-dependent (i.e., the UCOE is significantly more effective in one orientation than the other). In some circumstances, such as an expression vector comprising two expression units transcribed divergently with a UCOE situated between them, there is an advantage in being able to obtain balanced, high-level expression from both promoters, which may not be possible with a natural UCOE. There is therefore a need for artificially-constructed UCOEs that are effective in both orientations.