The current model of chromatin structure in higher eukaryotes postulates that genes are organised in “domains” (Dillon, N. & Grosveld, F. Chromatin domains as potential units of eukaryotic gene function. Curr. Opin. Genet. Dev. 4, 260-264 (1994); Higgs, D. R. Do LCRs open chromatin domains? Cell 95, 299-302 (1998)). Chromatin domains are envisaged to exist in either a condensed, “closed”, transcriptionally silent state, or in a de-condensed, “open” and transcriptionally competent configuration. The establishment of an open chromatin structure, characterised by increased DNaseI sensitivity, DNA hypomethylation and histone hyperacetylation, is considered a pre-requisite to the commencement of gene expression.
The open and closed nature of chromatin regions is reflected in the behaviour of transgenes that are randomly integrated into the host cell genome. Identical constructs give different patterns of tissue-specific and developmental stage-specific expression when integrated at different locations in the mouse genome (Palmiter, R. D. & Brinster, R. L. Annu. Rev. Genet. 20, 465-499 (1986); Allen, N. D. et al. Nature 333, 852-855 (1988); Bonnerot, C., Grimber, G., Briand, P. & Nicolas, J. F. Proc. Natl. Acad. Sci. USA 87, 6331-6335 (1990)).
A variegated expression pattern within a given transgenic mouse tissue, known as position effect variegation (PEV), is also frequently observed (Kioussis, D. & Festenstein, R. Curr. Opin. Genet. Dev. 7, 614-619 (1997)). When exogenous genes are integrated into the chromosomes of mammalian cells cultured in vitro, many of the integration events result in rapid silencing of the transgene and the remainder give large variability in expression levels (Pikaart, M. J., Recillas-Targa, F. & Felsenfeld, G. Genes Dev. 12, 2852-2862 (1998); Fussenegger, M., Bailey, J. E., Hauser, H. & Mueller, P. P. Trends Biotech. 17, 35-42 (1999)). These position effects render transgene expression inefficient, with implications for both basic research and biotechnology applications.
The chromatin domain model of gene organisation suggests that genetic control elements that are able to establish and maintain a transcriptionally competent open chromatin structure should be associated with active regions of the genome.
Locus Control Regions (LCRs) are a class of transcriptional regulatory elements with long-range chromatin remodelling capability. LCRs are functionally defined in transgenic mice by their ability to confer site-of-integration independent, transgene copy number-dependent, physiological levels of expression on a gene linked in cis, especially single copy transgenes (Fraser, P. & Grosveld, F. Curr. Opin. Cell Biol. 10, 361-365 (1998); Li, Q., Harju, S. & Peterson, K. R. Trends Genet. 15, 403-408 (1999)). Crucially, such expression is tissue-specific. LCRs are able to obstruct the spread of heterochromatin and prevent PEV (Kioussis, D. & Festenstein, R. Curr. Opin. Genet. Dev. 7, 614-619 (1997)), and consist of a series of DNase I hypersensitive (HS) sites which can be located either 5′ or 3′ of the genes that they regulate (Li, Q., Harju, S. & Peterson, K. R. Trends Genet. 15, 403-408 (1999)).
LCRs appear to comprise two separate, although not necessarily independent, components: first, the establishment of an ‘open chromatin domain’, and second, a dominant transcriptional activation capacity to confer transgene copy number dependent expression (Fraser, P. & Grosveld, F. Curr. Opin. Cell Biol. 10, 361-365 (1998)). The molecular mechanisms by which LCRs exert their function remain a point of contention (Higgs, D. R. Cell 95, 299-302 (1998); Bulger, M. & Groudine, M. Genes Dev. 13, 2465-2477 (1999); Grosveld, F. Curr. Opin. Genet. Dev. 9, 152-157 (1999); Bender, M. A., Bulger, M., Close, J. & Groudine, M. Mol. Cell 5, 387-393 (2000)).
The generation of cultured mammalian cell lines producing high levels of a therapeutic protein product is a major developing industry. Chromatin position effects make it a difficult, time consuming and expensive process. The most commonly used approach to the production of such mammalian “cell factories” relies on gene amplification induced by a combination of a drug resistance gene (e.g., DHFR, glutamine synthetase; Kaufman R J. Methods Enzymol 185, 537-566 (1990)) and the maintenance of stringent selective pressure. The use of vectors containing LCRs from highly expressed gene domains, using cells derived from the appropriate tissue, greatly simplifies the procedure, giving a large proportion of clonal cell lines showing stable high levels of expression (Needham M, Gooding C, Hudson K, Antoniou M, Grosveld F and Hollis M. Nucleic Acids Res 20, 997-1003 (1992); Needham M, Egerton M, Millest A, Evans S, Popplewell M, Cerillo G, McPheat J, Monk A, Jack A, Johnstone D and Hollis M. Protein Expr Purif 6, 124-131 (1995)).
However, the tissue-specificity of LCRs, although useful in some circumstances, is also a major limitation for many applications, for instance where no LCR is known for the tissue in which expression is required, or where expression in many, or all, tissues is required.
Our co-pending patent applications PCT/GB99/02357 (WO 00/05393), U.S. Ser. No. 09/358,082, GB 0022995.5 and U.S. 60/252,048, incorporated by reference herein, describe elements that are responsible, in their natural chromosomal context, for establishing an open chromatin structure across a locus that consists exclusively of ubiquitously expressed, housekeeping genes. These elements are not derived from an LCR and comprise extended methylation-free CpG islands. We have used the term Ubiquitous Chromatin Opening Element (UCOE) to describe such elements.
In mammalian DNA, the dinucleotide CpG is recognised by a DNA methyltransferase enzyme that methylates cytosine to 5-methylcytosine. However, 5-methylcytosine is unstable and is converted to thymine. As a result, CpG dinucleotides occur far less frequently than one would expect by chance. Some sections of genomic DNA nevertheless do have a frequency of CpG that is closer to that expected, and these sequences are known as “CpG islands”. As used herein a “CpG island” is defined as a sequence of DNA, of at least 200 bp, that has a GC content of at least 50% and an observed/expected CpG content ratio of at least 0.6 (i.e. a CpG dinucleotide content of at least 60% of that which would be expected by chance) (Gardiner-Garden M and Frommer M. J Mol Biol 196, 261-282 (1987); Rice P, Longden I and Bleasby A. Trends Genet 16, 276-277 (2000)).
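The three thresholds in this definition can be checked computationally. The following Python sketch is illustrative only and forms no part of the cited disclosures. The text above does not spell out how the "expected" CpG count is calculated; the sketch assumes the commonly used formula (number of C × number of G) / sequence length. The function names `cpg_stats` and `is_cpg_island` are hypothetical, and the sketch evaluates a single sequence as a whole rather than scanning a longer sequence with a sliding window.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence.

    Assumption: expected CpG count = (count of C * count of G) / length.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    observed = seq.count("CG")      # CpG dinucleotides on this strand
    expected = c * g / n            # CpG count expected by chance
    gc_fraction = (c + g) / n
    ratio = observed / expected if expected else 0.0
    return gc_fraction, ratio


def is_cpg_island(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the three criteria of the definition given in the text."""
    gc_fraction, ratio = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and ratio >= min_ratio
```

For example, a 200 bp sequence consisting only of repeated CCGG units has a GC content of 100% and an observed/expected ratio of 1.0 under this formula, and so satisfies all three criteria, whereas any sequence shorter than 200 bp fails regardless of its composition.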
Methylation-free CpG islands are well-known in the art (Bird et al (1985) Cell 40: 91-99, Tazi and Bird (1990) Cell 60: 909-920) and may be defined as CpG islands where a substantial proportion of the cytosine residues are not methylated and which usually extend over the 5′ ends of two closely spaced (0.1-3 kb) divergently transcribed genes. These regions of DNA are reported to remain hypomethylated in all tissues throughout development (Wise and Pravtcheva (1999) Genomics 60: 258-271). They are often associated with the 5′ ends of ubiquitously expressed genes, as well as an estimated 40% of genes showing a tissue-restricted expression profile (Antequera, F. & Bird, A. Proc. Natl. Acad. Sci. USA 90, 11995-11999 (1993); Cross, S. H. & Bird, A. P. Curr. Opin. Genet. Dev. 5, 309-314 (1995)) and are known to be localised regions of active chromatin (Tazi, J. & Bird, A. Cell 60, 909-920 (1990)).
An ‘extended’ methylation-free CpG island is a methylation-free CpG island that extends across a region encompassing more than one transcriptional start site and/or extends for more than 300 bp and preferably more than 500 bp. The borders of the extended methylation-free CpG island are functionally defined through the use of PCR over the region in combination with restriction endonuclease enzymes whose ability to digest (cut) DNA at their recognition sequence is sensitive to the methylation status of any CpG residues that are present. One such enzyme is HpaII, which recognises and digests at the site CCGG, which is commonly found within CpG islands, but only if the central CG residues are not methylated. Therefore, PCR conducted with HpaII-digested DNA over a region harbouring HpaII sites does not give an amplification product if the DNA is unmethylated, because the template has been cut by HpaII. The PCR will only give an amplified product if the DNA is methylated. Beyond the methylation-free region, therefore, HpaII will not digest the DNA and a PCR amplified product will be observed, thereby defining the boundaries of the “extended methylation-free CpG island”.
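The logic of this assay can be expressed as a simple prediction rule. The following Python sketch is a simplified illustration under stated assumptions and is not a description of any laboratory protocol: digestion is treated as complete, a site is treated as protected only when its internal cytosine is methylated, and methylation is represented as a set of 0-based sequence positions. The function names `hpaii_sites` and `pcr_product_expected` and this data representation are hypothetical.

```python
import re


def hpaii_sites(seq):
    """Return the 0-based start position of every CCGG site in seq."""
    return [m.start() for m in re.finditer("CCGG", seq.upper())]


def pcr_product_expected(amplicon, methylated_positions):
    """Predict whether PCR on HpaII-digested template gives a product.

    methylated_positions: set of 0-based positions of methylated cytosines
    (hypothetical representation). A site is cut unless its internal C,
    at site start + 1, is methylated. Any cut within the amplicon is
    assumed to prevent amplification.
    """
    sites = hpaii_sites(amplicon)
    # Note: an amplicon without any CCGG site always amplifies and is
    # therefore uninformative about methylation status.
    return all((start + 1) in methylated_positions for start in sites)
```

Under these assumptions, an amplicon lying within the methylation-free region is predicted to give no product, while an amplicon lying in flanking methylated DNA is predicted to give a product; the transition between the two outcomes marks the boundary of the island.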
We have demonstrated (WO 00/05393) that regions spanning methylation-free CpG islands encompassing dual, divergently transcribed promoters from the human TATA binding protein (TBP)/proteasome component-B1 (PSMB1) and heterogeneous nuclear ribonucleoprotein A2/B1 (hnRNPA2)/heterochromatin protein 1Hsγ (HP1Hsγ) gene loci give reproducible, physiological levels of gene expression and that they are able to prevent the variegated expression pattern and silencing that normally occur with transgene integration within centromeric heterochromatin.
As used herein, the term “reproducible expression” means that the polynucleotide of the invention will direct expression of the expressible gene at substantially the same level of expression irrespective of its chromatin environment and preferably irrespective of the cell type or tissue type in which the polynucleotide of the invention may be. Those of skill in the art will recognize that substantially the same level of expression of the operably-linked expressible gene is achieved, irrespective of the chromatin environment of the claimed polynucleotide, and preferably irrespective of the cell type, assuming that the cell is capable of active gene expression.
We have shown (WO 00/05393) that methylation-free CpG islands associated with actively transcribing promoters possess the ability to remodel chromatin and are thus thought to be a prime determinant in establishing and maintaining an open domain at housekeeping gene loci.
UCOEs confer an increased proportion of productive gene delivery events with improvements in the level and stability of transgene expression. This has important research and biotechnological applications including the generation of transgenic animals and recombinant protein products in cultured cells. We have shown (WO 00/05393) beneficial effects of UCOEs on expression of the CMV-EGFP reporter construct and with the secreted, pharmaceutically valuable protein erythropoietin. The properties of UCOEs also suggest utility in gene therapy, the effectiveness of which is often limited by a low frequency of productive gene delivery events and an inadequate level and duration of expression (Verma, I. M. & Somia, N. Nature 389, 239-242 (1997)).
Our aforementioned application PCT/GB99/02357 (WO 00/05393), discloses functional UCOE fragments of approximately 4.0 kb, in particular, the ‘5.5 RNP’ fragment defined by nucleotides 4102 to 8286 of FIG. 21 (as disclosed on p11, lines 6 and 7). The same application discloses a ‘1.5 kb RNP’ fragment (FIGS. 22 and 29, derivation described on p51, lines 1 to 5). However, this fragment is actually a 2165 bp BamHI-Tth111I fragment of the ‘5.5 RNP’ fragment described above, consisting of nucleotides 4102 to 6267 of FIG. 21 of that application.
In a further application (WO 02/24930), we disclose artificially-constructed UCOEs composed of fragments of naturally-occurring CpG islands. The fragments disclosed are larger than those claimed in the current application and it was not, at that time, considered possible to use small fragments individually, rather than as mere components of synthetic or ‘hybrid’ UCOE constructs.
Given these significant implications and wide ranging applications, there is a need to further optimise the levels of transgene expression, particularly in the fields of in vivo gene therapy and for in vitro production of recombinant proteins.
One particular need is to reduce the size of elements used to enhance gene expression. By so doing, smaller vectors may be produced, or vectors with a greater capacity in terms of the size of insert they may stably contain and express.