The current model of chromatin structure in higher eukaryotes postulates that genes are organised in “domains” (Dillon, N. & Grosveld, F. Chromatin domains as potential units of eukaryotic gene function. Curr. Opin. Genet. Dev. 4, 260-264 (1994); Higgs, D. R. Do LCRs open chromatin domains? Cell 95, 299-302 (1998), each of which is incorporated herein by reference). Chromatin domains are envisaged to exist in either a condensed, “closed”, transcriptionally silent state, or in a de-condensed, “open” and transcriptionally competent configuration. The establishment of an open chromatin structure, characterized by increased DNase I sensitivity, DNA hypomethylation and histone hyperacetylation, is considered a prerequisite to the commencement of gene expression.
The open and closed nature of chromatin regions is reflected in the behaviour of transgenes that are randomly integrated into the host cell genome. Identical constructs give different patterns of tissue-specific and developmental stage-specific expression when integrated at different locations in the mouse genome (Palmiter, R. D. & Brinster, R. L. Ann. Rev. Genet. 20, 465-499 (1986); Allen, N. D. et al. Nature 333, 852-855 (1988); Bonnerot, C., Grimber, G., Briand, P. & Nicolas, J. F. Proc. Natl. Acad. Sci. USA 87:6331-6335 (1990), each of which is incorporated herein by reference).
A variegated expression pattern within a given transgenic mouse tissue, known as position effect variegation (PEV), is also frequently observed (Kioussis, D. & Festenstein, R. Curr. Opin. Genet. Dev. 7, 614-619 (1997), which is incorporated herein by reference). When exogenous genes are integrated into the chromosomes of mammalian cells cultured in vitro, many of the integration events result in rapid silencing of the transgene and the remainder give large variability in expression levels (Pikaart, M. J., Recillas-Targa, F. & Felsenfeld, G. Genes Dev. 12, 2852-2862 (1998); Fussenegger, M., Bailey, J. E., Hauser, H. & Mueller, P. P. Trends Biotech. 17, 35-42 (1999), each of which is incorporated herein by reference). These position effects render transgene expression inefficient, with implications for both basic research and biotechnology applications.
The chromatin domain model of gene organization suggests that genetic control elements that are able to establish and maintain a transcriptionally competent open chromatin structure should be associated with active regions of the genome.
Locus Control Regions (LCRs) are a class of transcriptional regulatory elements with long-range chromatin remodelling capability. LCRs are functionally defined in transgenic mice by their ability to confer site-of-integration independent, transgene copy number-dependent, physiological levels of expression on a gene linked in cis, especially single copy transgenes (Fraser, P. & Grosveld, F., Curr. Opin. Cell Biol. 10, 361-365 (1998); Li, Q., Harju, S. & Peterson, K. R., Trends Genet. 15: 403-408 (1999), each of which is incorporated herein by reference). Crucially, such expression is tissue-specific. LCRs are able to obstruct the spread of heterochromatin, prevent PEV (Kioussis, D. & Festenstein, R. Curr. Opin. Genet. Dev. 7, 614-619 (1997), which is incorporated herein by reference) and consist of a series of DNase I hypersensitive (HS) sites which can be located either 5′ or 3′ of the genes that they regulate (Li, Q., Harju, S. & Peterson, K. R. Trends Genet. 15: 403-408 (1999), which is incorporated herein by reference).
LCRs appear to comprise two separate, although not necessarily independent, components: first, the establishment of an ‘open chromatin domain’, and second, a dominant transcriptional activation capacity to confer transgene copy number-dependent expression (Fraser, P. & Grosveld, F. Curr. Opin. Cell Biol. 10, 361-365 (1998), which is incorporated herein by reference). The molecular mechanisms by which LCRs exert their function remain a point of contention (Higgs, D. R. Cell 95, 299-302 (1998); Bulger, M. & Groudine, M. Genes Dev. 13, 2465-2477 (1999); Grosveld, F. Curr. Opin. Genet. Dev. 9, 152-157 (1999); Bender, M. A., Bulger, M., Close, J. & Groudine, M., Mol. Cell 5, 387-393 (2000), each of which is incorporated herein by reference).
The generation of cultured mammalian cell lines producing high levels of a therapeutic protein product is a major developing industry. Chromatin position effects make it a difficult, time-consuming and expensive process. The most commonly used approach to the production of such mammalian “cell factories” relies on gene amplification induced by a combination of a drug resistance gene (e.g., DHFR, glutamine synthetase) (Kaufman R J. Methods Enzymol 185, 537-566 (1990), which is incorporated herein by reference) and the maintenance of stringent selective pressure. The use of vectors containing LCRs from highly expressed gene domains, using cells derived from the appropriate tissue, greatly simplifies the procedure, giving a large proportion of clonal cell lines showing stable high levels of expression (Needham M, Gooding C, Hudson K, Antoniou M, Grosveld F and Hollis M. Nucleic Acids Res 20, 997-1003 (1992); Needham M, Egerton M, Millest A, Evans S, Popplewell M, Cerillo G, McPheat J, Monk A, Jack A, Johnstone D & Hollis M. Protein Expr Purif 6, 124-131 (1995), each of which is incorporated herein by reference).
However, the tissue-specificity of LCRs, although useful in some circumstances, is also a major limitation for many applications, for instance where no LCR is known for the tissue in which expression is required, or where expression in many, or all, tissues is required.
Our co-pending patent applications PCT/GB99/02357 (WO 00/05393), U.S. Ser. No. 09/358,082, GB 0022995.5 and U.S. 60/252,048, each of which is incorporated herein by reference, describe elements that are responsible, in their natural chromosomal context, for establishing an open chromatin structure across a locus that consists exclusively of ubiquitously expressed, housekeeping genes. These elements are not derived from an LCR and comprise extended methylation-free CpG islands. We have used the term Ubiquitous Chromatin Opening Element (UCOE) to describe such elements.
In mammalian DNA, the dinucleotide CpG is recognised by a DNA methyltransferase enzyme that methylates cytosine to 5-methylcytosine. However, 5-methylcytosine is unstable and is converted to thymine. As a result, CpG dinucleotides occur far less frequently than one would expect by chance. Some sections of genomic DNA nevertheless do have a frequency of CpG that is closer to that expected, and these sequences are known as “CpG islands”. As used herein a “CpG island” is defined as a sequence of DNA, of at least 200 bp, that has a GC content of at least 50% and an observed/expected CpG content ratio of at least 0.6 (i.e., a CpG dinucleotide content of at least 60% of that which would be expected by chance) (Gardiner-Garden M and Frommer M. J Mol Biol 196, 261-282 (1987); Rice P, Longden I and Bleasby A Trends Genet 16, 276-277 (2000), each of which is incorporated herein by reference).
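The three criteria in this definition can be checked mechanically. The following is a minimal sketch in Python, assuming the commonly used form of the observed/expected ratio, (number of CpG × sequence length) / (number of C × number of G). The function names `cpg_stats` and `is_cpg_island` are hypothetical and chosen only for illustration, and the sketch evaluates a single candidate sequence as a whole rather than scanning a genome with a sliding window.

```python
def cpg_stats(seq: str) -> tuple[int, float, float]:
    """Return (length, GC fraction, observed/expected CpG ratio) for one DNA sequence."""
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() finds every CpG
    gc_fraction = (c + g) / n if n else 0.0
    # Assumed formula: observed CpG count divided by the count expected by chance,
    # where expected = (C count * G count) / sequence length.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return n, gc_fraction, obs_exp


def is_cpg_island(seq: str, min_len: int = 200, min_gc: float = 0.5,
                  min_obs_exp: float = 0.6) -> bool:
    """Apply the thresholds stated in the text: >= 200 bp, GC >= 50%, obs/exp >= 0.6."""
    n, gc_fraction, obs_exp = cpg_stats(seq)
    return n >= min_len and gc_fraction >= min_gc and obs_exp >= min_obs_exp
```

For example, a 200 bp run of alternating C and G satisfies all three thresholds, whereas a GC-rich sequence with no CpG dinucleotide fails on the observed/expected ratio alone.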
Methylation-free CpG islands are well-known in the art (Bird et al. (1985) Cell 40: 91-99; Tazi & Bird (1990) Cell 60: 909-920, each of which is incorporated herein by reference) and may be defined as CpG islands where a substantial proportion of the cytosine residues are not methylated and which usually extend over the 5′ ends of two closely spaced (0.1-3 kb) divergently transcribed genes. These regions of DNA are reported to remain hypomethylated in all tissues throughout development (Wise and Pravtcheva (1999) Genomics 60: 258-271, which is incorporated herein by reference). They are often associated with the 5′ ends of ubiquitously expressed genes, as well as an estimated 40% of genes showing a tissue-restricted expression profile (Antequera, F. & Bird, A. Proc. Natl. Acad. Sci. USA 90, 11995-11999 (1993); Cross, S. H. & Bird, A. P. Curr. Opin. Genet. Dev. 5, 309-314 (1995), each of which is incorporated herein by reference), and are known to be localized regions of active chromatin (Tazi, J. & Bird, A. Cell 60, 909-920 (1990), which is incorporated herein by reference).
An ‘extended’ methylation-free CpG island is a methylation-free CpG island that extends across a region encompassing more than one transcriptional start site and/or extends for more than 300 bp and preferably more than 500 bp. The borders of the extended methylation-free CpG island are functionally defined through the use of PCR over the region in combination with restriction endonuclease enzymes whose ability to digest (cut) DNA at their recognition sequence is sensitive to the methylation status of any CpG residues that are present. One such enzyme is HpaII, which recognises and digests at the site CCGG, which is commonly found within CpG islands, but only if the central CG residues are not methylated. Therefore, PCR conducted on HpaII-digested DNA over a region harboring HpaII sites does not give an amplification product if the DNA is unmethylated, because HpaII cuts the template. The PCR will only give an amplified product if the DNA is methylated. Beyond the methylation-free region, HpaII will therefore not digest the DNA and a PCR-amplified product will be observed, thereby defining the boundaries of the “extended methylation-free CpG island”.
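The logic of this HpaII/PCR assay can be illustrated with a short sketch in Python. It is a simplified model, not a laboratory protocol: it assumes complete digestion, treats each CCGG site as either methylated or unmethylated, and predicts a PCR product only when every HpaII site within the amplicon is methylated. The function names and the representation of methylation status as a set of site positions are hypothetical.

```python
HPAII_SITE = "CCGG"  # HpaII cuts here only when the central CpG is unmethylated


def hpaii_site_positions(amplicon: str) -> list[int]:
    """Return the 0-based start position of every CCGG site in the amplicon."""
    s = amplicon.upper()
    positions = []
    i = s.find(HPAII_SITE)
    while i != -1:
        positions.append(i)
        i = s.find(HPAII_SITE, i + 1)
    return positions


def pcr_product_expected(amplicon: str, methylated_sites: set[int]) -> bool:
    """Predict whether PCR on HpaII-digested template yields a product.

    methylated_sites holds the start positions of CCGG sites whose central
    CpG is methylated (an assumed input format). A single unmethylated site
    is enough for HpaII to cut the template and prevent amplification.
    """
    sites = hpaii_site_positions(amplicon)
    if not sites:
        # Without an HpaII site the assay says nothing about methylation.
        raise ValueError("amplicon contains no HpaII site; assay is uninformative")
    return all(pos in methylated_sites for pos in sites)
```

Under this model, testing a series of amplicons stepping outward from the island would be expected to give no product inside the methylation-free region and a product once the amplicon lies wholly in methylated DNA, which is how the boundary is located.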
We have demonstrated (WO 00/05393, which is incorporated herein by reference) that regions spanning methylation-free CpG islands encompassing dual, divergently transcribed promoters from the human TATA binding protein (TBP)/proteasome component-B1 (PSMB1) and heterogeneous nuclear ribonucleoprotein A2/B1 (hnRNPA2)/heterochromatin protein 1Hsγ (HP1Hsγ) gene loci give reproducible, physiological levels of gene expression and that they are able to prevent the variegated expression pattern and silencing that normally occur with transgene integration within centromeric heterochromatin.
As used herein, the term “reproducible expression” means that the polynucleotide of the invention will direct expression of the expressible gene at substantially the same level of expression irrespective of its chromatin environment and preferably irrespective of the cell type or tissue type in which the polynucleotide of the invention may be. Those of skill in the art will recognize that substantially the same level of expression of the operably-linked expressible gene is achieved, irrespective of the chromatin environment of the claimed polynucleotide, and preferably irrespective of the cell type, assuming that the cell is capable of active gene expression.
We have shown (WO 00/05393, incorporated herein by reference) that methylation-free CpG islands associated with actively transcribing promoters possess the ability to remodel chromatin and are thus thought to be a prime determinant in establishing and maintaining an open domain at housekeeping gene loci.
UCOEs confer an increased proportion of productive gene delivery events with improvements in the level and stability of transgene expression. This has important research and biotechnological applications including the generation of transgenic animals and recombinant protein products in cultured cells. We have shown (WO 00/05393, incorporated herein by reference) beneficial effects of UCOEs on expression of the CMV-EGFP reporter construct and with the secreted, pharmaceutically valuable protein erythropoietin. The properties of UCOEs also suggest utility in gene therapy, the effectiveness of which is often limited by a low frequency of productive gene delivery events and an inadequate level and duration of expression (Verma, I. M. & Somia, N. Nature 389: 239-242 (1997), which is incorporated herein by reference).
Given these significant implications and wide ranging applications, there is a desire to further optimize transgene expression levels. There is a need to further increase the levels of expression obtainable by the use of a UCOE alone, particularly in the fields of in vivo gene therapy and for in vitro production of recombinant proteins.
The expression of a nucleic acid operably linked to a 5′ UCOE may surprisingly be further increased by the presence of a selectable element 3′ to the expressed nucleic acid, so that the expressible nucleic acid sequence is flanked by a 5′ UCOE and a 3′ selectable marker.
A selectable element that performs more than one function in a vector, such as providing a selectable marker as well as increasing expression of an operably linked gene, allows construction of more compact and efficient expression vectors.
Mei, Kothary and Wall (Mei, Q, Kothary, R. & Wall, L. Exp Cell Research 260, 304-312 (2000), which is incorporated herein by reference) disclose constructs comprising an expressible gene (β-globin) operably linked to an LCR and a pgk/puromycin resistance element. However, this work teaches that it is the combination of an expressible gene, an LCR and a tk/neomycin resistance element that is important in imposing position effects on gene expression, with the pgk/puromycin resistance element being used as a negative control. This paper teaches away from any beneficial effect being gained from the use of a pgk/puromycin resistance element. The paper does not disclose constructs comprising an extended unmethylated CpG island (or UCOE), an expressible gene and a pgk/puromycin resistance element, since the constructs comprise LCRs. Similarly, the paper does not disclose an expressible gene operably linked to a promoter with which it is not naturally linked, also operably linked to a pgk/puromycin resistance element, since in each case the β-globin gene is expressed under control of its endogenous promoter.
Artelt et al. compare the influence of neomycin and puromycin resistance genes on cis-linked genes in eukaryotic expression vectors (Artelt P, Grannemann R, Stocking C, Friel J, Bartsch J and Hauser H Gene 99, 249-254 (1991), which is incorporated herein by reference). They conclude that neomycin resistance genes may have a silencing effect on linked genes, but that “the gene conferring resistance to puromycin from Streptomyces alboniger does not influence adjacent promoters.” Accordingly, there is nothing in Artelt et al. that discloses or suggests the importance of the position or spacing of resistance genes as disclosed in the present application.
Our co-pending patent applications PCT/GB99/02357 (WO 00/05393), U.S. Ser. No. 09/358,082, GB 0022995.5 and U.S. 60/252,048 (each of which is incorporated herein by reference) disclose polynucleotides and vectors comprising extended, methylation-free CpG islands operably linked to expressible nucleic acids with antibiotic resistance genes. However, in the examples disclosed, the antibiotic resistance gene is not adjacent and 3′ to the expressible nucleic acid. The surprising contribution of such an adjacent selectable marker is likewise not disclosed or implied.