The present invention relates to polynucleotides comprising elements conferring improved expression on operably-linked transcription units. These elements are naturally associated with the promoter regions of ribosomal protein genes and, in recombinant DNA constructs, confer high and reproducible levels of gene expression. The present invention also relates to vectors comprising such polynucleotide sequences, host cells comprising such vectors and use of such polynucleotides, vectors or host cells in therapy, for production of recombinant proteins in cell culture and other biotechnological applications.
The current model of chromatin structure in higher eukaryotes postulates that genes are organised in “domains” (Dillon, N. & Grosveld, F. Chromatin domains as potential units of eukaryotic gene function. Curr. Opin. Genet. Dev. 4, 260-264 (1994); Higgs, D. R. Do LCRs open chromatin domains? Cell 95, 299-302 (1998)). Chromatin domains are envisaged to exist in either a condensed, “closed”, transcriptionally silent state, or in a de-condensed, “open” and transcriptionally competent configuration. The establishment of an open chromatin structure, characterised by increased DNase I sensitivity, DNA hypomethylation and histone hyperacetylation, is considered a prerequisite to the commencement of gene expression.
The open and closed nature of chromatin regions is reflected in the behaviour of transgenes that are randomly integrated into the host cell genome. Identical constructs give different patterns of tissue-specific and development stage-specific expression when integrated at different locations in the mouse genome (Palmiter, R. D. & Brinster, R. L. Ann. Rev. Genet. 20, 465-499 (1986); Allen, N. D. et al. Nature 333, 852-855 (1988); Bonnerot, C., Grimber, G., Briand, P. & Nicolas, J. F. Proc. Natl. Acad. Sci. USA 87:6331-6335 (1990)).
The chromatin domain model of gene organisation suggests that genetic control elements that are able to establish and maintain a transcriptionally competent open chromatin structure should be associated with active regions of the genome.
Locus Control Regions (LCRs) are a class of transcriptional regulatory elements with long-range chromatin remodelling capability. LCRs are functionally defined in transgenic mice by their ability to confer site-of-integration independent, transgene copy number-dependent, physiological levels of expression on a gene linked in cis, especially single copy transgenes (Fraser, P. & Grosveld, F. Curr. Opin. Cell Biol. 10, 361-365 (1998); Li, Q., Harju, S. & Peterson, K. R. Trends Genet. 15: 403-408 (1999)). Crucially, such expression is tissue-specific. LCRs are able to obstruct the spread of heterochromatin, prevent position effect variegation (PEV) (Kioussis, D. & Festenstein, R. Curr. Opin. Genet. Dev. 7, 614-619 (1997)) and consist of a series of DNase I hypersensitive (HS) sites which can be located either 5′ or 3′ of the genes that they regulate (Li, Q., Harju, S. & Peterson, K. R. Trends Genet. 15:403-408 (1999)).
The generation of cultured mammalian cell lines producing high levels of a therapeutic protein product is a major developing industry. Chromatin position effects make it a difficult, time-consuming and expensive process. The most commonly used approach to the production of such mammalian “cell factories” relies on gene amplification induced by a combination of a drug resistance gene (e.g., DHFR, glutamine synthetase) (Kaufman R J. Methods Enzymol 185, 537-566 (1990)) and the maintenance of stringent selective pressure. The use of vectors containing LCRs from highly expressed gene domains, using cells derived from the appropriate tissue, greatly simplifies the procedure, giving a large proportion of clonal cell lines showing stable high levels of expression (Needham M, Gooding C, Hudson K, Antoniou M, Grosveld F and Hollis M. Nucleic Acids Res 20, 997-1003 (1992); Needham M, Egerton M, Millest A, Evans S, Popplewell M, Cerillo G, McPheat J, Monk A, Jack A, Johnstone D and Hollis M. Protein Expr Purif 6, 124-131 (1995)).
However, the tissue-specificity of LCRs, although useful in some circumstances, is also a major limitation for many applications, for instance where no LCR is known for the tissue in which expression is required, or where expression in many, or all, tissues is required.
U.S. Pat. No. 6,689,606 and co-pending patent application WO 00/05393, incorporated by reference herein, describe elements that are responsible, in their natural chromosomal context, for establishing an open chromatin structure across a locus that consists exclusively of ubiquitously expressed, housekeeping genes. These elements are not derived from an LCR and comprise extended methylation-free CpG islands.
In mammalian DNA, the dinucleotide CpG is recognised by a DNA methyltransferase enzyme that methylates cytosine to 5-methylcytosine. However, 5-methylcytosine is prone to spontaneous deamination, which converts it to thymine. As a result, CpG dinucleotides occur far less frequently than one would expect by chance. Some sections of genomic DNA nevertheless do have a frequency of CpG that is closer to that expected, and these sequences are known as “CpG islands”. As used herein a “CpG island” is defined as a sequence of DNA, of at least 200 bp, that has a GC content of at least 50% and an observed/expected CpG content ratio of at least 0.6 (i.e. a CpG dinucleotide content of at least 60% of that which would be expected by chance) (Gardiner-Garden M and Frommer M. J Mol Biol 196, 261-282 (1987); Rice P, Longden I and Bleasby A. Trends Genet 16, 276-277 (2000)).
Methylation-free CpG islands are well-known in the art (Bird et al (1985) Cell 40: 91-99, Tazi and Bird (1990) Cell 60: 909-920) and may be defined as CpG islands where a substantial proportion of the cytosine residues are not methylated and which usually extend over the 5′ ends of two closely spaced (0.1-3 kb) divergently transcribed genes. These regions of DNA are reported to remain hypomethylated in all tissues throughout development (Wise and Pravtcheva (1999) Genomics 60: 258-271). They are often associated with the 5′ ends of ubiquitously expressed genes, as well as an estimated 40% of genes showing a tissue-restricted expression profile (Antequera, F. & Bird, A. Proc. Natl. Acad. Sci. USA 90, 11995-11999 (1993); Cross, S. H. & Bird, A. P. Curr. Opin. Genet. Dev. 5, 309-314 (1995)) and are known to be localised regions of active chromatin (Tazi, J. & Bird, A. Cell 60, 909-920 (1990)).
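The three criteria in this definition can be checked mechanically for a candidate sequence. The following is a minimal sketch only. It assumes the commonly used observed/expected formula (number of CpG dinucleotides multiplied by the sequence length, divided by the product of the C count and the G count), which the text above does not spell out, and the function names `cpg_stats` and `is_cpg_island` are hypothetical.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC content, observed/expected CpG ratio) for one DNA strand.

    Assumption: expected CpG count = (#C * #G) / length, the usual
    convention; the source text only names the ratio, not the formula.
    """
    s = seq.upper()
    n = len(s)
    if n == 0:
        return 0.0, 0.0
    c_count = s.count("C")
    g_count = s.count("G")
    cpg_count = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c_count + g_count) / n
    expected = (c_count * g_count) / n
    ratio = cpg_count / expected if expected > 0 else 0.0
    return gc_content, ratio


def is_cpg_island(seq: str, min_len: int = 200,
                  min_gc: float = 0.5, min_ratio: float = 0.6) -> bool:
    """Apply the thresholds stated in the text: >=200 bp, GC >=50%, obs/exp >=0.6."""
    gc_content, ratio = cpg_stats(seq)
    return len(seq) >= min_len and gc_content >= min_gc and ratio >= min_ratio
```

In practice such a test would be applied over sliding windows of a longer genomic sequence; the sketch evaluates a single window only.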
An ‘extended’ methylation-free CpG island is a methylation-free CpG island that extends across a region encompassing more than one transcriptional start site and/or extends for more than 300 bp and preferably more than 500 bp. The borders of the extended methylation-free CpG island are functionally defined through the use of PCR over the region in combination with restriction endonuclease enzymes whose ability to digest (cut) DNA at their recognition sequence is sensitive to the methylation status of any CpG residues that are present. One such enzyme is HpaII, which recognises and digests at the site CCGG, which is commonly found within CpG islands, but only if the central CG residues are not methylated. Therefore, PCR conducted with HpaII-digested DNA over a region harbouring HpaII sites does not give an amplification product if the DNA is unmethylated, because HpaII has cut the template. The PCR will only give an amplified product if the DNA is methylated. Therefore, beyond the methylation-free region HpaII will not digest the DNA and a PCR-amplified product will be observed, thereby defining the boundaries of the “extended methylation-free CpG island”.
It has been shown (WO 00/05393) that regions spanning methylation-free CpG islands encompassing dual, divergently transcribed promoters from the human TATA binding protein (TBP)/proteasome component-B1 (PSMB1) and heterogeneous nuclear ribonucleoprotein A2/B1 (hnRNPA2)/heterochromatin protein 1 Hsγ (HP1Hsγ) gene loci give reproducible, physiological levels of gene expression and that they are able to prevent a variegated expression pattern and silencing that normally occurs with transgene integration within centromeric heterochromatin.
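The logic of this HpaII/PCR boundary assay can be expressed as a small predicate. This is an illustrative sketch under simplifying assumptions, not a description of any laboratory protocol: digestion is treated as complete, a single cut anywhere in the amplicon is taken to abolish amplification, and the function names and the representation of methylation as a set of 0-based cytosine positions are hypothetical.

```python
def hpaii_sites(amplicon: str) -> list[int]:
    """Return 0-based start positions of every CCGG site in the amplicon."""
    s = amplicon.upper()
    return [i for i in range(len(s) - 3) if s[i:i + 4] == "CCGG"]


def pcr_product_expected(amplicon: str, methylated_positions: set[int]) -> bool:
    """Predict whether PCR yields a product after HpaII digestion.

    A CCGG site starting at index i is protected only if its internal
    cytosine (index i + 1) is methylated. Any unprotected site is cut,
    which destroys the template. An amplicon with no CCGG site always
    amplifies and is therefore uninformative for this assay.
    """
    return all((i + 1) in methylated_positions for i in hpaii_sites(amplicon))


# Unmethylated site: HpaII cuts, so no product (inside the methylation-free island).
inside_island = pcr_product_expected("AACCGGTT", set())
# Methylated internal cytosine: site protected, product observed (outside the island).
outside_island = pcr_product_expected("AACCGGTT", {3})
```

Walking such amplicons outwards from the promoter, the first position at which a product appears would mark the border of the extended methylation-free region.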
It is known that methylation-free CpG islands associated with actively transcribing promoters possess the ability to remodel chromatin and are thus thought to be a prime determinant in establishing and maintaining an open domain at housekeeping gene loci (WO 00/05393) and that such elements confer an increased proportion of productive gene delivery events with improvements in the level and stability of transgene expression.
Ribosomes are large RNA and protein complexes responsible for the translation of mRNA into polypeptides. Each ribosome comprises 4 ribosomal RNA (rRNA) molecules and a large number of ribosomal proteins (currently thought to be 79 in mammalian cells). Ribosomal proteins have functions including facilitation of rRNA folding, protection from cellular ribonucleases, and coordinating protein synthesis. Some ribosomal proteins have additional extraribosomal functions (Wool, 1996, TIBS 21: 164-165). Given the structural and functional similarities of ribosomes across species, it is unsurprising that the amino acid sequence conservation of ribosomal proteins is high, and among mammals the sequences of most ribosomal proteins are almost identical (Wool et al, 1995, Biochem Cell Biol 73: 933-947).
Two ribosomal proteins appear atypical in that they are expressed in the form of propeptides (carboxy-extension proteins) fused to ubiquitin. Ubiquitin is a highly conserved 76-residue polypeptide involved in a variety of cellular functions, including the regulation of intracellular protein breakdown, cell cycle regulation and stress response (Hershko & Ciechanover, 1992, Annu Rev Biochem 61: 761-807; Coux et al, 1996, Annu Rev Biochem 65: 801-847).
Ubiquitin is encoded by two distinct classes of gene. One is a poly-ubiquitin gene encoding a linear polymer of ubiquitin repeats. The other comprises genes encoding natural fusion proteins in which a single ubiquitin molecule is linked to the ribosomal protein rpS27A or rpL40 (Finley et al, 1989, Nature 338: 394-401; Chan et al, 1995, Biochem Biophys Res Commun 215: 682-690; Redman & Burris, 1996, Biochem J 315: 315-321).
The common structural features of ribosomal protein promoters are discussed by Perry (2005, BMC Evolutionary Biology 5:15). The promoters may be classified according to the nature of the TATA box motifs, number and type of transcription factor binding sites and location of AUG start codons. However, such classification does not appear to predict promoter strength and evidence suggests that several such promoters tested have equivalent transcriptional activity as measured by expression of a linked reporter gene (Hariharan et al, 1989, Genes Dev 3: 1789-800).
U.S. Pat. No. 6,063,598 discloses the hamster-ubiquitin/S27a promoter and its use to drive high level production of recombinant proteins. However, there is no suggestion of its use to enhance the expression of a gene primarily transcribed from a further promoter (i.e. one other than the hamster-ubiquitin/S27a promoter).
US application US 2004/0148647 discloses a reporter assay using an expression vector comprising a hamster-ubiquitin/S27a promoter functionally linked to a gene for a product of interest and a fluorescent protein reporter. Again, the application only discloses constructs in which transcription of the gene of interest is from the hamster-ubiquitin/S27a promoter itself.
It remains an objective in the field of recombinant gene expression to obtain higher and more reliable levels of expression, particularly for in vivo and ex vivo therapeutic applications and for in vitro recombinant protein production.