The present invention relates to a polynucleotide comprising a ubiquitous chromatin opening element (UCOE) which is not derived from an LCR. The present invention also relates to a vector comprising the polynucleotide sequence, a host cell comprising the vector, use of the polynucleotide, vector or host cell in therapy and in an assay, and a method of identifying UCOEs.
The current model of chromatin structure in higher eukaryotes postulates that genes are organised in “domains” (Dillon and Grosveld, 1994). Chromatin domains can consist of groups of genes that are expressed in a strictly tissue specific manner such as the human β-globin family (Grosveld et al., 1993), genes that are expressed ubiquitously such as the human TBP/C5 locus (Trachtulec, Z. et al., 1997), or a mixture of tissue specific and ubiquitously expressed genes such as the murine γ/δ TCR/dad-1 locus (Hong et al., 1997; Ortiz et al., 1997) and the human α-globin locus (Vyas et al., 1992). Genes with two different tissue specificities may also be closely linked, for example the human growth hormone and chorionic somatomammotropin genes (Jones et al., 1995). Chromatin domains are envisaged to exist in either a closed, “condensed”, transcriptionally silent state or in a “de-condensed”, open and transcriptionally competent configuration. The establishment of an open chromatin structure, characterised by DNase I sensitivity, DNA hypomethylation and histone hyperacetylation, is seen as a pre-requisite to the commencement of gene expression.
The discovery of tissue-specific transcriptional regulatory elements known as locus control regions (LCRs) has provided novel insights into the mechanisms by which a transcriptionally competent, open chromatin domain is established and maintained in certain cases. LCRs are defined by their ability to confer on a gene linked in cis host cell type-restricted, integration site independent, copy number-dependent expression of the gene (Grosveld et al., 1987; Lang et al., 1988; Greaves et al., 1989; Diaz et al., 1994; Carson and Wiles, 1993; Bonifer et al., 1990; Montoliu et al., 1996; Raguz et al., 1998; EP-A-0 332 667) especially as single copy transgenes (Ellis et al., 1996; Raguz et al., 1998). LCRs are able to obstruct the spread of heterochromatin and prevent position effect variegation (Festenstein et al., 1996; Milot et al., 1996). This pattern of expression conferred by LCRs suggests that these elements possess a powerful chromatin remodelling capability and are able to establish and maintain a transcriptionally competent, open chromatin domain. In addition, LCRs have been found to possess an inherent transcriptional activating capability that allows them to confer tissue-specific gene expression independent of their cognate promoter (Blom van Assendelft et al., 1989; Collis et al., 1990; Antoniou and Grosveld, 1990; Greaves et al., 1989).
All LCRs are associated with gene domains with a prominent tissue-specific or tissue restricted component and are associated with a series of DNase I hypersensitive sites which can be located either 5′ (Grosveld et al., 1987; Carson and Wiles, 1993; Bonifer et al., 1994; Jones et al., 1995; Montoliu et al., 1996) or 3′ (Greaves et al., 1989) of genes which they regulate. In addition, LCR elements have recently been found to exist between closely spaced genes (Hong et al., 1997; Ortiz et al., 1997). An LCR-like element has also been reported to have an intronic location within a gene (Aronow et al., 1995). In the few cases that have been investigated, these elements correspond to large clusters of tissue-specific and ubiquitous transcription factor binding sites (Talbot et al., 1990; Philipsen et al., 1990; Pruzina et al., 1991; Lake et al., 1990; Jarman et al., 1991; Aronow et al., 1995).
The discovery of LCRs suggests that the regulatory elements that control tissue-specific gene expression from a given chromatin domain are organised in a hierarchical fashion. The LCR would appear to act as a master switch wherein its activation results in the establishment of an open chromatin structure that has to precede any gene expression. Transcription at the physiologically required level can then be achieved through a direct chromatin interaction between the LCR and the local promoter and enhancer elements of an individual gene via looping out of the intervening DNA (Hanscombe et al., 1991; Wijgerde et al., 1995; Dillon et al., 1997).
As indicated above, an essential feature of an LCR is its tissue specificity. The tissue specificity of an LCR has been investigated by Ortiz et al., (1997), wherein a number of DNase I hypersensitive sites of the T-cell receptor alpha (TCRα) LCR were deleted and an LCR derived element, which opens chromatin in a number of tissues, was identified. Talbot et al., (1994, NAR, 22, 756-766) describe an LCR-like element that is considered to allow expression of a linked gene in a number of tissues. However, reproducible expression of the linked gene is not obtained. The levels of expression are indicated as having a standard deviation of between 74% from the average value on a per-gene-copy basis where the gene is expressed and the transgene copy number is 3 or more. When the copy number is 1 or 2, the gene expression levels are 10 times lower and have a standard deviation of 49% from the average value on a per-gene-copy basis where the gene is expressed. The element disclosed by Talbot et al. does not give reproducible expression of a linked gene. This and the high variability of the system clearly limit the use of this system.
The long-term correction of genetically inherited disorders by gene therapy requires the maintenance and sustained expression of the transcription unit at sufficiently high levels to be of therapeutic value. This may be achieved by one of two approaches. Firstly, transcription units can be stably integrated into the host cell genome using, for example, retroviral (Miller, 1992; Miller et al., 1993) or adeno-associated viral (AAV) vectors (Muzyczka, 1992; Kotin, 1994; Flotte and Carter, 1995). Alternatively, therapeutic genes can be incorporated within self-replicating episomal vectors comprising viral origins of replication such as those from EBV (Yates et al., 1985), human papovavirus BK (De Benedetti and Rhoads, 1991; Cooper and Miron, 1993) and BPV-1 (Piirsoo et al., 1996).
Unfortunately, the level of expression that is normally seen from genes that are integrated into the genome is too low or short in duration to be of therapeutic value in most cases. This is due to what are generally known as “position effects”. The transcription of the introduced gene is dependent upon its site of integration, where it comes under the influence of either competing activating (promoters/enhancers) or, more frequently, repressing (chromatin silencing) elements. Position effects continue to impose substantial constraints on the therapeutic efficacy of integrating virus-based vectors of retroviral and adeno-associated viral (AAV) origin. Viral transcriptional regulatory elements are notoriously susceptible to silencing by chromatin elements in the vicinity of integration sites. The inclusion of classical promoter and enhancer elements from highly expressed genes as part of the viral constructs has not solved this major problem (Dai et al., 1992; Lee et al., 1993).
The inclusion of a fully functional LCR as part of the transcription unit overcomes this deficiency since this element can be used to drive a predictable, physiological and sustained level of expression of the desired gene in a specific cell type (see Yeoman and Mellor, 1992, Brines and Klaus, 1993; Needham et al. 1992 and 1993; Tewari et al., 1998; Zhumabekov et al., 1995). This degree of predictability of expression is vital for a safe and successful gene therapy strategy.
The use of replicating episomal vectors (REVs) offers an attractive alternative to integrating viral vectors for producing long-term gene expression. Firstly, REVs do not pose the same size limitations on the therapeutic transcription unit as do viral vectors, with inserts in excess of 300 kb being a possibility (Sun et al., 1994). Secondly, being episomal, REVs do not suffer from potential hazards associated with insertional mutagenesis that is an inherent problem with integrating viral vectors. Lastly, REVs are introduced into the target cells using non-viral delivery systems that can be produced more cheaply at scale than with viral vectors.
It has been demonstrated that both non-replicating, transiently transfected plasmids (Reeves et al., 1985; Archer et al., 1992) and REVs (Reeves et al., 1985; Smith et al., 1993) assemble nucleosomes. Assembly on REVs is more organised and resembles native chromatin, whereas nucleosomes on transient plasmids are less well ordered and may allow some access of transcription factors to target sequences, although gene expression can be inhibited (Archer et al., 1992). It has recently been demonstrated that LCRs are able to confer long-term, tissue-specific gene expression from within REVs (International Patent Application WO 98/07876).
The generation of cultured mammalian cell lines producing high levels of a therapeutic protein product is a major developing industry. Chromatin position effects make this a difficult, time consuming and expensive process. The most commonly used approach to the production of such mammalian “cell factories” relies on gene amplification induced by a combination of a drug resistance gene (e.g. DHFR, glutamine synthetase (Kaufman, 1990)) and high toxic drug concentrations which have to be maintained at all times. The use of vectors possessing LCRs from highly expressed gene domains greatly simplifies the generation of these cell lines (Needham et al., 1992; Needham et al., 1995).
A problem with the use of LCRs is that they are tissue specific and reproducible expression is only obtained in the specific cell type. Accordingly, one could not obtain reproducible expression in a tissue type or a number of tissue types for which there is no LCR. Accordingly, there is a need for a UCOE, which is not derived from an LCR.
As indicated above, Ortiz et al., (1997) discloses an LCR derived element, which opens chromatin in a number of tissues. There are a number of problems with the LCR derived element of Ortiz et al., (1997). In particular, the element has to be carefully constructed using recombinant DNA techniques to contain the necessary regions of the LCR, and the element does not give reproducible levels of expression of a linked gene in cells of different tissue types, especially when the element is at single or low (less than 3) transgene copy number.
Elements comprising bi-directional promoters and methylation-free CpG islands have been disclosed; however, there is no disclosure or indication that the elements open chromatin or maintain chromatin in an open state and facilitate reproducible expression of an operably-linked gene in cells of at least two different tissue types.
The human Surfeit locus spans approximately 60 kb and is located on chromosome 9q34.2. The locus comprises bi-directional promoters between the SURF5 and SURF3 genes and between the SURF1 and SURF2 genes (Huxley et al., Mol. Cell. Biol., 10, 605-614, 1990; Duhig et al., Genomics, 52, 72-78, 1998; Williams et al., Mol. Cell. Biol., 6, 4558-4569, 1986). There is no indication that these regions open chromatin or maintain chromatin in an open state and facilitate reproducible expression of an operably-linked gene in cells of at least two different tissue types.
A bi-directional promoter is also disclosed by Brayton et al., (J. Biol. Chem., 269, 5313-5321, 1994) between the avian GPAT and AIRC genes. Again there is no indication that the region opens chromatin or maintains chromatin in an open state and facilitates reproducible expression of an operably-linked gene in cells of at least two different tissue types.
A bi-directional promoter is disclosed by Ryan et al. (Gene, 196, 9-17, 1997) between the mitochondrial chaperonin 60 and chaperonin 10 genes. Again there is no indication that the region opens chromatin or maintains chromatin in an open state and facilitates reproducible expression of an operably-linked gene in cells of at least two different tissue types.
A bi-directional promoter is also disclosed associated with the murine HTF9 gene. Again there is no indication that the region opens chromatin or maintains chromatin in an open state and facilitates reproducible expression of an operably-linked gene in cells of at least two different tissue types.
Palmiter et al., (PNAS USA, 95, 8428-8430, 1998) and International Patent Application WO 94/13273 disclose an element associated with the metallothionein genes. The element comprises DNase I hypersensitive sites which are not associated with promoters. Furthermore, there is no evidence demonstrating that the element opens chromatin or maintains chromatin in an open state and facilitates reproducible expression of an operably-linked gene in cells of at least two different tissue types.
The use of non-replicating, transiently transfected plasmids to achieve gene expression by transfecting cells is well known. It is also known that only short term expression (generally less than 72 hours) is achieved using non-replicating, transiently transfected plasmids. The short term of expression is generally considered to be due to the breakdown of the plasmid or loss of the plasmid from the cell. In view of this drawback the use of such plasmids is limited.
The present invention provides isolated polynucleotides comprising a UCOE which opens chromatin or maintains chromatin in an open state and facilitates reproducible expression of an operably-linked gene in cells of at least two different tissue types, wherein the polynucleotide is not derived from a locus control region. The isolated polynucleotides according to the invention are preferably greater than about 1.5 kb in length, more preferably greater than about 4 kb in length, when composed of endogenous genomic UCOE sequences. Functional composites of UCOE sequences, however, can be constructed from the endogenous genomic UCOE. Such composites can be less than 1.5 kb in length and are within the scope of the present invention.
A “locus control region” (LCR) is defined as a genetic element which is obtained from a tissue-specific locus of a eukaryotic host cell and which, when linked to a gene of interest and integrated into a chromosome of a host cell, confers tissue-specific, integration site-independent, copy number-dependent expression on the gene of interest. A polynucleotide derived from an LCR can be any part or parts of an LCR. Preferably, a polynucleotide derived from an LCR is any part of an LCR that functions to open chromatin. An LCR is associated with one or more DNase I hypersensitive (HS) sites that are not associated with a promoter and it is preferred that the UCOE does not comprise HS sites that are not associated with a promoter. HS sites are well known to those skilled in the art and can be identified based on the standard techniques, which are described herein.
The term “facilitates reproducible expression” refers to the capability of the UCOE to facilitate reproducible activation of transcription of the operably-linked gene. The process is believed to involve the ability of the UCOE to render the region of the chromatin encompassing the gene (or at least the transcription factor binding sites) accessible to transcription factors. Reproducible expression preferably means that the polynucleotide when operably-linked to an expressible gene gives substantially the same level of expression of the operably-linked gene irrespective of its chromatin environment and preferably irrespective of the cell tissue type. Preferably, substantially the same level of expression means a level of expression which has a standard deviation from an average value of less than 48%, more preferably less than 40% and most preferably, less than 25% on a per-gene-copy basis. Alternatively, substantially the same level of expression preferably means that the level of expression varies by less than 10 fold, more preferably less than 5 fold and most preferably less than 3 fold on a per gene copy basis. The level of expression is preferably the level of expression measured in a transgenic animal. It is especially preferred that the UCOE facilitates reproducible expression of an operably-linked gene when present at a single or low (less than 3) copy number.
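The variability criteria above can be illustrated with a short calculation. The following is a minimal sketch only: it assumes that "standard deviation from an average value" is expressed as a percentage of the mean per-copy expression and that the sample standard deviation is used, neither of which is stated in the text. The function names and the example values are hypothetical.

```python
from statistics import mean, stdev


def per_copy_variability(expression, copy_numbers):
    """Return (sd_percent, fold_range) of expression on a per-gene-copy basis.

    expression   -- total expression measured in each independent line
    copy_numbers -- transgene copy number of each line (same order)
    Only expressing lines are considered, so all values must be positive.
    """
    if len(expression) != len(copy_numbers) or len(expression) < 2:
        raise ValueError("need at least two lines with matching copy numbers")
    per_copy = [e / c for e, c in zip(expression, copy_numbers)]
    if min(per_copy) <= 0:
        raise ValueError("expression values must be positive")
    # Assumption: SD is reported relative to the mean, as a percentage.
    sd_percent = 100.0 * stdev(per_copy) / mean(per_copy)
    fold_range = max(per_copy) / min(per_copy)
    return sd_percent, fold_range


def preference_tier(sd_percent):
    """Map an SD percentage onto the thresholds quoted in the text."""
    if sd_percent < 25:
        return "most preferred (< 25%)"
    if sd_percent < 40:
        return "more preferred (< 40%)"
    if sd_percent < 48:
        return "preferred (< 48%)"
    return "not reproducible by this criterion"


# Hypothetical data: three lines carrying 1, 2 and 3 transgene copies.
sd, fold = per_copy_variability([100.0, 210.0, 290.0], [1, 2, 3])
print(preference_tier(sd), round(fold, 2))
```

The fold range returned alongside the percentage corresponds to the alternative criterion (less than 10, 5 or 3 fold variation per gene copy).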
As used herein, “linked” refers to a cis-linkage in which the gene and the UCOE are present in a cis relationship on the same nucleic acid molecule. The term “operatively linked” refers to a cis-linkage in which the gene is subject to expression facilitated by the UCOE.
Open chromatin or chromatin in an open state refers to chromatin in a de-condensed state and is also referred to as euchromatin. Condensed chromatin is also referred to as heterochromatin. As indicated above, chromatin in a closed (condensed) state is transcriptionally silent. Chromatin in an open (de-condensed) state is transcriptionally competent. The establishment of an open chromatin structure is characterised by DNase I sensitivity, DNA hypomethylation and histone hyperacetylation. Standard methods for identifying open chromatin are well known to those skilled in the art and are described in Wu, 1989, Meth. Enzymol., 170, 269-289; Crane-Robinson et al., 1997, Methods, 12, 48-56; Rein et al., 1998, N.A.R., 26, 2255-2264.
The term “Cells of two or more tissue types” refers to cells of at least two, preferably at least 4 and more preferably all of the following different tissue types: heart, kidney, lung, liver, gut, skeletal muscle, gonads, spleen, brain and thymus tissue. Preferably, the polynucleotide facilitates reproducible expression non-tissue specifically, i.e. with no tissue specificity. It is further preferred that the polynucleotide of the present invention facilitates reproducible expression in at least 50% and more preferably in all tissue types where active gene expression occurs.
Preferably, the polynucleotide of the present invention facilitates reproducible expression of an operably-linked gene at a physiological level. By physiological level, it is meant a level of gene expression at which expression in a cell, population of cells or a patient exhibits a physiological effect. Preferably, the physiological level is an optimal physiological level depending on the desired result. Preferably, the physiological level is equivalent to the level of expression of an equivalent endogenous gene.
The UCOE of the present invention can be any element, which opens chromatin or maintains chromatin in an open state and facilitates reproducible expression of an operably-linked gene in cells of at least two different tissue types provided it is not derived from an LCR. In a preferred embodiment, the UCOE comprises an extended methylation-free, CpG-island. CpG-islands have an average GC content of approximately 60%, compared with a 40% average in bulk DNA. One skilled in the art can easily identify CpG-islands using standard techniques such as using restriction enzymes specific for C and G sequences. Such techniques are described in Larsen et al., 1992 and Kolsto et al., 1986. An extended methylation-free CpG island is a methylation-free CpG island that extends across a region encompassing more than one transcriptional start site and/or extends for more than 300 bp and preferably more than 500 bp.
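As a rough computational counterpart to the description above, GC-rich stretches of a sequence can be located in silico. This is a sketch under stated assumptions, not the restriction-enzyme approach referred to in the text: the 60% GC threshold and the 300 bp minimum length are taken from the paragraph above, whereas the 100-nucleotide window, the 10-nucleotide step and the function names are illustrative choices. Sequence composition alone says nothing about methylation status, which has to be determined experimentally.

```python
def gc_fraction(seq):
    """Fraction of G and C residues in a DNA sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_rich_regions(seq, window=100, step=10, threshold=0.60, min_length=300):
    """Return (start, end) pairs, 0-based and end-exclusive, covering runs of
    consecutive sliding windows whose GC fraction is at least `threshold`,
    keeping only runs that span at least `min_length` nucleotides."""
    regions = []
    start = end = None
    for i in range(0, len(seq) - window + 1, step):
        if gc_fraction(seq[i:i + window]) >= threshold:
            if start is None:
                start = i          # first passing window of a new run
            end = i + window       # extend the run to this window's end
        elif start is not None:
            if end - start >= min_length:
                regions.append((start, end))
            start = None           # a failing window closes the run
    if start is not None and end - start >= min_length:
        regions.append((start, end))
    return regions


# Synthetic example: a 400 bp GC-only block flanked by AT-only sequence.
example = "AT" * 200 + "GC" * 200 + "AT" * 200
print(gc_rich_regions(example))
```

Because the windows overlap the flanks, the reported region extends somewhat beyond the GC-only block; a smaller window gives tighter boundaries at the cost of noisier calls.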
Preferably, the UCOE is derived from a sequence that in its natural endogenous position is associated with, more preferably, located adjacent to, a ubiquitously expressed gene. It is further preferred that the UCOE comprises at least one transcription factor binding site. Transcription factor binding sites include promoter sequences and enhancer sequences. Preferably, the UCOE comprises dual or bi-directional promoters that are divergently transcribed. Dual promoters are defined herein as two or more promoters which are independent from each other so that one of the promoters can be activated or deactivated without affecting the other promoter or promoters. A bi-directional promoter is defined herein as a region that can act as a promoter in both directions but cannot be activated or deactivated in one direction only. Preferably, the UCOE comprises dual promoters. Preferably, the UCOE comprises dual or bi-directional promoters that transcribe divergently (i.e. can lead to transcription in opposite directions) and which in their natural endogenous positions are associated with ubiquitously expressed genes. Preferably, the UCOE comprises dual promoters that are transcribed divergently. The UCOE may comprise a heterologous promoter, i.e. a promoter that is not naturally associated with the other sequences of the UCOE. For example, it is possible to use the CMV promoter with the UCOE associated with the hnRNP A2 and the HP1H-γ promoters, which is discussed further below. The present invention therefore also provides a UCOE comprising one or more heterologous promoters. The heterologous promoter or promoters can replace one or more of the endogenous promoters of the UCOE or can be used in addition to the one or more endogenous promoters of the UCOE. The heterologous promoter may be any promoter including tissue specific promoters such as tumour-specific promoters and ubiquitous promoters.
Preferably the heterologous promoter is a substantially ubiquitous promoter and most preferably is the CMV promoter.
Preferably, the UCOE is not the 3725 bp EcoRI fragments comprising the bi-directional promoter of the HpaII tiny fragment (HTF) island HTF9 as described in Lavia et al., EMBO J., 6, 2773-2779, (1987).
Preferably, the UCOE is not the 149 bp MES-1 element located within an 800 bp BamHI genomic fragment located between the murine SURF1 and SURF2 genes of the Surfeit locus (Williams et al., Mol. Cell. Biol, 13, 4784-4792, 1993). Preferably, the UCOE is not the bi-directional promoter located between the SURF5 and the SURF3 genes of the Surfeit locus (Williams et al., Mol. Cell. Biol, 13, 4784-4792, 1993). It is further preferred that the UCOE is not derived from the human surfeit gene locus which spans 60 kb and is located on chromosome 9q34.2 as defined in Duhig et al., Genomics, 52, 72-78, (1998) or the corresponding murine locus (Huxley et al., Mol. Cell. Biol., 10, 605-614, 1990).
Preferably, the UCOE is not the bi-directional promoter region located between the avian GPAT and AIRC genes contained in the 1350 bp SmaI fragment deposited in the GenBank database (accession no. L12533) (Gavalas et al., Mol. Cell. Biol., 13, 4784-4792, 1993) or the corresponding human equivalent (Brayton et al., J. Biol. Chem., 269, 5313-5321, 1994).
Preferably, the UCOE is not the 13894 bp genomic DNA fragment (GenBank accession no. U68562) comprising the rat mitochondrial chaperonin 60 and chaperonin 10 genes. It is also preferred that the UCOE is not the 581 bp fragment containing the bi-directional promoter located in the intergenic region between the rat mitochondrial chaperonin 60 and chaperonin 10 genes (Ryan et al., Gene, 196, 9-17, 1997).
In a preferred embodiment of the present invention, the UCOE is a 44 kb DNA fragment spanning the human TATA binding protein (TBP) gene and 12 kb each of the 5′ and 3′ flanking sequence, or a functional homologue or fragment thereof.
In a further preferred embodiment of the present invention, the UCOE is a 60 kb DNA fragment spanning the human hnRNP A2 gene with 30 kb 5′ flanking sequence and 20 kb 3′ flanking sequence, or a functional homologue or fragment thereof. In a further preferred embodiment, the UCOE comprises the sequence of FIG. 21 between nucleotides 1 to 6264 or a functional homologue or fragment thereof. This sequence encompasses the hnRNP A2 promoter (nucleotides 5636 to 6264) and 5.5 kb 5′ flanking sequence comprising the HP1H-γ promoter.
In a further preferred embodiment of the present invention, the UCOE is a 25 kb DNA fragment spanning the human TBP gene with 1 kb 5′ and 5 kb 3′ flanking sequence, or a functional homologue or fragment thereof.
In a further preferred embodiment, the UCOE is a 16 kb DNA fragment spanning the human hnRNP A2 gene with 5 kb 5′ and 1.5 kb 3′ flanking sequence, or a functional homologue or fragment thereof.
In a further preferred embodiment, the UCOE comprises the sequence of FIG. 21 between nucleotides 1 and 5636 (the 5.5 kb 5′ flanking sequence of the hnRNP A2 promoter) and the CMV promoter or a functional homologue or fragment thereof.
In a further preferred embodiment, the UCOE comprises the sequence of FIG. 21 between nucleotides 4102 and 8286 or a functional homologue or fragment thereof. This sequence encompasses both the hnRNP A2 and HP1H-γ promoters.
In a further preferred embodiment, the UCOE comprises the sequence of FIG. 21 between nucleotides 1 and 7627 or a functional homologue or fragment thereof. This sequence encompasses both the hnRNP A2 and HP1H-γ promoters and exon 1 of the hnRNP A2 gene.
In a further preferred embodiment, the UCOE comprises the sequence of FIG. 21 between nucleotides 1 and 9127 or a functional homologue or fragment thereof. This sequence encompasses both the hnRNP A2 and HP1H-γ promoters and the 3′ flanking sequence of the hnRNP A2 promoter up to but not including exon 2 of the hnRNP A2 gene.
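The FIG. 21 sub-regions named in the embodiments above can be collected into a small lookup table, which makes it easy to check fragment lengths or to excise a fragment from the full sequence. The sketch below assumes that the quoted nucleotide positions are 1-based and inclusive, which is the usual convention for sequence listings but is not stated explicitly here. The dictionary keys and function names are hypothetical, and the FIG. 21 sequence itself is not reproduced, so a placeholder string stands in for it.

```python
# Nucleotide ranges quoted in the text for FIG. 21 (assumed 1-based, inclusive).
FIG21_REGIONS = {
    "nt 1-6264": (1, 6264),            # hnRNP A2 promoter plus 5' flank
    "nt 1-5636": (1, 5636),            # 5' flank used with the CMV promoter
    "nt 4102-8286": (4102, 8286),      # both promoters
    "nt 1-7627": (1, 7627),            # both promoters plus exon 1
    "nt 1-9127": (1, 9127),            # up to, not including, exon 2
    "hnRNP A2 promoter": (5636, 6264),
}


def region_length(start, end):
    """Length in nucleotides of a 1-based inclusive range."""
    return end - start + 1


def extract_region(sequence, start, end):
    """Return the 1-based inclusive range [start, end] of `sequence`."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range lies outside the sequence")
    # Convert to Python's 0-based, end-exclusive slicing.
    return sequence[start - 1:end]


# Placeholder only: the real FIG. 21 sequence would be loaded here.
placeholder = "N" * 9127
for name, (start, end) in FIG21_REGIONS.items():
    fragment = extract_region(placeholder, start, end)
    print(f"{name}: {len(fragment)} nt")
```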
It is further preferred that the UCOE of the present invention has the nucleotide sequence of FIG. 20 or FIG. 21, or a functional fragment or homologue thereof.
The term “functional homologues or fragments” as used herein means homologues or fragments, which open chromatin or maintain chromatin in an open state and facilitate reproducible expression of an operably-linked gene. Preferably, the homologues are species homologues corresponding to the identified UCOEs or are homologues associated with other ubiquitously expressed genes. Sequence comparisons can be made between UCOEs in order to identify conserved sequence motifs enabling the identification or synthesis of other UCOEs. Suitable software packages for performing such sequence comparisons are well known to those skilled in the art. A preferred software package for performing sequence comparisons is PCGENE (Intelligenetics, Inc. USA). Functional fragments can be easily identified by methodically generating fragments of known UCOEs and testing for function. The identification of conserved sequence motifs will also assist in the identification of functional fragments, as fragments comprising the conserved sequence motifs will be likely to be functional. Functional homologues also encompass modified UCOEs wherein elements of the UCOE have been replaced by similar elements, such as replacing one or more promoters of a UCOE with different heterologous promoters. As indicated above, the heterologous promoter may be any promoter including tissue specific promoters such as tumour-specific promoters and ubiquitous promoters. Preferably the heterologous promoter is a strong and/or substantially ubiquitous promoter and most preferably is the CMV promoter.
In another embodiment of the present invention, there is provided a method for identifying a UCOE which facilitates reproducible expression of an operably-linked gene in cells of at least two different tissue types, comprising:
1. testing a candidate UCOE by transfecting cells of at least two different tissue types with a vector containing the candidate UCOE operably-linked to a marker gene; and
2. determining if reproducible expression of the marker gene is obtained in the cells of two or more different tissue types.
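The decision made in step 2 can be sketched as follows. This is an illustrative outline only: it assumes that per-gene-copy marker expression has already been measured in several independently transfected lines for each tissue type, and that reproducibility is judged by a standard deviation of less than 48% of the mean, the least stringent threshold given in the definition of reproducible expression. The function names and the example data are hypothetical.

```python
from statistics import mean, stdev


def reproducible_tissues(measurements, max_sd_percent=48.0):
    """Return the tissue types in which marker expression is reproducible.

    measurements -- dict mapping tissue type to a list of per-gene-copy
                    expression values from independent lines of that tissue
    """
    passing = []
    for tissue, values in measurements.items():
        # At least two expressing lines are needed to estimate variability.
        if len(values) < 2 or min(values) <= 0:
            continue
        sd_percent = 100.0 * stdev(values) / mean(values)
        if sd_percent < max_sd_percent:
            passing.append(tissue)
    return passing


def is_candidate_ucoe(measurements, min_tissues=2, max_sd_percent=48.0):
    """True if reproducible expression is seen in at least `min_tissues`."""
    return len(reproducible_tissues(measurements, max_sd_percent)) >= min_tissues


# Hypothetical per-copy marker expression in three tissue types.
data = {
    "liver": [90.0, 100.0, 110.0],
    "kidney": [95.0, 105.0, 100.0],
    "brain": [10.0, 100.0, 300.0],
}
print(reproducible_tissues(data), is_candidate_ucoe(data))
```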
Preferably, the method for identifying a UCOE of the present invention comprises the additional step of selecting candidate UCOEs that are associated with one or more of: a ubiquitously expressed gene, a dual or bi-directional promoter and an extended methylation-free CpG-island.
Preferably, reproducible expression of the marker gene is determined in cells containing a single copy of the UCOE linked to the marker gene.
The present invention further provides the method of the present invention wherein the candidate UCOE is tested by generating a non-human transgenic animal containing cells comprising a vector containing the candidate UCOE operably-linked to a marker gene and determining if reproducible expression of the marker gene is obtained in the cells of two or more different tissue types. Preferably, the non-human transgenic animal is an F1, or greater, generation non-human transgenic animal. Preferably the non-human transgenic animal is a rodent, more preferably a mouse.
The present invention provides a UCOE derivable from a nucleic acid sequence associated with or adjacent to a ubiquitously expressed gene. Preferably, the nucleic acid sequence comprises an extended methylation-free, CpG-island. It is further preferred that the nucleic acid sequence comprises at least one transcription factor binding site. Preferably, the nucleic acid sequence comprises dual or bi-directional promoters that are divergently transcribed. Preferably, the nucleic acid sequence comprises dual promoters that are divergently transcribed. Preferably, the nucleic acid sequence comprises dual or bi-directional promoters that are divergently transcribed and which are associated with ubiquitously expressed genes. Preferably, the nucleic acid sequence comprises dual promoters that are divergently transcribed and which are associated with ubiquitously expressed genes.
The present invention also provides the use of the polynucleotide of the present invention, or a fragment thereof, in an assay for identifying other UCOEs. Preferably, a fragment of the polynucleotide is used which encompasses a conserved sequence or structural motif. Methods for performing such an assay are well known to those skilled in the art.
The present invention provides a vector comprising the polynucleotide of the present invention. The vector preferably comprises an expressible gene operably-linked to the polynucleotide. The expressible gene comprises the necessary elements enabling gene expression such as suitable promoters, enhancers, splice acceptor sequences, internal ribosome entry site sequences (IRES) and transcription stop sites. Suitable elements for enabling gene expression are well known to those skilled in the art. The suitable elements for enabling gene expression can be the natural endogenous elements associated with the gene or may be heterologous elements used in order to obtain a different level or tissue distribution of gene expression compared to the endogenous gene. Preferably, the vector comprises a promoter operably associated with the expressible gene and the polynucleotide. The promoter may be a natural endogenous promoter of the expressible gene or may be a heterologous promoter. The heterologous promoter may be any promoter including tissue specific promoters such as tumour-specific promoters and ubiquitous promoters. Preferably the heterologous promoter is a strong and/or a substantially ubiquitous promoter and most preferably is the CMV promoter.
The vector may be any vector capable of transferring DNA to a cell. Preferably, the vector is an integrating vector or an episomal vector.
Preferred integrating vectors include recombinant retroviral vectors. A recombinant retroviral vector will include DNA of at least a portion of a retroviral genome which portion is capable of infecting the target cells. The term xe2x80x9cinfectionxe2x80x9d is used to mean the process by which a virus transfers genetic material to its host or target cell. Preferably, the retrovirus used in the construction of a vector of the invention is also rendered replication-defective to remove the effect of viral replication of the target cells. In such cases, the replication-defective viral genome can be packaged by a helper virus in accordance with conventional techniques. Generally, any retrovirus meeting the above criteria of infectiousness and capability of functional gene transfer can be employed in the practice of the invention.
Suitable retroviral vectors include but are not limited to pLJ, pZip, pWe and pEM, well known to those of skill in the art. Suitable packaging virus lines for replication-defective retroviruses include, for example, ΨCrip, ΨCre, Ψ2 and ΨAm.
Other vectors useful in the present invention include adenovirus, adeno-associated virus, SV40 virus, vaccinia virus, HSV and pox virus vectors. A preferred vector is the adenovirus. Adenovirus vectors are well known to those skilled in the art and have been used to deliver genes to numerous cell types, including airway epithelium, skeletal muscle, liver, brain and skin (Hitt, M M, Addison, C L and Graham, F L (1997) Human adenovirus vectors for gene transfer into mammalian cells. Advances in Pharmacology 40: 137-206; and Anderson, W F (1998) Human gene therapy. Nature 392 (6679 Suppl): 25-30).
A further preferred vector is the adeno-associated virus (AAV) vector. AAV vectors are well known to those skilled in the art and have been used to stably transduce human T-lymphocytes, fibroblasts, nasal polyp, skeletal muscle, brain, erythroid and haemopoietic stem cells for gene therapy applications (Philip et al., 1994, Mol. Cell. Biol., 14, 2411-2418; Russell et al., 1994, PNAS USA, 91, 8915-8919; Flotte et al., 1993, PNAS USA, 90, 10613-10617; Walsh et al., 1994, PNAS USA, 89, 7257-7261; Miller et al., 1994, PNAS USA, 91, 10183-10187; Emerson, 1996, Blood, 87, 3082-3088). International Patent Application WO 91/18088 describes specific AAV based vectors.
Preferred episomal vectors include transient non-replicating episomal vectors and self-replicating episomal vectors with functions derived from viral origins of replication such as those from EBV, human papovavirus (BK) and BPV-1. Such integrating and episomal vectors are well known to those skilled in the art and are fully described in the body of literature well known to those skilled in the art. In particular, suitable episomal vectors are described in WO98/07876.
Mammalian artificial chromosomes are also preferred vectors for use in the present invention. The use of mammalian artificial chromosomes is discussed by Calos (1996, TIG, 12, 463-466).
In a preferred embodiment, the vector of the present invention is a plasmid. It is further preferred that the plasmid is a non-replicating, non-integrating plasmid.
The term “plasmid” as used herein refers to any nucleic acid encoding an expressible gene and includes linear or circular nucleic acids and double or single stranded nucleic acids. The nucleic acid can be DNA or RNA and may comprise modified nucleotides or ribonucleotides, and may be chemically modified by such means as methylation or the inclusion of protecting groups or cap- or tail structures.
A non-replicating, non-integrating plasmid is a nucleic acid which when transfected into a host cell does not replicate and does not specifically integrate into the host cell's genome (i.e. does not integrate at high frequencies and does not integrate at specific sites).
Replicating plasmids can be identified using standard assays including the standard replication assay of Ustav et al., EMBO J., 10, 449-457, 1991.
Preferably, a non-replicating, non-integrating plasmid is a plasmid that cannot be stably maintained in cells, independently of genomic DNA replication, and which does not persist in progeny cells for three or more cell divisions without a significant loss in copy number of the plasmid in the cells, i.e., with a loss of greater than an average of about 50% of the plasmid molecules in progeny cells between a given cell division. Generally, in self-replicating vectors, the self-replicating function is provided by using a viral origin of replication and providing one or more viral replication factors that are required for replication mediated by that particular viral origin. Self-replicating vectors are described in WO 98/07876. The term “transiently transfecting, non-integrating plasmid” herein means the same as the term “non-replicating, non-integrating plasmid” as defined above.
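The copy-number criterion above can be illustrated with a short calculation. The following is a minimal sketch only, not an assay protocol: the function names, the input format (mean plasmid copies per cell measured at successive cell divisions) and the reading of the criterion as "a loss of more than about 50% at any one division" are assumptions made for illustration, and the numbers in the usage note are hypothetical.

```python
def per_division_loss(mean_copies):
    """Fractional loss in mean plasmid copies per cell across each division.

    `mean_copies` is a hypothetical series of population-averaged copy
    numbers, one value per successive cell division.
    """
    losses = []
    for before, after in zip(mean_copies, mean_copies[1:]):
        if before <= 0:
            raise ValueError("copy number must be positive before a division")
        losses.append(1.0 - after / before)
    return losses


def shows_significant_loss(mean_copies, threshold=0.5):
    """Return True if any single division loses more than `threshold`.

    The 0.5 default mirrors the "about 50%" figure in the text; applying it
    to any one division is an assumed interpretation.
    """
    return any(loss > threshold for loss in per_division_loss(mean_copies))
```

For example, a hypothetical series of 100, 40, 15 and 5 mean copies per cell over three divisions loses more than half of the plasmid at each division and would be flagged by `shows_significant_loss`, whereas a series of 100, 95, 92 and 90 would not.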
Preferably the plasmid is a naked nucleic acid. As used herein, the term “naked” refers to a nucleic acid molecule that is free of direct physical associations with proteins, lipids, carbohydrates or proteoglycans, whether covalently or through hydrogen bonding. The term does not refer to the presence or absence of modified nucleotides or ribonucleotides, or chemical modification of all or a portion of a nucleic acid molecule by such means as methylation or the inclusion of protecting groups or cap- or tail structures.
Preferably, the vector of the present invention comprises the sequence of FIG. 21 between nucleotides 1 and 7627 (encompassing both the hnRNP A2 and HP1H-γ promoters), the CMV promoter, a multiple cloning site, a polyadenylation sequence and genes encoding selectable markers under suitable control elements. Preferably the vector of the present invention is the CET200 or the CET210 vector schematically shown in FIG. 49.
The present invention also provides a host cell transfected with the vector of the present invention. The host cell may be any cell such as yeast cells, insect cells, bacterial cells and mammalian cells. Preferably the host cell is a mammalian cell and may be derived from mammalian cell lines such as the CHO cell line, the 293 cell line and NS0 cells.
Preferably, the operably-linked gene is a therapeutic nucleic acid sequence. Therapeutically useful nucleic acid sequences, which may be used in the present invention, include sequences encoding receptors, enzymes, ligands, regulatory factors, hormones, antibodies or antibody fragments and structural proteins. Therapeutic nucleic acid sequences also include sequences encoding nuclear proteins, cytoplasmic proteins, mitochondrial proteins, secreted proteins, membrane-associated proteins, serum proteins, viral antigens, bacterial antigens, protozoal antigens and parasitic antigens. Nucleic acid sequences useful according to the invention also include sequences encoding proteins, peptides, lipoproteins, glycoproteins, phosphoproteins and nucleic acid (e.g., RNAs or antisense nucleic acids).
Proteins or polypeptides which can be encoded by the therapeutic nucleic acid sequence include hormones, growth factors, enzymes, clotting factors, apolipoproteins, receptors, erythropoietin, therapeutic antibodies or fragments thereof, drugs, oncogenes, tumor antigens, tumor suppressors, viral antigens, parasitic antigens and bacterial antigens. Specific examples of these compounds include proinsulin, growth hormone, androgen receptors, insulin-like growth factor I, insulin-like growth factor II, insulin-like growth factor binding proteins, epidermal growth factor, transforming growth factor-α, transforming growth factor-β, platelet-derived growth factor, angiogenesis factors (acidic fibroblast growth factor, basic fibroblast growth factor, vascular endothelial growth factor and angiogenin), matrix proteins (Type IV collagen, Type VII collagen, laminin), phenylalanine hydroxylase, tyrosine hydroxylase, oncoproteins (for example, those encoded by ras, fos, myc, erb, src, neu, sis, jun), HPV E6 or E7 oncoproteins, p53 protein, Rb protein, cytokine receptors, IL-1, IL-6, IL-8, and proteins from viral, bacterial and parasitic organisms which can be used to induce an immunological response, and other proteins of useful significance in the body. The choice of gene to be incorporated is limited only by the availability of the nucleic acid sequence encoding it. One skilled in the art will readily recognise that as more proteins and polypeptides become identified they can be integrated into the polynucleotide of the present invention and expressed.
When the polynucleotide of the present invention is comprised in a plasmid, it is preferred that the plasmid be used in monogenic gene therapy such as in the treatment of Duchenne muscular dystrophy and in DNA vaccination and immunisation methods.
The polynucleotide of the invention also may be used to express genes that are already expressed in a host cell (i.e., a native or homologous gene), for example, to increase the dosage of the gene product. It should be noted, however, that expression of a homologous gene might result in deregulated expression, which may not be subject to control by the UCOE due to its over-expression in the cell.
The polynucleotide of the invention may be inserted into the genome of a cell in a position operably associated with an endogenous (native) gene and thereby lead to increased expression of the endogenous gene. Methods for inserting elements into the genome at specific sites are well known to those skilled in the art and are described in U.S. Pat. No. 5,578,461 and U.S. Pat. No. 5,641,670. Alternatively, the polynucleotide of the present invention in its endogenous (native) position on the genome may have a gene inserted in an operably associated position so that expression of the gene occurs. Again, methods for inserting genes into the genome at specific sites are well known to those skilled in the art and are described in U.S. Pat. No. 5,578,461 and U.S. Pat. No. 5,641,670.
The present invention provides the use of the polynucleotide of the present invention to increase the expression of an endogenous gene comprising inserting the polynucleotide into the genome of a cell in a position operably associated with the endogenous gene thereby increasing the level of expression of the gene.
Numerous techniques are known and are useful according to the invention for delivering the vectors described herein to cells, including the use of nucleic acid condensing agents, electroporation, complexation with asbestos, polybrene, DEAE cellulose, Dextran, liposomes, cationic liposomes, lipopolyamines, polyornithine, particle bombardment and direct microinjection (reviewed by Kucherlapati and Skoultchi, Crit. Rev. Biochem. 16:349-379 (1984); Keown et al., Methods Enzymol. 185:527 (1990)).
A vector of the invention may be delivered to a host cell non-specifically or specifically (i.e., to a designated subset of host cells) via a viral or non-viral means of delivery. Preferred delivery methods of viral origin include viral particle-producing packaging cell lines as transfection recipients for the vector of the present invention into which viral packaging signals have been engineered, such as those of adenovirus, herpes viruses and papovaviruses. Preferred non-viral based gene delivery means and methods may also be used in the invention and include direct naked nucleic acid injection, nucleic acid condensing peptides and non-peptides, cationic liposomes and encapsulation in liposomes.
The direct delivery of vector into tissue has been described and some short term gene expression has been achieved. Direct delivery of vector into muscle (Wolff et al., Science, 247, 1465-1468, 1990), thyroid (Sykes et al., Human Gene Ther., 5, 837-844, 1994), melanoma (Vile et al., Cancer Res., 53, 962-967, 1993), skin (Hengge et al., Nature Genet, 10, 161-166, 1995), liver (Hickman et al., Human Gene Therapy, 5, 1477-1483, 1994) and after exposure of airway epithelium (Meyer et al., Gene Therapy, 2, 450-460, 1995) is clearly described in the prior art.
Various peptides derived from the amino acid sequences of viral envelope proteins have been used in gene transfer when co-administered with polylysine DNA complexes (Plank et al., J. Biol. Chem. 269:12918-12924 (1994)). Trubetskoy et al., Bioconjugate Chem. 3:323-327 (1992); WO 91/17773; WO 92/19287; and Mack et al., Am. J. Med. Sci. 307:138-143 (1994) suggest that co-condensation of polylysine conjugates with cationic lipids can lead to improvement in gene transfer efficiency. International Patent Application WO 95/02698 discloses the use of viral components to attempt to increase the efficiency of cationic lipid gene transfer.
Nucleic acid condensing agents useful in the invention include spermine, spermine derivatives, histones, cationic peptides, cationic non-peptides such as polyethyleneimine (PEI) and polylysine. The term “spermine derivatives” refers to analogues and derivatives of spermine and includes compounds as set forth in International Patent Application WO 93/18759 (published Sep. 30, 1993).
Disulphide bonds have been used to link the peptidic components of a delivery vehicle (Cotten et al., Meth. Enzymol. 217:618-644 (1992)); see also, Trubetskoy et al. (supra).
Delivery vehicles for delivery of DNA constructs to cells are known in the art and include DNA/poly-cation complexes which are specific for a cell surface receptor, as described in, for example, Wu and Wu, J. Biol. Chem. 263:14621 (1988); Wilson et al., J. Biol. Chem. 267:963-967 (1992); and U.S. Pat. No. 5,166,320.
Delivery of a vector according to the invention is contemplated using nucleic acid condensing peptides. Nucleic acid condensing peptides, which are particularly useful for condensing the vector and delivering the vector to a cell, are described in WO 96/41606. Functional groups may be bound to peptides useful for delivery of a vector according to the invention, as described in WO 96/41606. These functional groups may include a ligand that targets a specific cell-type such as a monoclonal antibody, insulin, transferrin, asialoglycoprotein, or a sugar. The ligand thus may target cells in a non-specific manner or in a specific manner that is restricted with respect to cell type.
The functional groups also may comprise a lipid, such as palmitoyl, oleyl, or stearoyl; a neutral hydrophilic polymer such as polyethylene glycol (PEG), or polyvinylpyrrolidone (PVP); a fusogenic peptide such as the HA peptide of influenza virus; or a recombinase or an integrase. The functional group also may comprise an intracellular trafficking protein such as a nuclear localisation sequence (NLS), an endosome escape signal or a signal directing a protein directly to the cytoplasm.
The present invention also provides the polynucleotide, vector or host cell of the present invention for use in therapy.
Preferably, the polynucleotide, vector or host cell is used in gene therapy.
The present invention also provides the use of the polynucleotide, vector or host cell of the present invention in the manufacture of a composition for use in gene therapy.
The present invention also provides a method of treatment, comprising administering to a patient in need of such treatment an effective dose of the polynucleotide, vector or host cell of the present invention. Preferably, the patient is suffering from a disease treatable by gene therapy.
The present invention also provides a pharmaceutical composition comprising the polynucleotide, vector or host cell of the present invention in combination with a pharmaceutically acceptable excipient.
The present invention also provides use of a polynucleotide, vector or host cell of the present invention in a cell culture system in order to obtain the desired gene product. Suitable cell culture systems are well known to those skilled in the art and are fully described in the body of literature known to those skilled in the art.
The present invention also provides the use of the polynucleotide of the present invention in producing transgenic plant genetics. The generation of transgenic plants which have increased yield, resistance, etc. is well known to those skilled in the art. The present invention also provides a transgenic plant containing cells which contain the polynucleotide of the present invention.
The present invention also provides a transgenic non-human animal containing cells which contain the polynucleotide of the present invention.
The pharmaceutical compositions of the present invention may comprise the polynucleotide, vector or host cell of the present invention, if desired, in admixture with a pharmaceutically acceptable carrier or diluent, for therapy to treat a disease or provide the cells of a particular tissue with an advantageous protein or function.
The polynucleotide, vector or host cell of the invention or the pharmaceutical composition may be administered via a route which includes systemic intramuscular, intravenous, aerosol, oral (solid or liquid form), topical, ocular, as a suppository, intraperitoneal and/or intrathecal and local direct injection.
The exact dosage regime will, of course, need to be determined by individual clinicians for individual patients and this, in turn, will be controlled by the exact nature of the protein expressed by the gene of interest and the type of tissue that is being targeted for treatment.
The dosage also will depend upon the disease indication and the route of administration. Advantageously, the duration of treatment will generally be continuous or until the cells die. The number of doses will depend upon the disease, and efficacy data from clinical trials.
The amount of polynucleotide or vector DNA delivered for effective gene therapy according to the invention will preferably be in the range of between about 50 ng-1000 μg of vector DNA/kg body weight; and more preferably in the range of between about 1-100 μg vector DNA/kg.
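The per-kilogram ranges above scale linearly with body weight, which can be illustrated by the following arithmetic sketch. It is offered only to show the unit conversion (50 ng = 0.05 μg) and the scaling; the constant and function names are hypothetical, and the sketch is not a dosing method.

```python
# Ranges taken from the text, expressed in micrograms of vector DNA per kg.
BROAD_RANGE_UG_PER_KG = (0.05, 1000.0)     # about 50 ng to 1000 ug per kg
PREFERRED_RANGE_UG_PER_KG = (1.0, 100.0)   # about 1 to 100 ug per kg


def total_dna_range_ug(body_weight_kg, range_ug_per_kg):
    """Scale a per-kg range to a total amount (in ug) for one body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_ug_per_kg
    return (low * body_weight_kg, high * body_weight_kg)


def within_range(dose_ug_per_kg, range_ug_per_kg):
    """Check whether a per-kg amount falls inside a range (inclusive)."""
    low, high = range_ug_per_kg
    return low <= dose_ug_per_kg <= high
```

For a hypothetical body weight of 70 kg, the broader range corresponds to roughly 3.5 μg to 70,000 μg (70 mg) of vector DNA in total, and the more preferred range to roughly 70 μg to 7,000 μg (7 mg).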
Although it is preferred according to the invention to administer the polynucleotide, vector or host cell, to a mammal for in vivo cell uptake, an ex vivo approach may be utilised whereby cells are removed from an animal, transduced with the polynucleotide or vector, and then re-implanted into the animal. The liver, for example, can be accessed by an ex vivo approach by removing hepatocytes from an animal, transducing the hepatocytes in vitro and re-implanting the transduced hepatocytes into the animal (e.g., as described for rabbits by Chowdhury et al., Science 254:1802-1805, 1991, or in humans by Wilson, Hum. Gene Ther. 3:179-222, 1992). Such methods also may be effective for delivery to various populations of cells in the circulatory or lymphatic systems, such as erythrocytes, T cells, B cells and haematopoietic stem cells.
In another embodiment of the invention, there is provided a mammalian model for determining the tissue-specificity and/or efficacy of gene therapy using the polynucleotide, vector or host cell of the invention. The mammalian model comprises a transgenic animal whose cells contain the vector of the present invention. Methods of making transgenic mice (Gordon et al., Proc. Natl. Acad. Sci. USA 77:7380 (1980); Harbers et al., Nature 293:540 (1981); Wagner et al., Proc. Natl. Acad. Sci. USA 78:5016 (1981); and Wagner et al., Proc. Natl. Acad. Sci. USA 78:6376 (1981)), sheep, pigs, chickens (see Hammer et al., Nature 315:680 (1985)), etc., are well-known in the art and are contemplated for use according to the invention. Such animals permit testing prior to clinical trials in humans.
Transgenic animals containing the polynucleotide of the invention also may be used for long-term production of a protein of interest.
The present invention also relates to the use of the polynucleotide of the present invention in functional genomics applications. Functional genomics relates principally to the sequencing of genes specifically expressed in particular cell types or disease states and now provides thousands of novel gene sequences of potential interest for drug discovery or gene therapy purposes. The major problem in using this information for the development of novel therapies lies in how to determine the functions of these genes. UCOEs can be used in a number of functional genomic applications in order to determine the function of gene sequences. The functional genomic applications of the present invention include, but are not limited to:
(1) Using the polynucleotide of the present invention to achieve sustained expression of anti-sense versions of the gene sequences or ribozyme knockdown libraries, thereby determining the effects of inactivating the gene on cell phenotype.
(2) Using the polynucleotide of the present invention to prepare expression libraries for the gene sequences, such that delivery into cells will result in reliable, reproducible, sustained expression of the gene sequences. The resulting cells, expressing the gene sequences can be used in a variety of approaches to function determination and drug discovery. For example, raising antibodies to the gene product for neutralisation of its activity; rapid purification of the protein product of the gene itself for use in structural, functional or drug screening studies; or in cell-based drug screening.
(3) Using the polynucleotide of the present invention in approaches involving mouse embryonic stem (ES) cells and transgenic mice. One of the most powerful functional genomics approaches involves random insertion into genes in mouse ES cells of constructs which only allow drug selection following insertion into expressed genes, and which can readily be rescued for sequencing (G. Hicks et al., 1997, Nature Genetics, 16, 338-344). Transgenic mice with knockout mutations in genes with novel sequences can then readily be made to probe their function. At present this technology works well for the 10% of mouse genes which are well expressed in mouse ES cells. Incorporation of UCOEs into the integrating constructs will enable this technique to be extended to identify all genes expressed in mice.
The following examples, with reference to the figures, are offered by way of illustration and are not intended to limit the invention in any manner. The preparation, testing and analysis of several representative polynucleotides of the invention are described in detail below. One of skill in the art may adapt these procedures for preparation and testing of other polynucleotides of the invention.