Activation of transcription of a eukaryotic gene involves the interaction of a variety of proteins to form a complex that is recruited to the gene through protein:DNA interactions. Key protein domains on one or more of the components include transcription activation domains and DNA binding domains. Elucidating the mechanism of transcription, identifying and characterizing components of the transcriptional machinery and, in some cases, harnessing some of those components have been the subject of extensive research. (See, e.g., Brent and Ptashne, 1985; Hope and Struhl, 1986; Keegan et al, 1986; Fields and Song, 1989; Spencer et al, 1993; Belshaw et al, 1996; and Rivera et al, 1996.) (A Bibliography is provided just prior to the Examples, below.)
Transcription activation domains are thought to function by recruiting a number of proteins with specific functions to the promoter (Lin and Green, 1991; Goodrich et al, 1993; Orphanides et al, 1996 and references cited therein; Ptashne and Gann, 1997 and references cited therein). Among the large number of activation domains that have been characterized to date, the acidic activation domain of the herpes simplex virus-encoded protein, VP16, is considered to be a very strong inducer of transcription and is widely used in biological research (Sadowski et al, 1988; Ptashne and Gann, 1997). The transcription activation domain of the p65 subunit of the human transcription factor NF-kB is also a very potent stimulator of gene expression, and in certain contexts can induce transcription more strongly than VP16 (Schmitz and Baeuerle, 1991; Ballard et al, 1992; Moore et al, 1993; Blair et al, 1994; Natesan et al, 1997). Both the VP16 and p65 activation domains are thought to function by interacting with and recruiting a number of proteins to the promoter (Cress and Triezenberg, 1990; Schmitz et al, 1994; Uesugi et al, 1997).
One of the remarkable features of such activation domains is that "fusing" them to heterologous protein domains seldom affects their ability to activate transcription when recruited to a wide variety of promoters. The high degree of functional independence exhibited by these activation domains makes them valuable tools in various biological assays for analyzing gene expression and protein-protein, protein-RNA or protein-small molecule drug interactions (Fields and Song, 1989; Senguptha et al, 1996; Rivera et al, 1996; Triezenberg, 1995 and references cited therein). The ability to activate gene expression strongly when recruited to a wide range of promoters makes both p65 and VP16 attractive candidates for activation of gene transcription in gene therapy and other applications. However, even more potent activation domains, if available, would be useful for achieving higher levels of transcription on a per cell basis, and for improving the efficiency of the many biological assays that rely upon activation of transcription of a reporter gene.
Several strategies to improve the potency of activation domains and thereby the expression of genes under their control have been reported (Emami and Carey, 1992; Gerber et al, 1994; Ohashi et al, 1994; Blair et al, 1996; Tanaka et al, 1996). These approaches generally involve increasing the number of copies of activation domains fused to the DNA binding domain or generating activators containing synergizing combinations of activation domains. Although some activators generated by these methods have been shown to be more potent, a number of limitations preclude their widespread application. First, potent activators comprising reiterated activation domains do not increase the absolute levels of reporter gene expression when tested on promoters with multiple binding sites for the activator (Emami and Carey, 1992). Second, a number of synergistic combinations of activation domains reported in the literature involve weak activation domains, and the absolute levels of gene expression induced by these synergizing activation domains are much lower than those induced by potent acidic activation domains from VP16 or p65 (Gerber et al, 1994; Tanaka et al, 1996). Third, it is not known whether any of these potent activation domains are capable of inducing gene transcription strongly when they are non-covalently linked to the DNA binding domain. Fourth, many potent activators containing multiple copies of VP16 or other acidic activators are highly toxic and/or accumulate to only low levels in the cell.
As mentioned at the outset, a variety of important applications involving gene transcription require or would benefit from higher levels of gene expression. As noted above, however, efforts to improve the potency of activation domains have been disappointing. Moreover, expression of various transcription activators revealed that observed levels of more potent activators, such as the p65 subunit of NF-kB, are lower than expected. Without wishing to be bound by any one theory, we suggest that the more potent the activation domain, the more toxic it is to the cell, the more disfavored is its expression and/or the less of it is observed to accumulate in cells. How, then, is it possible to increase levels of heterologous gene expression? Remarkably, we have found that it is still possible to outmaneuver these facts of nature to improve heterologous gene expression and have in fact done so using the principles of "bundling", the engineering of the transcription activation domain, and combinations thereof, as described below.
This document discloses new improvements in the design and delivery of transcription activation domains and provides improved materials and methods for regulating the transcription of a target gene. Aspects of the invention are applicable to systems involving either covalent or non-covalent linking of the transcription activation domain to a DNA binding domain.
Key features of the invention include "bundling" domains, fusion proteins containing them, recombinant nucleic acids encoding such fusion proteins, systems involving bundles of such fusion proteins, and other materials and methods involving such bundling domains. Key fusion proteins of the invention contain at least two mutually heterologous domains, one of which is a bundling domain. An important design concept is that the fusion proteins do not need to act alone. Instead, they find and bind to each other (or to other proteins containing the bundling domain) to form a posse to accomplish their mission. In practice, cells are engineered by the introduction of recombinant nucleic acids encoding the fusion proteins, and in some cases with additional nucleic acid constructs, to render them capable of ligand-dependent regulation of transcription of a target gene. Administration of the ligand to the cells then regulates (positively, or in some cases, negatively) target gene transcription.
Detailed information concerning bundling domains, guidance on their use and illustrative examples are provided below. Generally speaking, bundling domains include any domain that induces proteins that contain it to form multimers ("bundles") through protein-protein interactions with each other or with other proteins containing the bundling domain. Examples of bundling domains that can be used in the practice of this invention include domains such as the lac repressor tetramerization domain, the p53 tetramerization domain, a leucine zipper domain, and domains derived therefrom which retain observable bundling activity. Proteins containing a bundling domain are capable of complexing with one another to form a bundle of the individual protein molecules. Such bundling is "constitutive" in the sense that it does not require the presence of a cross-linking agent (i.e., a cross-linking agent which doesn't itself contain a proteinaceous bundling domain) to link the protein molecules.
Illustrative (non-limiting) examples of heterologous domains which can be included along with a bundling domain in various fusion proteins of this invention include transcription regulatory domains (i.e., transcription activation domains such as a p65, VP16 or AP domain; transcription potentiating or synergizing domains; or transcription repression domains such as an ssn-6/TUP-1 domain or Krüppel family suppressor domain); a DNA binding domain such as a GAL4 or lexA domain or a composite DNA binding domain such as a composite zinc finger domain or a ZFHD1 domain; or a ligand-binding domain comprising or derived from (a) an immunophilin, cyclophilin or FRB domain; (b) an antibiotic binding domain such as tetR; or (c) a hormone receptor such as a progesterone receptor or ecdysone receptor.
A wide variety of ligand binding domains may be used in this invention, although ligand binding domains which bind to a cell permeant ligand are preferred. It is also preferred that the ligand have a molecular weight under about 5 kD, more preferably below 2.5 kD and optimally below about 1500 D. Non-proteinaceous ligands are also preferred. Ligand binding domains include, for example, domains selected or derived from (a) an immunophilin (e.g. FKBP12), cyclophilin or FRAP domain; (b) a hormone receptor such as a receptor for progesterone, ecdysone or another steroid; and (c) an antibiotic receptor such as a tetR domain for binding to tetracycline, doxycycline or other analogs or mimics thereof.
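By way of illustration only, the ligand size preferences stated above can be expressed as a simple classification. The following Python sketch is not part of the disclosed methods: the function name `ligand_size_tier` and the tier labels are hypothetical, and the thresholds are treated as exact cut-offs even though the text qualifies them with "about".

```python
def ligand_size_tier(mw_daltons: float) -> str:
    """Classify a ligand by molecular weight (in daltons).

    Thresholds follow the stated preferences: under about 5 kD is preferred,
    below 2.5 kD is more preferred, below about 1500 D is optimal.
    The returned labels are hypothetical names chosen for this sketch.
    """
    if mw_daltons <= 0:
        raise ValueError("molecular weight must be positive")
    if mw_daltons < 1500:
        return "optimal"
    if mw_daltons < 2500:
        return "more preferred"
    if mw_daltons < 5000:
        return "preferred"
    return "outside stated preference"


if __name__ == "__main__":
    # Arbitrary example weights, not measurements of any particular ligand.
    for mw in (900.0, 2000.0, 4000.0, 12000.0):
        print(f"{mw:>8.1f} D -> {ligand_size_tier(mw)}")
```

Molecular weight is only one of the stated preferences; cell permeance and non-proteinaceous character would need to be assessed separately.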
Examples of ligand binding domain/ligand pairs that may be used in the practice of this invention include, but are not limited to: FKBP:FK1012, FKBP:synthetic divalent FKBP ligands (see WO 96/0609 and WO 97/31898), FRB:rapamycin/FKBP (see e.g., WO 96/41865 and Rivera et al, "A humanized system for pharmacologic control of gene expression", Nature Medicine 2(9):1028-1032 (1996)), cyclophilin:cyclosporin (see e.g. WO 94/18317), DHFR:methotrexate (see e.g. Licitra et al, 1996, Proc. Natl. Acad. Sci. USA 93:12817-12821), TetR:tetracycline or doxycycline or other analogs or mimics thereof (Gossen and Bujard, 1992, Proc. Natl. Acad. Sci. USA 89:5547; Gossen et al, 1995, Science 268:1766-1769; Kistner et al, 1996, Proc. Natl. Acad. Sci. USA 93:10933-10938), a progesterone receptor:RU486 (Wang et al, 1994, Proc. Natl. Acad. Sci. USA 91:8180-8184), ecdysone receptor:ecdysone or muristerone A or other analogs or mimics thereof (No et al, 1996, Proc. Natl. Acad. Sci. USA 93:3346-3351) and DNA gyrase:coumermycin (see e.g. Farrar et al, 1996, Nature 383:178-181).
A wide variety of DNA binding domains may be used in the practice of this invention, including a domain selected or derived from a GAL4, lexA or composite (e.g. ZFHD1) DNA binding domain, or a DNA binding domain used, e.g., in combination with ligand binding domains such as a wild-type or mutated progesterone receptor domain. TetR domains, which provide both DNA binding and ligand binding functions, are discussed in the context of ligand binding domains. In many applications it is preferable to use a DNA binding domain which is heterologous to the cells to be engineered. Heterologous DNA binding domains include those which occur naturally in cell types other than the cells to be engineered as well as composite DNA binding domains containing component portions which are not found in the same continuous polypeptide or gene in nature, at least not in the same order or orientation or with the same spacing present in the composite domain. In the case of composite DNA binding domains, component peptide portions which are endogenous to the cells or organism to be engineered are generally preferred.
In the case of the chimeric transcription factors containing a tetR domain, the DNA binding domain is provided by the tetR component, and is by its nature heterologous to eukaryotic cells. TetR domains are discussed in further detail in the context of ligand binding domains.
In embodiments in which an endogenous gene is to be regulatably expressed, a composite DNA binding domain which is selected for recognition of one or more sequences upstream of the target gene may be deployed.
Additional information concerning DNA binding domains is provided below.
In an important application of this invention, two or more of the fusion proteins in the bundle each comprise, in addition to the bundling domain, at least one transcription activation domain which is heterologous to the bundling domain. Bundling of proteins containing transcription activation domains can significantly increase their effective potency (relative to a single such fusion protein lacking a bundling domain) and consequently leads to strong induction of gene expression. Unlike their counterparts lacking a bundling domain, fusion proteins containing a bundling domain are designed to achieve effective local concentrations of transcription activation domains and to robustly induce gene expression when recruited en masse to an expression control sequence, even despite relatively low overall levels of expression or accumulation of the fusion proteins. Highly potent bundled activation domains can also be used in a wide variety of assays having transcriptional readouts. Such assays include assays for identifying protein-protein interactions (or inhibitors thereof) in a eukaryotic, preferably mammalian, two-hybrid assay or variant thereof, e.g., three-hybrid assay, reverse two-hybrid assay, etc.
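The notion of recruiting activation domains "en masse" can be made concrete with a simple count. The following Python sketch is illustrative only and rests on stated assumptions: that a bundle is fully assembled, that every fusion protein in the bundle carries the same number of activation domains, and that the function name `activation_domains_per_bundle` is hypothetical. It counts domains and makes no claim that transcriptional potency scales linearly with that count.

```python
def activation_domains_per_bundle(bundle_size: int, domains_per_fusion: int = 1) -> int:
    """Count activation domains brought along when one bundle is recruited.

    bundle_size: number of fusion proteins in the bundle (e.g. 4 for a
        tetramerization domain, 2 for a dimerizing domain).
    domains_per_fusion: activation domains carried by each fusion protein.
    """
    if bundle_size < 1:
        raise ValueError("a bundle contains at least one protein")
    if domains_per_fusion < 0:
        raise ValueError("domains_per_fusion cannot be negative")
    return bundle_size * domains_per_fusion


if __name__ == "__main__":
    # An unbundled fusion protein versus a tetrameric bundle of the same protein.
    print("unbundled:", activation_domains_per_bundle(1))
    print("tetramer: ", activation_domains_per_bundle(4))
```

Under these assumptions a tetrameric bundle delivers four activation domains per recruitment event where an unbundled fusion protein delivers one, which is the sense in which local concentration can be high even when overall accumulation is low.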
Bundling domains may be introduced into the design of fusion proteins of a variety of regulated gene expression systems, including both allostery-based systems such as those regulated by tetracycline, RU486 or ecdysone, or analogs or mimics thereof, and dimerization-based systems such as those regulated by divalent compounds like FK1012, FKCsA, rapamycin, AP1510 or coumermycin, or analogs or mimics thereof, all as described below (See also, Clackson, 1997, Controlling mammalian gene expression with small molecules, Current Opinion in Chem. Biol. 1:210-218). The fusion proteins may comprise any combination of relevant components, including bundling domains, DNA binding domains, transcription activation (or repression) domains and ligand binding domains. Other heterologous domains may also be included.
Various embodiments of this invention involve fusion proteins which contain at least one bundling domain, DNA binding domain and transcription activation domain; at least one bundling domain, ligand binding domain and transcription repression domain; at least one bundling domain, ligand binding domain and DNA binding domain; at least one bundling domain, ligand binding domain, DNA binding domain and transcription activation domain; and, preferably, at least one bundling domain, ligand binding domain and transcription activation domain. In currently preferred embodiments, these fusion proteins represent improvements on the type described in WO94/18317 and WO96/41865, wherein the ligand binding domain is or is derived from a cyclophilin, immunophilin (e.g. an FKBP domain) or FRB domain, although any ligand binding domain may be used in the chimeric proteins, and the regulatory mechanism can be dimerization- or allostery-based.
A preferred fusion protein contains a lac repressor tetramerization domain, an FRB domain and a transcription activation domain derived from the activation domain of human p65. It should be appreciated that in any of the embodiments of this invention involving a fusion protein containing at least one transcription activation domain derived from p65, whether with or without a bundling domain, the p65 peptide sequence may be a naturally occurring p65 sequence or may be engineered as described below.
Another aspect of this invention involves improvements in the transcription activation domain itself. In this regard, recombinant nucleic acids are provided which encode fusion proteins containing a transcription activation domain and at least one additional domain that is heterologous thereto, where the peptide sequence of the activation domain is itself modified relative to the naturally occurring sequence from which it was derived to increase or decrease its potency as a transcriptional activator relative to the counterpart comprising the native peptide sequence. Certain embodiments of this invention involve fusion proteins containing a transcription activation domain derived from p65 and bearing one or more of the mutations shown in FIG. 7. Fusion proteins containing one or more modified activation domains can also contain a bundling domain to further increase their efficacy as transcriptional activators, and/or one or more additional domains such as a ligand binding domain, DNA binding domain or transcription activation synergizing domain, such as are noted above and as discussed below.
The invention thus provides recombinant nucleic acid constructs which encode the various proteins of this invention or are otherwise useful for practicing it, various DNA vectors containing those constructs for use in transducing prokaryotic and eukaryotic cells, cells transduced with the recombinant nucleic acids, fusion proteins encoded by the above recombinant nucleic acids, and target gene constructs.
Also provided are nucleic acid compositions comprising two or more recombinant nucleic acids which, when present within a cell, permit transcription of a target gene, preferably following exposure to a cell permeant ligand. These compositions are illustrated as follows:
Composition #1. A first such composition comprises a recombinant nucleic acid encoding a fusion protein comprising at least one ligand binding domain, bundling domain and transcription activation domain; a second recombinant nucleic acid encoding a fusion protein comprising a DNA binding domain and at least one ligand binding domain; and an optional third recombinant nucleic acid comprising a target gene (or cloning site) operatively linked to an expression control sequence including a DNA sequence recognized by the DNA binding domain mentioned above. Such compositions are illustrated by embodiments in which the ligand binding domains are or are derived from immunophilin, cyclophilin or FRB domains; the transcription activation domain is or is derived from an activation domain such as a VP16 or p65 domain; and the bundling domain is or is derived from a lac repressor tetramerization domain.
Composition #2. Another such composition is similar to Composition #1 except that the fusion protein encoded by the first recombinant nucleic acid comprises at least one ligand binding domain, bundling domain and DNA binding domain, and the fusion protein encoded by the second recombinant nucleic acid comprises a transcription activation domain and at least one ligand binding domain.
Composition #3. Another such composition comprises a recombinant nucleic acid encoding a fusion protein comprising at least one ligand binding domain, bundling domain and transcription activation domain; a second recombinant nucleic acid encoding a protein comprising a DNA binding domain; and an optional third recombinant nucleic acid comprising a target gene (or cloning site) operatively linked to an expression control sequence including a DNA sequence recognized by the DNA binding domain mentioned above. Such compositions are illustrated by embodiments in which the ligand binding domains are or are derived from a receptor domain such as an ecdysone receptor; the DNA binding domain is or is derived from a DNA binding domain such as an RXR protein, chosen for its ability to bind to the receptor domain in the presence of a ligand for that receptor; the transcription activation domain is or is derived from an activation domain such as a VP16 or p65 domain; and the bundling domain is or is derived from a lac repressor tetramerization domain.
Composition #4. Another such composition comprises a recombinant nucleic acid encoding a fusion protein comprising at least one ligand binding domain, DNA binding domain, bundling domain and transcription activation domain (where the ligand binding domain and DNA binding domain may be part of or derived from the same domain); and an optional second recombinant nucleic acid comprising a target gene (or cloning site) operatively linked to an expression control sequence including a DNA sequence recognized by the DNA binding domain mentioned above. Such compositions are illustrated by embodiments in which the ligand binding and DNA binding domains are or are derived from a receptor domain such as a tetracycline receptor which is capable of binding to a characteristic DNA sequence in the presence of tetracycline or another ligand for the receptor; the transcription activation domain is or is derived from an activation domain such as a VP16 or p65 domain; and the bundling domain is or is derived from a lac repressor tetramerization domain. Such compositions are further illustrated by embodiments in which the ligand binding domain is or is derived from a receptor domain such as a progesterone receptor which is capable of binding to progesterone or analogs or mimics thereof, including RU486; the DNA binding domain is or is derived from a GAL4 or composite DNA binding domain; the transcription activation domain is or is derived from an activation domain such as a VP16 or p65 domain; and the bundling domain is or is derived from a lac repressor tetramerization domain.
Composition #5. Another such composition, which unlike Compositions 1-4 is designed for constitutive expression rather than for ligand-mediated regulation of transcription, comprises a recombinant nucleic acid encoding a fusion protein comprising at least one DNA binding domain, bundling domain and transcription activation domain; and a second recombinant nucleic acid comprising a target gene (or cloning site) operatively linked to an expression control sequence including a DNA sequence recognized by the DNA binding domain mentioned above. Such compositions are illustrated by embodiments in which the transcription activation domain is or is derived from an activation domain such as a VP16 or p65 domain; the DNA binding domain is or is derived from a GAL4 or composite DNA binding domain; and the bundling domain is or is derived from a lac repressor tetramerization domain.
Compositions 1, 3, 4 and 5 may further comprise an additional recombinant nucleic acid encoding a fusion protein comprising a bundling domain and at least one transcription activation domain or transcription synergizing domain, with or without one or more optional additional domains.
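For orientation, the domain content of Compositions 1-5 as described above can be tabulated in a small data structure. The following Python sketch is an illustrative summary only, not part of the disclosed materials; the class name `Composition`, its field names and the domain labels are hypothetical, and each composition is reduced to the minimal set of domains recited for its encoded proteins.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# Hypothetical labels for the four domain types recited in the compositions.
LBD, DBD, AD, BD = "ligand_binding", "dna_binding", "activation", "bundling"


@dataclass(frozen=True)
class Composition:
    number: int
    proteins: Tuple[FrozenSet[str], ...]  # domains of each encoded protein
    target_gene_optional: bool            # is the target gene construct optional?

    def domains(self) -> FrozenSet[str]:
        """All domain types present across the encoded proteins."""
        return frozenset().union(*self.proteins)

    def ligand_regulated(self) -> bool:
        """True if any encoded protein carries a ligand binding domain."""
        return LBD in self.domains()

    def bundled_activator(self) -> bool:
        """True if one protein carries both a bundling and an activation domain."""
        return any({BD, AD} <= p for p in self.proteins)


COMPOSITIONS = (
    Composition(1, (frozenset({LBD, BD, AD}), frozenset({DBD, LBD})), True),
    Composition(2, (frozenset({LBD, BD, DBD}), frozenset({AD, LBD})), True),
    Composition(3, (frozenset({LBD, BD, AD}), frozenset({DBD})), True),
    Composition(4, (frozenset({LBD, DBD, BD, AD}),), True),
    Composition(5, (frozenset({DBD, BD, AD}),), False),
)

if __name__ == "__main__":
    for c in COMPOSITIONS:
        print(c.number, sorted(c.domains()),
              "ligand-regulated" if c.ligand_regulated() else "constitutive",
              "bundled activator" if c.bundled_activator() else "bundled DNA binder")
```

In this tabulation, Compositions 1-4 contain a ligand binding domain and Composition 5 does not, consistent with its description as constitutive. The compositions in which the bundling domain sits on the activation-domain-bearing protein are 1, 3, 4 and 5; that this matches the list of compositions that may take an additional bundled activator is an observation from the tabulation, not a rationale stated in the text.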
Each of the recombinant nucleic acids of this invention may further comprise an expression control sequence operably linked to the coding sequence and may be provided within a DNA vector, e.g., for use in transducing prokaryotic or eukaryotic cells. Some or all of the recombinant nucleic acids of a given composition as described above, including any optional recombinant nucleic acids, may be present within a single vector or may be apportioned between two or more vectors. In certain embodiments, the vector or vectors are viral vectors useful for producing recombinant viruses containing one or more of the recombinant nucleic acids. The recombinant nucleic acids may be provided as inserts within one or more recombinant viruses which may be used, for example, to transduce cells in vitro or cells present within an organism, including a human or non-human mammalian subject. For example, the recombinant nucleic acids of any of Compositions 1-5, including any optional recombinant nucleic acids, may be present within a single recombinant virus or within a set of recombinant viruses, each of which contains one or more of the set of recombinant nucleic acids. Viruses useful for such embodiments include any virus useful for gene transfer, including adenoviruses, adeno-associated viruses (AAV), retroviruses, hybrid adenovirus-AAV, herpes viruses, lentiviruses, etc. In specific embodiments, the recombinant nucleic acid comprising the target gene is present in a first virus and one or more of the recombinant nucleic acids encoding the transcription regulatory protein(s) are present in one or more additional viruses. In such multiviral embodiments, a recombinant nucleic acid encoding a fusion protein comprising a bundling domain and a transcription activation domain, and optionally, a ligand binding domain, may be provided in the same recombinant virus as the target gene construct, or alternatively, on a third virus.
It should be appreciated that non-viral approaches (naked DNA, liposomes or other lipid compositions, etc.) may be used to deliver recombinant nucleic acids of this invention to cells in a recipient organism.
The invention also provides methods for rendering a cell capable of regulated expression of a target gene which involve introducing into the cell one or more of the recombinant nucleic acids of this invention to yield engineered cells which can express the appropriate fusion protein(s) of this invention to regulate transcription of a target gene. The recombinant nucleic acid(s) may be introduced in viral or other form into cells maintained in vitro or into cells present within an organism. The resultant engineered cells and their progeny containing one or more of these recombinant nucleic acids or nucleic acid compositions of this invention may be used in a variety of important applications discussed elsewhere, including human gene therapy, analogous veterinary applications, the creation of cellular or animal models (including transgenic applications) and assay applications. Such cells are useful, for example, in methods involving the addition of a ligand, preferably a cell permeant ligand, to the cells (or administration of the ligand to an organism containing the cells) to regulate expression of a target gene. Particularly important animal models include rodent (especially mouse and rat) and non-human primate models. In gene therapy applications, the cells will generally be human and the peptide sequence of each of the various domains present in the fusion proteins (with the possible exception of the bundling domain) will preferably be, or be derived from, a peptide sequence of human origin.
In certain assay applications, recombinant nucleic acids are designed as described for Composition #1, except that the ligand binding domains of the fusion proteins are replaced with protein domains that are known to bind to each other. Cells transduced with these recombinant nucleic acids and with a matched target gene construct express a target gene typically selected for convenience of measurement of expression level. These cells can be used to identify the presence of a substance which blocks the interaction of the two protein domains which are known to interact.
In other 2-hybrid-type applications aimed at the identification of genes encoding proteins which interact with a protein or protein domain of interest, cells are transduced with similar recombinant nucleic acids as described immediately above, except that a library of test nucleic acid sequences of potential interest is cloned into one of the recombinant nucleic acids encoding one of the fusion proteins. A 2-hybrid style assay is conducted in which transcription of the target gene indicates the presence of a test nucleic acid sequence which encodes a domain that interacts with the protein domain in the cognate fusion protein.
Reverse 2-hybrid-type assays may be conducted analogously using cells engineered to positively or negatively regulate expression of a reporter gene as a result of "2-hybrid" formation. The cells are exposed to one or more test substances, and inhibition of regulation of expression is taken as an indication of possible inhibition of the 2-hybrid formation.