The use of recombinant DNA technology has allowed the engineering of host cells to produce desired compounds, such as polypeptides and secondary metabolites. The large-scale production of polypeptides in engineered cells allows for the production of proteins with pharmaceutical uses and enzymes with industrial uses. Secondary metabolites are products derived from nature that have long been known for their biological and medicinal importance. Because of the structural complexity inherent in such molecules, traditional chemical synthesis often requires extensive effort and the use of expensive precursors and cofactors to prepare the compound. In recent years, the expression of heterologous proteins in cells has facilitated the engineering of heterologous biosynthetic pathways in microorganisms to produce metabolites from inexpensive starting materials. In this manner, a variety of compounds have been produced, including polyketides, β-lactam antibiotics, monoterpenes, steroids, and aromatics.
The invention is based, in part, on the discovery that production of heterologous polypeptides and metabolites can be enhanced by the regulated expression of the polypeptide (e.g., a biosynthetic enzyme) using a promoter which is regulated by the concentration of a second metabolite, e.g., acetyl phosphate. The term “heterologous” refers to a polypeptide or metabolite which is introduced by artifice. A heterologous polypeptide or metabolite can be identical to an endogenous entity that is naturally present. The term “metabolite” refers to an organic compound which is the product of one or more biochemical reactions. A metabolite may itself be a precursor for other reactions. A secondary metabolite is a metabolite derived from another.
Accordingly, in one aspect, the invention features a bacterial host cell containing a nucleic acid sequence comprising a promoter and a nucleic acid sequence encoding a heterologous polypeptide. Examples of bacterial host cells include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, Agrobacterium tumefaciens, Thermus thermophilus, and Rhizobium leguminosarum cells. The nucleic acid sequence is operably linked to the promoter which is controlled by a response regulator protein. In other words, the nucleic acid sequence is linked to the promoter sequence in a manner which allows for expression of the nucleotide sequence in vitro and in vivo. “Promoter” refers to any DNA fragment which directs transcription of genetic material. The promoter is controlled by a response regulator protein, for example, ntrC, phoB, phoP, ompR, cheY, creB, or torR, of E. coli or its homologs from other bacterial species. Further, the response regulator protein can be another member of the cluster of orthologous groups (COG) COG0745 as defined by http://www.ncbi.nlm.nih.gov/COG/ (Tatusov et al. Nucleic Acids Res. (2000); 28:33-36). In one implementation, the promoter is bound by E. coli ntrC. The term “ntrC” refers to both the E. coli ntrC protein (SWISSPROT: P06713, http://www.expasy.ch/) and its homologs in other bacteria as appropriate. As used herein, “bound” refers to a physical association with an equilibrium binding constant (KD) of less than 100 nM, preferably less than 1 nM. An example of the promoter is the E. coli glnAp2 promoter, e.g., a region between positions about 93 and about 323 in the published DNA sequence, GenBank accession no. M10421 (Reitzer and Magasanik (1985) Proc Nat Acad Sci USA 82:1979-1983). This region includes untranslated sequences from the glnA gene. Further, a translational fusion can be constructed between coding sequences for glnA and coding sequences for the heterologous polypeptide.
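The binding criterion stated above (KD of less than 100 nM, preferably less than 1 nM) can be illustrated with a minimal Python sketch. It assumes simple, non-cooperative 1:1 binding of free regulator protein to a single promoter site, which is a simplification introduced here for illustration and not a description of how any particular response regulator binds. The function names and labels are hypothetical.

```python
def fraction_bound(protein_nM: float, kd_nM: float) -> float:
    """Fractional occupancy of one site under an assumed simple 1:1 binding model.

    protein_nM is the free regulator concentration and kd_nM the equilibrium
    dissociation constant, both in nanomolar.
    """
    if protein_nM < 0 or kd_nM <= 0:
        raise ValueError("protein_nM must be >= 0 and kd_nM must be > 0")
    return protein_nM / (protein_nM + kd_nM)


def classify_binding(kd_nM: float) -> str:
    """Apply the thresholds from the text: < 100 nM is bound, < 1 nM is preferred."""
    if kd_nM <= 0:
        raise ValueError("kd_nM must be > 0")
    if kd_nM < 1.0:
        return "bound (preferred)"
    if kd_nM < 100.0:
        return "bound"
    return "not bound"


if __name__ == "__main__":
    # Illustrative KD values only; these are not measured constants.
    for kd in (0.5, 50.0, 500.0):
        print(kd, classify_binding(kd), round(fraction_bound(10.0, kd), 3))
```

Under this assumed model, the site is half occupied when the free regulator concentration equals KD, so a lower KD means occupancy at lower regulator concentrations.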
The host cell is genetically modified such that the promoter is regulated by acetyl phosphate in the absence of nitrogen starvation. For example, the host cell can be genetically modified by deletion or mutation of a gene encoding a histidine protein kinase, e.g., a member of COG0642 as defined by (http://www.ncbi.nlm.nih.gov/COG/; Tatusov et al. supra.), e.g., glnL, phoR, phoQ, creC, or envZ. In another example, the histidine protein kinase has specificity for the response regulator protein which controls the promoter. The histidine protein kinase can be encoded by glnL, e.g., E. coli glnL (SWISSPROT P06712; http://www.expasy.ch/).
Whereas the host cell is genetically modified such that the promoter is regulated by acetyl phosphate in the absence of nitrogen starvation, for heterologous polypeptide or metabolite expression, the host cell can be propagated in any desired condition, e.g., in nitrogen starvation conditions, nitrogen poor conditions, or nitrogen rich conditions.
The heterologous polypeptide encoded by the nucleic acid sequence can be a biosynthetic enzyme required for production of a metabolite. It can be a mammalian protein, e.g., a secreted growth factor, a monoclonal antibody, or an extracellular matrix component. In yet another example, the heterologous polypeptide can be a desired antigen for use in a vaccine, e.g., a surface protein from a viral, bacterial, fungal, or protist pathogen.
Another aspect of the invention features a kit containing a nucleic acid sequence which includes a promoter controlled by a response regulator protein. The kit further optionally contains a bacterial host cell which is genetically modified such that the promoter is regulated by acetyl phosphate in the absence of nitrogen starvation. The kit can also provide instructions for their use. The nucleic acid sequence can contain a restriction enzyme polylinker located 3′ of the promoter such that a sequence inserted into the polylinker is operably linked to the promoter which is controlled by a response regulator protein. In one implementation of the kit, the promoter is the E. coli glnAp2 promoter and the bacterial host cell is an E. coli cell containing a mutation or deletion of the glnL gene.
Another aspect of the invention features a host cell containing a first expression cassette. The first expression cassette includes a promoter, such as any of those described above, and a nucleic acid sequence encoding an enzyme required for biosynthesis of a heterologous metabolite. As used herein, “enzyme” refers to a polypeptide having the ability to catalyze a chemical reaction or multiple reactions. The nucleic acid sequence is operably linked to the promoter which is regulated by acetyl phosphate in the absence of nitrogen starvation. The host cell also contains additional nucleic acid sequences for expressing other enzymes required for biosynthesis of the metabolite. Such additional sequences may be endogenous sequences expressing endogenous enzymes, or introduced sequences expressing heterologous enzymes.
In one example, the heterologous metabolite is an isoprenoid, a polyhydroxyalkanoate, a polyketide, a β-lactam antibiotic, an aromatic, or a precursor, e.g., an upstream metabolite, or a derivative, e.g., a downstream metabolite, thereof. For instance, the isoprenoid can be a carotenoid, a sterol, a taxol, a diterpene, a gibberellin, or a quinone. Specific examples of isoprenoids include isopentenyl diphosphate, dimethylallyl diphosphate, geranyl diphosphate, farnesyl diphosphate, geranylgeranyl diphosphate, and phytoene. Specific examples of carotenoids include β-carotene, ζ-carotene, astaxanthin, zeaxanthin, zeaxanthin-β-glucoside, phytofluene, neurosporene, lutein, and torulene. When the desired heterologous metabolite is an isoprenoid, the heterologous enzyme can be isopentenyl diphosphate isomerase, geranylgeranyl diphosphate synthase, or 1-deoxyxylulose 5-phosphate synthase. When the desired heterologous metabolite is a polyhydroxyalkanoate, the heterologous enzyme can be 3-ketoacyl reductase or poly-3-hydroxyalkanoate polymerase.
The host cell can be a bacterial cell, e.g., an E. coli cell. The host cell is optionally genetically modified by deletion or mutation of a gene, e.g., a gene encoding a histidine protein kinase, as described above. In one specific example, the host cell further contains a second expression cassette containing a nucleic acid sequence encoding phosphoenolpyruvate synthase operably linked to a promoter regulated by acetyl phosphate concentration, e.g., glnAp2.
Another aspect of the invention features a method of producing heterologous isoprenoids in a host cell. The method includes overexpressing phosphoenolpyruvate synthase and expressing biosynthetic enzymes required for synthesis of the heterologous isoprenoid. In one implementation, a gene in the host cell encoding a pyruvate kinase or a phosphoenolpyruvate carboxylase is genetically deleted or enfeebled. In another implementation, a gene encoding phosphoenolpyruvate carboxykinase is overexpressed in the host cell. Still another aspect of the invention features a method of producing lycopene in a host cell. The method includes expressing the following heterologous enzymes: a 1-deoxy-D-xylulose 5-phosphate synthase, a geranylgeranyl diphosphate synthase, a phytoene synthase, and a phytoene desaturase. In one implementation of this method, an isopentenyl diphosphate isomerase is overexpressed, e.g., using the glnAp2 promoter. In another implementation, a phosphoenolpyruvate synthase is overexpressed, e.g., using the glnAp2 promoter.
Another aspect of the invention features a nucleic acid sequence containing a promoter and a sequence encoding a biosynthetic enzyme required for the production of a first metabolite. The promoter is operably linked to the sequence, and is regulated by a second metabolite whose concentration is indicative of availability of a precursor for the biosynthesis of the first metabolite. In one example, the second metabolite is a waste product produced from a precursor for the biosynthesis of the first metabolite.
In one implementation, the first metabolite is a polyhydroxyalkanoate, e.g., polyhydroxybutyrate, and the nucleic acid sequence encodes a biosynthetic enzyme, e.g., a 3-ketoacyl coenzyme A (CoA) reductase or a poly-3-hydroxyoctanoyl-CoA polymerase. In another case, the first metabolite is a polyketide, a β-lactam antibiotic, or an aromatic. In yet another case, the first metabolite is an isoprenoid, e.g., an isoprenoid mentioned herein. The nucleic acid sequence can encode a biosynthetic enzyme required for isoprenoid production, e.g., isopentenyl diphosphate isomerase, geranylgeranyl diphosphate synthase, 1-deoxyxylulose 5-phosphate synthase, phosphoenolpyruvate synthase, farnesyl diphosphate synthase, phytoene synthase, phytoene desaturase, or lycopene cyclase. One precursor of isoprenoids can be pyruvate. Pyruvate concentrations are related to acetate and acetyl phosphate concentrations. Accordingly, in this instance, the second metabolite is acetyl phosphate. The promoter responding to acetyl phosphate can be controlled by a response regulator protein, e.g., a response regulator protein mentioned above. Such a promoter may only respond to acetyl phosphate in a specific host cell. In a particular example, the promoter responding to acetyl phosphate concentration is bound by E. coli ntrC, e.g., the E. coli glnAp2 promoter.
The promoter can be regulated by cAMP. The promoter can be a bacterial promoter which binds CAP (catabolite activator protein). In mammals, the promoter can be a promoter containing a cAMP response element (CRE), which binds to the proteins CREB, CREM, or ATF-1. In yeast cells, the promoter can be a promoter regulated by cAMP, or a promoter bound by proteins Gis1, Msn2, or Msn4. Another possible regulatory signal for the promoter can be fructose 1-phosphate, or fructose 6-phosphate. The E. coli FruR protein regulates such promoters.
The nucleic acid sequence can be contained on a plasmid. It can also contain a bacterial origin of replication and a selectable marker. The sequence can further contain a yeast or other eukaryotic origin of replication and appropriate selectable markers, and can be integrated into the genome.
The optimization of biosynthesis of heterologous compounds in host cells is reliant on sensing parameters of cell physiology and on utilizing these parameters to regulate the biosynthesis. One standard technique in the art is to grow cells and for the user to exogenously add an agent, e.g., an inducer, to turn on genes required for biosynthesis of the desired product. It has been widely observed that high-level induction of a recombinant protein or pathway leads to growth retardation and reduced metabolic activity (Kurland and Dong (1996) Mol Microbiol 21:1-4). The practice of exogenously supplying an inducer is empirical and does not monitor the availability of resources in the cell for biosynthesis. In contrast, natural pathways rely on feedback mechanisms to control such processes. The combination of certain promoters with specific genetically defined host cells and heterologous polypeptides in this invention unexpectedly results in a highly refined and versatile control circuit that regulates flux to heterologous polypeptide or metabolite synthesis in response to the metabolic state of the cell. Indeed, the dynamically controlled recombinant pathway provides for enhanced production, minimized growth retardation, and reduced toxic by-product formation. The regulation of gene expression in response to physiological state will also benefit other applications, such as gene therapy.
The details of one or more embodiments of the invention are set forth in the description below. Other features, objects, and advantages of the invention will be apparent from the description and from the claims.
The invention provides methods of engineering metabolic control, e.g., methods of utilizing promoters in specific host cells in order to optimize protein expression for either protein production or metabolite synthesis.
A central component of the invention is an expression cassette comprising a promoter and a nucleic acid sequence encoding a heterologous polypeptide whose expression is desired. The expression cassette is constructed using standard methods in the art such that the coding nucleic acid sequence is operably linked to, e.g., regulated by, the promoter. The promoter is chosen such that it is regulated by a parameter of cell physiology or cell metabolic state. A variety of promoters can be used. In some applications the expression cassette is contained within a plasmid, such as a bacterial plasmid with a bacterial origin of replication and a selectable marker. The expression cassette can be integrated into the genome of cells using standard techniques in the art.
If the expression cassette is to be used for engineering regulated production of a heterologous polypeptide during late logarithmic growth or during stationary phase, then the promoter can be chosen accordingly. For example, a promoter can be chosen that responds to a small molecule signal, e.g., a second messenger, whose levels accumulate during late logarithmic growth or during stationary phase. The second messenger can be a molecule that accumulates as a precursor, an intermediate, or a waste product of a biochemical pathway. In bacteria, the small molecule signal can be a glycolysis intermediate, e.g., fructose 1-phosphate or fructose 6-phosphate, or a glycolysis waste product, e.g., acetate or acetyl phosphate. In eukaryotic cells, cAMP concentrations are a well-known signal of nutrient state.
The promoter in the expression cassette can be chosen based on the results of a large-scale expression analysis experiment, e.g., a gene chip experiment. Genes which are induced by acetyl phosphate can be identified by hybridizing labeled cDNA, prepared from cells grown in acetate, to a microarray and comparing the signal to a reference signal, e.g., to the signal obtained with cDNA prepared from cells in early logarithmic growth. This experiment can be performed on both prokaryotic and eukaryotic cells, e.g., bacterial, yeast, plant, and mammalian cells. For an example of such an experiment in a prokaryote, see Talaat et al. (2000) Nat Biotechnol 18:679-82 and Oh and Liao (2000) Biotechnol Prog. 16:278-86. Once a gene is identified which is expressed under the desired condition, its promoter can be utilized in the expression cassette. Alternatively, the experiment can be performed by the exogenous addition of a desired molecule (e.g., a precursor in a metabolic pathway) or by manipulation of experimental conditions (e.g., growth to late logarithmic phase or growth while a biosynthetic enzyme is overproduced). Promoters can be identified based on the genes induced.
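The comparison of a sample signal against a reference signal described above can be sketched in Python as a log2 fold-change filter. This is a minimal illustration under stated assumptions: the gene names, intensity values, the two-fold cutoff, and the intensity floor are all hypothetical, and a real analysis would also need normalization and replicate statistics.

```python
import math


def induced_genes(sample, reference, min_log2_fold=1.0, floor=1.0):
    """Return genes whose sample signal exceeds the reference by the cutoff.

    sample and reference map gene name -> signal intensity. Intensities are
    clamped to `floor` so that near-zero values do not produce huge ratios.
    The result maps gene -> log2 fold change, strongest induction first.
    """
    induced = {}
    for gene, signal in sample.items():
        if gene not in reference:
            continue  # no reference measurement, so no ratio can be formed
        ratio = max(signal, floor) / max(reference[gene], floor)
        log2_fold = math.log2(ratio)
        if log2_fold >= min_log2_fold:
            induced[gene] = log2_fold
    return dict(sorted(induced.items(), key=lambda kv: kv[1], reverse=True))


# Toy intensities with placeholder gene names (not experimental data).
acetate_grown = {"geneA": 800.0, "geneB": 120.0, "geneC": 50.0}
early_log = {"geneA": 100.0, "geneB": 100.0, "geneC": 200.0}

if __name__ == "__main__":
    for gene, lfc in induced_genes(acetate_grown, early_log).items():
        print(f"{gene}: log2 fold change = {lfc:.2f}")
```

The promoters of the genes returned by such a filter would be the candidates for use in the expression cassette.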
In one instance, an expression cassette is used for engineering regulated production of a metabolite in a bacterial cell. A promoter can be selected which is regulated by a second metabolite whose concentration is indicative of the availability of a precursor for the biosynthesis of the first metabolite. For example, if the first metabolite is an isoprenoid which is synthesized from the precursors pyruvate and glyceraldehyde 3-phosphate, then the second metabolite can be acetyl phosphate. In a rich environment, cells produce an excess amount of acetyl-CoA, a product of pyruvate. The excess acetyl-CoA is used to produce ATP and acetate, which is secreted as a waste product. Acetate concentration increases with cell density. Acetate, acetyl-CoA, and acetyl phosphate concentrations are interrelated by the following biochemical reactions:
(1) acetyl-CoA+Pi⇄acetyl phosphate+CoA
(2) acetyl phosphate+ADP⇄acetate+ATP
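As a rough illustration, if both reactions are assumed to be close to equilibrium (an assumption introduced here for illustration; in a growing cell these reactions need not be at equilibrium), the mass-action relations for reactions (1) and (2) can be written with equilibrium constants K_1 and K_2, which are symbols introduced for this sketch:

```latex
K_1 = \frac{[\text{acetyl phosphate}]\,[\text{CoA}]}{[\text{acetyl-CoA}]\,[\mathrm{P_i}]},
\qquad
K_2 = \frac{[\text{acetate}]\,[\text{ATP}]}{[\text{acetyl phosphate}]\,[\text{ADP}]}

[\text{acetyl phosphate}]
  = K_1\,\frac{[\text{acetyl-CoA}]\,[\mathrm{P_i}]}{[\text{CoA}]}
  = \frac{[\text{acetate}]\,[\text{ATP}]}{K_2\,[\text{ADP}]}
```

Under this assumption, the acetyl phosphate concentration rises with the acetyl-CoA concentration when the phosphate and CoA levels are held fixed.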
Thus, high acetyl phosphate concentration is indicative of excess acetyl-CoA and excess pyruvate. In a host cell which is genetically modified by deletion or mutation of glnL, for example, ntrC function becomes acetyl phosphate dependent (Feng et al. (1992) J Bacteriol 174:6061-6070). In this fashion, a promoter regulated by ntrC, e.g., the glnAp2 promoter, can be used to control gene expression in response to acetyl phosphate. The glnAp2 promoter can be obtained using standard techniques in the art. For example, primers can be designed and synthesized that anneal to sequences flanking the glnAp2 promoter. The polymerase chain reaction (PCR) can be used to amplify a nucleic acid fragment containing the glnAp2 promoter. This fragment can then be used for further constructions. Likewise, an E. coli strain containing a deletion of a histidine protein kinase gene, e.g., glnL, can be easily prepared. See Link et al. (1997) J Bacteriol. 179(20):6228-6237 for a detailed description of one possible method. The sequences encoding a desired heterologous polypeptide can be cloned downstream of the glnAp2 promoter so that they are operably linked to the promoter. A host cell with an inactivated glnL gene can then be transformed with the sequences. The transformed strain can be grown, and polypeptide production monitored during the course of growth. Robust protein expression can be observed at high cell densities, as in Farmer and Liao (2000) Nat. Biotechnol. 18:533-537, the contents of which are hereby incorporated by reference.
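The primer design step mentioned above can be sketched in Python. This is a minimal illustration that assumes the primers are simply taken from the two ends of the region to be amplified; the template string is a made-up toy sequence, not the glnAp2 sequence, and the function names are hypothetical. Practical primer design would also consider melting temperature, secondary structure, and any restriction sites added for cloning.

```python
# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def flanking_primers(template: str, start: int, end: int, length: int = 20):
    """Pick a forward and a reverse primer for the region start..end.

    start and end are 1-based, inclusive coordinates on the given strand.
    The forward primer matches the first `length` bases of the region; the
    reverse primer is the reverse complement of the last `length` bases.
    """
    if not (1 <= start <= end <= len(template)):
        raise ValueError("coordinates fall outside the template")
    region = template[start - 1:end]
    if len(region) < length:
        raise ValueError("region is shorter than the requested primer length")
    forward = region[:length]
    reverse = reverse_complement(region[-length:])
    return forward, reverse


if __name__ == "__main__":
    toy_template = "GATTACAGGCCTTAACGT"  # hypothetical sequence for illustration
    print(flanking_primers(toy_template, 3, 16, length=5))
```

With a real sequence record, the same function could be called with the approximate coordinates of the promoter region and a typical primer length.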
A mammalian cell can be used as a host cell for polypeptide or metabolite production. A promoter can be selected, e.g., a promoter that responds to cAMP. Such a promoter can contain a cAMP response element (CRE), which binds to the proteins CREB, CREM, or ATF-1. Using standard techniques in the art, a desired coding sequence can be placed under control of the promoter and transformed into the mammalian cell. In some instances, the construction can be inserted into a virus, e.g., an inactivated virus. Such implementations allow for the regulated production of a protein or a metabolite produced by a heterologous biosynthetic enzyme in a gene therapy scenario. Plant cells can also be used as host cells. Again, an appropriate promoter can be chosen, e.g., a promoter that responds to a plant hormone, metabolite, or a precursor for the production of a desired metabolite. A promoter can be identified by a microarray experiment. After fusion of a desired promoter to a desired coding sequence in an appropriate vector, the construction can be electroporated into Agrobacterium tumefaciens and then used to transform plant cells using standard methods in the art. In still another example, yeast cells can be manipulated to express heterologous polypeptides or metabolites under metabolic control. For example, a Saccharomyces cerevisiae promoter can be a promoter regulated by cAMP, e.g., a promoter bound by proteins Gis1, Msn2, or Msn4. The regulation of all yeast genes in response to a variety of metabolic conditions is increasingly well studied. For example, DeRisi et al. (1997) Science 278:680-686 describe experiments following the transcriptional profile of nearly the entire Saccharomyces cerevisiae gene set under various metabolic conditions. Promoters regulated by a desired metabolite can be selected based on such data. The generation of yeast plasmids and the transformation of yeast are well known in the art.
A variety of metabolic pathways can be reconstructed using the expression techniques described above. For example, a pathway to produce lycopene can be introduced in E. coli by constructing expression vectors for the following genes: dxs (coding for 1-deoxy-D-xylulose 5-phosphate synthase) from E. coli, gps (coding for geranylgeranyl diphosphate (GGPP) synthase) from Archaeoglobus fulgidus, and crtBI (coding for phytoene synthase and desaturase, respectively) from Erwinia uredovora. These genes can reside on a single plasmid or on multiple plasmids, or can be integrated into the E. coli chromosome. In addition, phosphoenolpyruvate synthase can be overexpressed using any method, e.g., by fusion to the glnAp2 promoter. Isopentenyl diphosphate isomerase can be overexpressed using any method, e.g., by fusion to the glnAp2 promoter.
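The lycopene pathway components listed above can be kept track of with a small bookkeeping sketch in Python. The genes, enzymes, and source organisms are taken from the text; the location labels ("plasmid_1", "plasmid_2") and the helper names are hypothetical and only illustrate that the genes may be split across one or more plasmids.

```python
# Enzymatic steps named in the text for lycopene production.
REQUIRED_STEPS = (
    "1-deoxy-D-xylulose 5-phosphate synthase",
    "geranylgeranyl diphosphate synthase",
    "phytoene synthase",
    "phytoene desaturase",
)

# One entry per gene; the "location" values are placeholder labels.
CONSTRUCTS = [
    {"gene": "dxs", "enzyme": REQUIRED_STEPS[0],
     "source": "Escherichia coli", "location": "plasmid_1"},
    {"gene": "gps", "enzyme": REQUIRED_STEPS[1],
     "source": "Archaeoglobus fulgidus", "location": "plasmid_1"},
    {"gene": "crtB", "enzyme": REQUIRED_STEPS[2],
     "source": "Erwinia uredovora", "location": "plasmid_2"},
    {"gene": "crtI", "enzyme": REQUIRED_STEPS[3],
     "source": "Erwinia uredovora", "location": "plasmid_2"},
]


def missing_steps(constructs):
    """List required enzymatic steps not supplied by any construct."""
    provided = {c["enzyme"] for c in constructs}
    return [step for step in REQUIRED_STEPS if step not in provided]


def genes_by_location(constructs):
    """Group gene names by the plasmid (or chromosome) label carrying them."""
    grouped = {}
    for c in constructs:
        grouped.setdefault(c["location"], []).append(c["gene"])
    return grouped


if __name__ == "__main__":
    print("missing:", missing_steps(CONSTRUCTS))
    print("layout:", genes_by_location(CONSTRUCTS))
```

A check of this kind makes it easy to confirm that every step of the pathway is covered before a strain is assembled.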
In another example, a pathway to produce polyhydroxyalkanoates (PHA), e.g., polyhydroxybutyrate, can be implemented in E. coli. PHAs are a family of linear polyesters of hydroxy acids with a variety of thermoplastic properties and commercial uses. Pseudomonas aeruginosa genes encoding 3-ketoacyl coenzyme A reductases and poly-3-hydroxyalkanoate polymerase can be placed under regulation of a desired promoter, e.g., glnAp2, since acetyl-CoA levels can be indicative of precursor availability for PHA synthesis.