The present invention relates to novel methods for the preparation of heterologous proteins.
Many very diverse methods have been tested for the production of recombinant molecules of interest and commercial value. Different organisms that have been considered as hosts for foreign protein expression include single-celled organisms such as bacteria and yeasts, cells and cell cultures of animals, fungi and plants, and whole organisms such as plants, insects and transgenic animals.
The use of fermentation techniques for large scale production of bacteria, yeasts and higher organism cell cultures is well established. The capital costs associated with establishment of the facility and the costs of maintenance are negative economic factors. Although the expression levels of proteins that can be achieved are high, energy inputs and protein purification costs can greatly increase the cost of recombinant protein production.
The production of a variety of proteins of therapeutic interest has been described in transgenic animals; however, the cost of establishing substantial manufacturing is prohibitive for all but high value proteins. Numerous foreign proteins have been expressed in whole plants and selected plant organs. Methods of stably inserting recombinant DNA into plants have become routine and the number of species that are now accessible to these methods has increased greatly.
Plants represent a highly effective and economical means to produce recombinant proteins as they can be grown on a large scale with modest cost inputs and most commercially important species can now be transformed. Although the expression of foreign proteins has been clearly demonstrated, the development of systems with commercially viable levels of expression coupled with cost effective separation techniques has been limited.
The production of recombinant proteins and peptides in plants has been investigated using a variety of approaches including transcriptional fusions using a strong constitutive plant promoter, e.g., from cauliflower mosaic virus (Sijmons et al., 1990, Bio/Technology, 8:217-221); transcriptional fusions with organ specific promoter sequences (Radke et al., 1988, Theor. Appl. Genet., 75:685-694); and translational fusions which require subsequent cleavage of a recombinant protein (Vandekerckhove et al., 1989, Bio/Technology, 7:929-932).
Foreign proteins that have been successfully expressed in plant cells include proteins from bacteria (Fraley et al., 1983, Proc. Natl. Acad. Sci. USA, 80:4803-4807), animals (Misra and Gedamu, 1989, Theor. Appl. Genet., 78:161-168), fungi and other plant species (Fraley et al., 1983, Proc. Natl. Acad. Sci. USA, 80:4803-4807). Some proteins, predominantly markers of DNA integration, have been expressed in specific cells and tissues including seeds (Sengupta-Gopalan et al., 1985, Proc. Natl. Acad. Sci. USA, 82:3320-3324; Radke et al., 1988, Theor. Appl. Genet., 75:685-694). Seed-specific research has focused on the use of seed-storage protein promoters as a means of deriving seed-specific expression. Using such a system, Vandekerckhove et al. (1989, Bio/Technology, 7:929-932) expressed the peptide leu-enkephalin in seeds of Arabidopsis thaliana and Brassica napus. The level of expression of this peptide was quite low and it appeared that expression of this peptide was limited to endosperm tissue.
It has been generally shown that the construction of chimeric genes which contain the promoter from a given regulated gene and a coding sequence of a reporter protein not normally associated with that promoter gives rise to regulated expression of the reporter. The use of promoters from seed-specific genes for the expression in seed of recombinant sequences that are not normally expressed in a seed-specific manner has been described.
Sengupta-Gopalan et al. (1985, Proc. Natl. Acad. Sci. USA, 82:3320-3324) reported expression of a major storage protein of French bean, called β-phaseolin, in tobacco plants. The gene was expressed correctly in the seeds and only at very low levels elsewhere in the plant. However, the constructs used by Sengupta-Gopalan were not chimeric. The entire β-phaseolin gene, including the native 5′-flanking sequences, was used. Subsequent experiments with other species (Radke et al., 1988, Theor. Appl. Genet., 75:685-694) or other genes (Perez-Grau, L., Goldberg, R. B., 1989, Plant Cell, 1:1095-1109) showed the fidelity of expression in a seed-specific manner in both Arabidopsis and Brassica. Radke et al. (1988, vide supra) used a "tagged" gene, i.e., one containing the entire napin gene plus a non-translated "tag".
The role of the storage proteins is to serve as a reserve of nitrogen during seed germination and growth. Although storage protein genes can be expressed at high levels, they represent a class of protein whose complete three-dimensional structure appears important for proper packaging and storage. The storage proteins generally assemble into multimeric units which are arranged in specific bodies in endosperm tissue. Perturbation of the structure by the addition of foreign peptide sequences leads to storage proteins unable to be packaged properly in the seed.
In addition to nitrogen, the seed also stores lipids. The storage of lipids occurs in oil or lipid bodies. Analysis of the contents of lipid bodies has demonstrated that in addition to triglyceride and membrane lipids, there are also several polypeptides/proteins associated with the surface or lumen of the oil body (Bowman-Vance and Huang, 1987, J. Biol. Chem., 262:11275-11279, Murphy et al., 1989, Biochem. J., 258:285-293, Taylor et al., 1990, Planta, 181:18-26). Oil-body proteins have been identified in a wide range of taxonomically diverse species (Moreau et al., 1980, Plant Physiol., 65:1176-1180; Qu et al., 1986, Biochem. J., 235:57-65) and have been shown to be uniquely localized in oil-bodies and not found in organelles of vegetative tissues. In Brassica napus (rapeseed, canola) there are at least three polypeptides associated with the oil-bodies of developing seeds (Taylor et al., 1990, Planta, 181:18-26).
The oil bodies that are produced in seeds are of a similar size (Huang A. H. C., 1985, in Modern Meths. Plant Analysis, Vol. 1:145-151 Springer-Verlag, Berlin). Electron microscopic observations have shown that the oil-bodies are surrounded by a membrane and are not freely suspended in the cytoplasm. These oil-bodies have been variously named by electron microscopists as oleosomes, lipid bodies and spherosomes (Gurr M. I., 1980, in The Biochemistry of Plants, 4:205-248, Acad. Press, Orlando, Fla). The oil-bodies of the species that have been studied are encapsulated by an unusual "half-unit" membrane comprising, not a classical lipid bilayer, but rather a single amphiphilic layer with hydrophobic groups on the inside and hydrophilic groups on the outside (Huang A. H. C., 1985, in Modern Meths. Plant Analysis, Vol. 1:145-151 Springer-Verlag, Berlin).
The numbers and sizes of oil-body associated proteins may vary from species to species. In corn, for example, there are two immunologically distinct polypeptide classes found in oil-bodies (Bowman-Vance and Huang, 1988, J. Biol. Chem., 263:1476-1481). Oleosins have been shown to comprise alternate hydrophilic and hydrophobic regions (Bowman-Vance and Huang, 1987, J. Biol. Chem., 262:11275-11279). The amino acid sequences of oleosins from corn, rapeseed, and carrot have been obtained. See Qu and Huang, 1990, J. Biol. Chem., 265:2238-2243, Hatzopoulos et al., 1990, Plant Cell, 2:457-467, respectively. In an oilseed such as rapeseed, oleosin may comprise between 8% (Taylor et al., 1990, Planta, 181:18-26) and 20% (Murphy et al., 1989, Biochem. J., 258:285-293) of total seed protein. Such a level is comparable to that found for many seed storage proteins.
Genomic clones encoding oil-body proteins with their associated upstream regions have been reported for several species, including maize (Zea mays, Bowman-Vance and Huang, 1987, J. Biol. Chem., 262:11275-11279; and Qu and Huang, 1990, J. Biol. Chem., 265:2238-2243) and carrot (Hatzopoulos et al., 1990, Plant Cell, 2:457-467). cDNAs and genomic clones have also been reported for cultivated oilseeds: Brassica napus (Murphy et al., 1991, Biochim. Biophys. Acta, 1088:86-94; and Lee and Huang, 1991, Plant Physiol. 96:1395-1397), sunflower (Cummins and Murphy, 1992, Plant Molec. Biol. 19:873-878), soybean (Kalinski et al., 1991, Plant Molec. Biol. 17:1095-1098), and cotton (Hughes et al., 1993, Plant Physiol. 101:697-698). Reports on the expression of these oil-body protein genes in developing seeds have varied. In the case of Zea mays, transcription of genes encoding oil-body protein isoforms began quite early in seed development and was easily detected 18 days after pollination. In non-endospermic seeds such as those of the dicotyledonous plant Brassica napus (canola, rapeseed), expression of oil-body protein genes seems to occur later in seed development (Murphy et al., 1989, Biochem. J., 258:285-293) compared to corn.
A maize oleosin has been expressed in seed oil bodies in Brassica napus transformed with a Zea mays oleosin gene. The gene was expressed under the control of regulatory elements from a Brassica gene encoding napin, a major seed storage protein. The temporal regulation and tissue specificity of expression were reported to be correct for a napin gene promoter/terminator (Lee et al., 1991, Proc. Natl. Acad. Sci. USA, 88:6181-6185).
Thus, the above demonstrates that oil body proteins (or oleosins) from various plant sources share a number of similarities in both structure and expression. However, at the time of the above references it was generally believed that modifications to oleosins or oil body proteins would likely lead to aberrant targeting and instability of the protein product (Vandekerckhove et al., 1989, Bio/Technology, 7:929-932; Radke et al., 1988, Theor. Appl. Genet., 75:685-694; and Hoffman et al., 1988, Plant Mol. Biol., 11:717-729).
The present invention describes the use of an oil body protein gene to target the expression of a heterologous polypeptide to an oil body in a host cell. The unique features of both the oil body protein and the expression patterns are used in this invention to provide a means of synthesizing commercially important proteins on a scale that is difficult if not impossible to achieve using conventional systems of protein production.
In particular, the present invention provides a method for the expression of a heterologous polypeptide by a host cell said method comprising: a) introducing into a host cell a chimeric nucleic acid sequence comprising: 1) a first nucleic acid sequence capable of regulating the transcription in said host cell of 2) a second nucleic acid sequence, wherein said second sequence encodes a fusion polypeptide and comprises (i) a nucleic acid sequence encoding a sufficient portion of an oil body protein gene to provide targeting of the fusion polypeptide to a lipid phase linked in reading frame to (ii) a nucleic acid sequence encoding the heterologous polypeptide; and 3) a third nucleic acid sequence encoding a termination region functional in the host cell; and b) growing said host cell to produce the fusion polypeptide.
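By way of illustration only, the arrangement of the chimeric nucleic acid sequence recited above (regulatory sequence, oil body protein portion linked in reading frame to the heterologous coding sequence, and termination region) may be modelled as in the following minimal Python sketch. It is not part of the disclosed method. The names `ChimericConstruct` and `is_in_frame` and the short nucleotide strings are hypothetical placeholders rather than real promoter, oleosin or terminator sequences, and the check assumes an intron-free coding sequence read with the standard stop codons.

```python
from dataclasses import dataclass

# Standard stop codons (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass(frozen=True)
class ChimericConstruct:
    """Hypothetical model of the chimeric nucleic acid sequence."""
    promoter: str          # first sequence: regulates transcription
    targeting_cds: str     # (i) oil body protein portion, without a stop codon
    heterologous_cds: str  # (ii) heterologous polypeptide, ending in a stop codon
    terminator: str        # third sequence: termination region

    def fusion_cds(self) -> str:
        """Coding sequence of the fusion polypeptide, (i) followed by (ii)."""
        return self.targeting_cds + self.heterologous_cds


def is_in_frame(construct: ChimericConstruct) -> bool:
    """Return True if (ii) is linked in reading frame to (i).

    Both parts must be whole numbers of codons, the fusion must start
    with ATG, and the only stop codon must be the final codon.
    """
    targeting = construct.targeting_cds.upper()
    payload = construct.heterologous_cds.upper()
    if not targeting or not payload:
        return False
    if len(targeting) % 3 != 0 or len(payload) % 3 != 0:
        return False
    if not targeting.startswith("ATG"):
        return False
    fusion = targeting + payload
    codons = [fusion[i:i + 3] for i in range(0, len(fusion), 3)]
    if codons[-1] not in STOP_CODONS:
        return False
    return not any(codon in STOP_CODONS for codon in codons[:-1])


if __name__ == "__main__":
    # Placeholder sequences, for illustration only.
    example = ChimericConstruct(
        promoter="TATAAA",
        targeting_cds="ATGGCTGAT",
        heterologous_cds="GGTAAATAA",
        terminator="AATAAA",
    )
    print(is_in_frame(example))
```

A frame shift at the junction, or a stop codon retained at the end of the oil body protein portion, makes `is_in_frame` return False. This mirrors the requirement that sequence (i) be linked in reading frame to sequence (ii).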
The present invention also provides a method for the production and release of a heterologous polypeptide from a fusion polypeptide associated with a plant oil body fraction during seed germination and plant seedling growth, said method comprising: a) introducing into a plant cell a first chimeric nucleic acid sequence comprising: 1) a first nucleic acid sequence capable of regulating the transcription in said plant cell of 2) a second nucleic acid sequence wherein said second nucleic acid sequence encodes a fusion polypeptide and comprises (i) a nucleic acid sequence encoding a sufficient portion of an oil body protein gene to provide targeting of the fusion polypeptide to an oil body, linked in reading frame to (ii) a nucleic acid sequence encoding the heterologous polypeptide and (iii) a linker nucleic acid sequence encoding an amino acid sequence that is specifically cleavable by enzymatic means wherein said linker nucleic acid sequence (iii) is located between said nucleic acid sequence (i) encoding the oil body protein and said nucleic acid sequence (ii) encoding the heterologous polypeptide; and 3) a third nucleic acid sequence encoding a termination region; b) sequentially or concomitantly introducing into the genome of said plant a second chimeric nucleic acid sequence comprising: 1) a first nucleic acid sequence capable of regulating the transcription specifically during seed germination and seed growth of 2) a second nucleic acid sequence encoding a specific enzyme that is capable of cleaving the linker nucleic acid sequence of said first chimeric nucleic acid sequence; and 3) a third nucleic acid sequence encoding a termination region; c) regenerating a plant from said plant cell and growing said plant to produce seed whereby said fusion polypeptide is expressed and associated with oil bodies; and d) allowing said seed to germinate wherein said enzyme in said second chimeric nucleic acid sequence is expressed and cleaves the heterologous polypeptide from the fusion polypeptide associated with the oil bodies during seed germination and early seedling growth.
The present invention further provides a method for producing an altered seed meal by producing a heterologous polypeptide in association with a plant seed oil body fraction, said method comprising: a) introducing into a plant cell a chimeric nucleic acid sequence comprising: 1) a first nucleic acid sequence capable of regulating the transcription in said plant cell of 2) a second nucleic acid sequence wherein said second sequence encodes a fusion polypeptide and comprises (i) a nucleic acid sequence encoding a sufficient portion of an oil body protein gene to provide targeting of the fusion polypeptide to an oil body, linked in reading frame to (ii) a nucleic acid sequence encoding the heterologous polypeptide and 3) a third nucleic acid sequence encoding a termination region; b) regenerating a plant from said plant cell and growing said plant to produce seed whereby said heterologous polypeptide is expressed and associated with oil bodies; and c) crushing said seed and preparing an altered seed meal.
The present invention also provides a method of preparing an enzyme in a host cell in association with an oil body and releasing said enzyme from the oil body, said method comprising: a) transforming a host cell with a chimeric nucleic acid sequence comprising: 1) a first nucleic acid sequence capable of regulating the transcription of 2) a second nucleic acid sequence, wherein said second sequence encodes a fusion polypeptide and comprises (i) a nucleic acid sequence encoding a sufficient portion of an oil body protein gene to provide targeting of the fusion polypeptide to an oil body; (ii) a nucleic acid sequence encoding an enzyme and (iii) a linker nucleic acid sequence located between said nucleic acid sequence (i) encoding the oil body protein and said nucleic acid sequence (ii) encoding the enzyme and encoding an amino acid sequence that is cleavable by the enzyme encoded by the nucleic acid sequence (ii); and 3) a third nucleic acid sequence encoding a termination region functional in said host cell; b) growing the host cell to produce the fusion polypeptide under conditions such that the enzyme is not active; c) recovering the oil bodies containing the fusion polypeptide; and d) altering the environment of the oil bodies such that the enzyme is activated and cleaves itself from the fusion polypeptide.
The present invention further provides a method for the expression of a heterologous polypeptide by a host cell in association with an oil body and separating said heterologous polypeptide from the oil body, said method comprising: a) transforming a first host cell with a first chimeric nucleic acid sequence comprising: 1) a first nucleic acid sequence capable of regulating the transcription in said host cell of 2) a second nucleic acid sequence, wherein said second sequence encodes a first fusion polypeptide and comprises (i) a nucleic acid sequence encoding a sufficient portion of an oil body protein gene to provide targeting of the first fusion polypeptide to a lipid phase linked in reading frame to (ii) a nucleic acid sequence encoding the heterologous polypeptide; and (iii) a linker nucleic acid sequence encoding an amino acid sequence that is specifically cleavable by enzymatic means wherein said linker nucleic acid sequence (iii) is located between said (i) nucleic acid sequence encoding the oil body protein and said (ii) nucleic acid sequence encoding the heterologous polypeptide; and 3) a third nucleic acid sequence encoding a termination region functional in the host cell; b) transforming a second host cell with a second chimeric nucleic acid sequence comprising: 1) a first nucleic acid sequence capable of regulating the transcription specifically during seed germination and seed growth of 2) a second nucleic acid sequence wherein said second sequence encodes a second fusion polypeptide and comprises (i) a nucleic acid sequence encoding a sufficient portion of an oil body protein gene to provide targeting of the second fusion polypeptide to a lipid phase linked in reading frame to a nucleic acid sequence encoding a specific enzyme that is capable of cleaving the linker nucleic acid sequence of said first chimeric nucleic acid sequence; and 3) a third nucleic acid sequence encoding a termination region; c) growing said first host cell under conditions such that the first fusion polypeptide is expressed and associated with the oil bodies to produce a first oil body fraction containing the first recombinant fusion polypeptide; d) growing said second host cell under conditions such that the second fusion polypeptide is expressed and associated with the oil bodies to produce a second oil body fraction containing the second recombinant fusion polypeptide; and e) contacting the first oil body fraction of step (c) with the second oil body fraction of step (d) under conditions such that the enzyme portion of the second fusion polypeptide cleaves the heterologous polypeptide from the first fusion polypeptide.
The present invention also provides a chimeric nucleic acid sequence encoding a fusion polypeptide, capable of being expressed in association with an oil body of a host cell comprising: 1) a first nucleic acid sequence capable of regulating the transcription in said host cell of 2) a second nucleic acid sequence, wherein said second sequence encodes a fusion polypeptide and comprises (i) a nucleic acid sequence encoding a sufficient portion of an oil body protein gene to provide targeting of the fusion polypeptide to a lipid phase linked in reading frame to (ii) a nucleic acid sequence encoding a heterologous polypeptide; and 3) a third nucleic acid sequence encoding a termination region functional in the host cell.
The present invention also includes a fusion polypeptide encoded by a chimeric nucleic acid sequence comprising (i) a nucleic acid sequence encoding a sufficient portion of an oil body protein to provide targeting of the fusion polypeptide to an oil body linked in reading frame to (ii) a nucleic acid sequence encoding a heterologous polypeptide.
The invention further provides methods for the separation of heterologous proteins from host cell components by partitioning of the oil body fraction and subsequent release of the heterologous protein via specific cleavage of the heterologous protein-oil body protein fusion. Optionally a cleavage site may be located prior to the N-terminus and after the C-terminus of the heterologous polypeptide allowing the fusion polypeptide to be cleaved and separated by phase separation into its component peptides. This production system finds utility in the production of many proteins and peptides such as those with pharmaceutical, enzymic, rheological and adhesive properties.
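The release of the heterologous polypeptide by specific cleavage, including the optional arrangement with cleavage sites on both sides of the heterologous polypeptide, may be pictured with the following minimal Python sketch. It is offered only as an illustration under stated assumptions. The function name `cleave_fusion`, the one-letter amino acid strings, and the four-residue motif `IEGR` are hypothetical placeholders chosen for the example and are not taken from the text above. The sketch also assumes that cleavage occurs immediately after each occurrence of the motif.

```python
from typing import List


def cleave_fusion(fusion: str, site: str) -> List[str]:
    """Split a fusion polypeptide after every occurrence of a recognition motif.

    The motif stays on the upstream fragment, as when cleavage occurs at the
    C-terminal end of the recognition sequence (an assumption of this sketch).
    """
    if not site:
        raise ValueError("recognition motif must not be empty")
    fragments = []
    start = 0
    while True:
        index = fusion.find(site, start)
        if index == -1:
            break
        end = index + len(site)
        fragments.append(fusion[start:end])
        start = end
    if start < len(fusion):
        fragments.append(fusion[start:])
    return fragments


if __name__ == "__main__":
    # Placeholder sequences, for illustration only.
    oil_body_part = "MADTHR"   # stands in for the oil body protein portion
    payload = "GSHK"           # stands in for the heterologous polypeptide
    site = "IEGR"              # hypothetical cleavage motif

    # Single site between the oil body protein portion and the payload.
    print(cleave_fusion(oil_body_part + site + payload, site))

    # Sites on both sides of the payload, which sits between two flanks.
    print(cleave_fusion(oil_body_part + site + payload + site + "TAIL", site))
```

In the single-site case the first fragment corresponds to the portion that remains with the oil body fraction and the second to the released polypeptide. With flanking sites, the middle fragment is the heterologous polypeptide, which in this simplified model still carries the downstream motif.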
The processing of a wide variety of materials using enzymes has enormous commercial potential. The present invention provides for methods to produce recombinant enzymes in mass quantities which can be separated from cellular components by partitioning of the oil-body fraction. The enzyme of interest may be cleaved from the oil body protein or may be used in association with the oil-body fraction. Enzymes fused to an oil body protein in an oil-body fraction represent a type of immobilized and reusable enzyme system. Immobilized enzyme systems have been developed for many industrial purposes in association with various inert support matrices, including cellulose beads, plastic matrices and other types of inert materials. Enzymes attached to oil-bodies can be mixed with solutions containing enzyme substrates and subsequently recovered by floatation and partitioning of the oil-body fraction and reused.
In addition to the production and isolation of recombinant proteins from plants, the present invention also contemplates methods for crop improvement and protection. The nutritional quality of seeds has been improved by the addition of proteins with high levels of essential amino acids (DeClercq et al., 1990, Plant Physiol. 94:970-979) and enzymes such as lauroyl-ACP thioesterase from Umbellularia californica that affect lipid composition (U.S. Pat. No. 5,298,421). To date these seed modifications have only been conducted using seed storage gene promoters that may have inherent limitations. Use of oil body protein regulatory sequences provides an additional means by which to accomplish such modifications.
Insect predation and fungal diseases of crop plants represent two of the largest causes of yield losses. A number of strategies dependent on transformation and expression of recombinant proteins in plants have been advanced for the protection of plants from insects and fungi (Lamb et al., 1992, Bio/Technology 11:1436-1445). These strategies are exemplified by the expression of peptide inhibitors of insect digestive enzymes such as cowpea trypsin inhibitor (Hoffman et al., 1992, J. Economic Entomol. 85: 2516-1522), bacterial or arachnid protein toxins (Gordon and Zlotkin, 1993, FEBS Lett., 315:125-128) and the expression of chitinase enzymes for the digestion of fungal cell walls (Broglie et al., 1991, Science 254: 5035, 1194-1197; Benhamou et al., 1993, Plant Journal 2:295-305; Dunsmuir et al., 1993, in Advances in Molecular Genetics of Plant-Microbe Interactions, Vol. 2, pp 567-571, Nester, E. W. and Verma, D. P. S. eds.). The use of oil body proteins to localize specific polypeptides that afford crop protection allows one to develop novel strategies to protect vulnerable germinating seeds.
The use of oil body proteins whose expression is limited to pollen allows one to alter the function of pollen to specifically control male fertility. One may use promoter sequences from the genes encoding such oil body proteins to specifically express recombinant proteins that will alter the function of pollen. One such example is the use of such promoters to control the expression of novel recognition proteins such as the self-incompatibility proteins. Additional uses are contemplated including expression of oil body fusion proteins in pollen that are toxic to pollen. Seed-specific oil body proteins may be used to alter female fertility.
The methods described above are not limited to heterologous proteins produced in plant seeds, as oil body proteins may also be found in association with oil bodies in other cells and tissues. Additionally, the methods are not limited to the recovery of heterologous proteins produced in plants because the extraction and release methods can be adapted to accommodate oil body protein-heterologous protein fusions produced in any cell type or organism. In such a case, an extract containing the fusion protein is mixed with additional oleosins and appropriate triglycerides, and physical conditions are manipulated to reconstitute the oil-bodies. The reconstituted oil-bodies are separated by floatation and the recombinant proteins released by the cleavage of the junction with oleosin.